* [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring
@ 2026-06-03 22:25 Maíra Canal
2026-06-03 22:25 ` [PATCH v3 01/14] drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h Maíra Canal
` (13 more replies)
0 siblings, 14 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
This series addresses several issues in the v3d scheduler and submission
code. Most of the fixes were motivated by feedback in the vc4 scheduler
series [1], which inherited issues from v3d. Based on the issues found
there, this series addresses the issues in the v3d driver as well.
This series has cleanup patches, fixes, and finally, a refactoring of
the submission code, which allowed us to fix the atomicity of a submission.
- Cleanups and small improvements:
- PATCH 1/14: "drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h"
- PATCH 3/14: "drm/v3d: Use inline lock for dma fence initialization"
- PATCH 4/14: "drm/v3d: Replace spin_lock_irqsave() with spin_lock()"
- Fixes:
- PATCH 2/14: "drm/v3d: Clear queue->active_job when v3d_fence_create() fails"
- PATCH 5/14: "drm/v3d: Extract v3d_job_add_syncobjs() helper"
- PATCH 6/14: "drm/v3d: Reject invalid syncobj handles in submit ioctls"
- PATCH 13/14: "drm/v3d: Reject invalid out_sync handles in submit ioctls"
- Submission refactoring:
- PATCH 7/14: "drm/v3d: Migrate BO reservation locking to DRM exec"
- PATCH 8/14: "drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls"
- PATCH 9/14: "drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser"
- PATCH 10/14: "drm/v3d: Convert submit helpers to operate on struct v3d_submit"
- PATCH 11/14: "drm/v3d: Refactor CPU ioctl into unified submission chain"
- PATCH 12/14: "drm/v3d: Split BO fence attach from syncobj output handling"
- PATCH 14/14: "drm/v3d: Ensure atomic submissions in v3d_submit_jobs()" (Also a fix)
During the refactoring, I tried to break the conversion into small steps.
On the one hand, this helps reviewability; on the other hand, some things
may look unfinished at the end of a given commit. I tried to balance the
trade-off, but I'm happy to take split or squash requests during review.
[1] https://lore.kernel.org/dri-devel/20260205-vc4-drm-scheduler-v1-0-c6174fd7bbc1@igalia.com/T/
Best regards,
- Maíra
---
v1 -> v2: https://lore.kernel.org/r/20260413-v3d-sched-misc-fixes-v1-0-bac63a8ceb6c@igalia.com
- [2/14, 3/14, 5/14, 6/14] Add Tvrtko's R-b tag (Tvrtko Ursulin)
- [4/14] Use spin_lock() instead of spin_lock_irq() and change the commit message
(Tvrtko Ursulin)
- [4/14] Use scoped_guard() instead of open-coding spin_(un)lock()
- [6/14] s/NULL/zero in the comments (Tvrtko Ursulin)
- [8/14] s/kcalloc(1, size, GFP_KERNEL)/kzalloc (Tvrtko Ursulin)
- [8/14] Save one atomic in v3d_attach_perfmon_to_jobs() by adding a conditional
in the loop (Tvrtko Ursulin)
- [8/14] Zero is implied when initializing the struct v3d_submit (Tvrtko Ursulin)
- [8/14] Create v3d_submit_put_jobs() and v3d_submit_cleanup_jobs() (Tvrtko Ursulin)
- [9/14, 10/14, 11/14] NEW PATCHES: Trying to make the original patch
"[PATCH 09/10] drm/v3d: Refactor CPU ioctl into unified
submission chain" a bit easier to review.
- [10/14] Wrap drm_exec_fini() in a function v3d_submit_unlock_reservations()
(Tvrtko Ursulin)
- [12/14] NEW PATCH: "drm/v3d: Split BO fence attach from syncobj output handling"
- Split v3d_attach_fences_and_unlock_reservation() into different functions
and fold parts of it into v3d_submit_jobs (Tvrtko Ursulin)
- [13/14] NEW PATCH: "drm/v3d: Reject invalid out_sync handles in submit ioctls"
- Stop silently ignoring invalid syncobj handles (Tvrtko Ursulin)
- [14/14] Move the error path to a goto
- [14/14] Skip fence attachment and syncobj exportation to user-space if the
submission has failed (Tvrtko Ursulin)
v2 -> v3: https://lore.kernel.org/r/20260510-v3d-sched-misc-fixes-v2-0-ca4aba343ef6@igalia.com
- Rebased on top of drm-misc-next.
- [1/14] Add Iago's R-b tag (Iago Toral)
- [4/14, 7/14, 9/14] Add Tvrtko's R-b tag (Tvrtko Ursulin)
- [5/14] Create a job->queue variable to be used by v3d_job_add_syncobjs()
- [8/14] v3d_submit_add_job() returns struct v3d_job * instead of using
void **container as argument (Tvrtko Ursulin)
- [10/14] Pull v3d_submit_lock_reservations() out of
v3d_setup_csd_jobs_and_bos() (Tvrtko Ursulin)
- [11/14] Assert that jobs[1] is actually a CSD job (Tvrtko Ursulin)
- [11/14] v3d_submit_lock_reservations() is now called unconditionally
in v3d_submit_cpu_ioctl() (Tvrtko Ursulin)
- [12/14, 13/14] Fold v3d_submit_process_post_deps() and v3d_submit_put_jobs()
into v3d_submit_jobs() (Tvrtko Ursulin)
---
Maíra Canal (14):
drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h
drm/v3d: Clear queue->active_job when v3d_fence_create() fails
drm/v3d: Use inline lock for dma fence initialization
drm/v3d: Replace spin_lock_irqsave() with spin_lock()
drm/v3d: Extract v3d_job_add_syncobjs() helper
drm/v3d: Reject invalid syncobj handles in submit ioctls
drm/v3d: Migrate BO reservation locking to DRM exec
drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser
drm/v3d: Convert submit helpers to operate on struct v3d_submit
drm/v3d: Refactor CPU ioctl into unified submission chain
drm/v3d: Split BO fence attach from syncobj output handling
drm/v3d: Reject invalid out_sync handles in submit ioctls
drm/v3d: Ensure atomic submissions in v3d_submit_jobs()
drivers/gpu/drm/v3d/Kconfig | 1 +
drivers/gpu/drm/v3d/v3d_drv.h | 35 +-
drivers/gpu/drm/v3d/v3d_fence.c | 2 +-
drivers/gpu/drm/v3d/v3d_irq.c | 17 +-
drivers/gpu/drm/v3d/v3d_sched.c | 73 ++--
drivers/gpu/drm/v3d/v3d_submit.c | 758 +++++++++++++++++++--------------------
6 files changed, 447 insertions(+), 439 deletions(-)
---
base-commit: 6e7eb171ac96fefa32197afa658e7d9da238be89
change-id: 20260407-v3d-sched-misc-fixes-623739017e53
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v3 01/14] drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-03 22:25 ` [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails Maíra Canal
` (12 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
The V3D driver has no display pipeline, so nothing in the driver requires
drm_encoder.h. Remove the stale include.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_drv.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 4ebe175a8c6b..5a0b9da2c3aa 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -7,7 +7,6 @@
#include <linux/spinlock_types.h>
#include <linux/workqueue.h>
-#include <drm/drm_encoder.h>
#include <drm/drm_gem.h>
#include <drm/drm_gem_shmem_helper.h>
#include <drm/gpu_scheduler.h>
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
2026-06-03 22:25 ` [PATCH v3 01/14] drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:50 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization Maíra Canal
` (11 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
The run_job() callbacks for BIN, RENDER, TFU and CSD assign the incoming
job to queue->active_job before calling v3d_fence_create(). If
v3d_fence_create() fails, the callback returns NULL without clearing
active_job, leaving a dangling pointer.
Create a failure path in all run_job() callbacks that clears the active
job before returning NULL. The BIN path takes queue->queue_lock around the
clear as it races against v3d_overflow_mem_work(); RENDER, TFU and CSD
paths have no concurrent reader, so the clear is lock-free.
Fixes: a783a09ee76d ("drm/v3d: Refactor job management.")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_sched.c | 52 ++++++++++++++++++++++++-----------------
1 file changed, 30 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 0e266b29317f..46f4fc09c59e 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -192,12 +192,8 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
struct dma_fence *fence;
unsigned long irqflags;
- if (unlikely(job->base.base.s_fence->finished.error)) {
- spin_lock_irqsave(&queue->queue_lock, irqflags);
- queue->active_job = NULL;
- spin_unlock_irqrestore(&queue->queue_lock, irqflags);
- return NULL;
- }
+ if (unlikely(job->base.base.s_fence->finished.error))
+ goto out_clean_job;
/* Lock required around bin_job update vs
* v3d_overflow_mem_work().
@@ -214,7 +210,7 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
fence = v3d_fence_create(v3d, V3D_BIN);
if (IS_ERR(fence))
- return NULL;
+ goto out_clean_job;
if (job->base.irq_fence)
dma_fence_put(job->base.irq_fence);
@@ -242,6 +238,12 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
V3D_CORE_WRITE(0, V3D_CLE_CT0QEA, job->end);
return fence;
+
+out_clean_job:
+ spin_lock_irqsave(&queue->queue_lock, irqflags);
+ queue->active_job = NULL;
+ spin_unlock_irqrestore(&queue->queue_lock, irqflags);
+ return NULL;
}
static struct dma_fence *v3d_render_job_run(struct drm_sched_job *sched_job)
@@ -251,10 +253,8 @@ static struct dma_fence *v3d_render_job_run(struct drm_sched_job *sched_job)
struct drm_device *dev = &v3d->drm;
struct dma_fence *fence;
- if (unlikely(job->base.base.s_fence->finished.error)) {
- v3d->queue[V3D_RENDER].active_job = NULL;
- return NULL;
- }
+ if (unlikely(job->base.base.s_fence->finished.error))
+ goto out_clean_job;
v3d->queue[V3D_RENDER].active_job = &job->base;
@@ -268,7 +268,7 @@ static struct dma_fence *v3d_render_job_run(struct drm_sched_job *sched_job)
fence = v3d_fence_create(v3d, V3D_RENDER);
if (IS_ERR(fence))
- return NULL;
+ goto out_clean_job;
if (job->base.irq_fence)
dma_fence_put(job->base.irq_fence);
@@ -289,6 +289,10 @@ static struct dma_fence *v3d_render_job_run(struct drm_sched_job *sched_job)
V3D_CORE_WRITE(0, V3D_CLE_CT1QEA, job->end);
return fence;
+
+out_clean_job:
+ v3d->queue[V3D_RENDER].active_job = NULL;
+ return NULL;
}
static struct dma_fence *
@@ -299,16 +303,14 @@ v3d_tfu_job_run(struct drm_sched_job *sched_job)
struct drm_device *dev = &v3d->drm;
struct dma_fence *fence;
- if (unlikely(job->base.base.s_fence->finished.error)) {
- v3d->queue[V3D_TFU].active_job = NULL;
- return NULL;
- }
+ if (unlikely(job->base.base.s_fence->finished.error))
+ goto out_clean_job;
v3d->queue[V3D_TFU].active_job = &job->base;
fence = v3d_fence_create(v3d, V3D_TFU);
if (IS_ERR(fence))
- return NULL;
+ goto out_clean_job;
if (job->base.irq_fence)
dma_fence_put(job->base.irq_fence);
@@ -336,6 +338,10 @@ v3d_tfu_job_run(struct drm_sched_job *sched_job)
V3D_WRITE(V3D_TFU_ICFG(v3d->ver), job->args.icfg | V3D_TFU_ICFG_IOC);
return fence;
+
+out_clean_job:
+ v3d->queue[V3D_TFU].active_job = NULL;
+ return NULL;
}
static struct dma_fence *
@@ -347,10 +353,8 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
struct dma_fence *fence;
int i, csd_cfg0_reg;
- if (unlikely(job->base.base.s_fence->finished.error)) {
- v3d->queue[V3D_CSD].active_job = NULL;
- return NULL;
- }
+ if (unlikely(job->base.base.s_fence->finished.error))
+ goto out_clean_job;
v3d->queue[V3D_CSD].active_job = &job->base;
@@ -358,7 +362,7 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
fence = v3d_fence_create(v3d, V3D_CSD);
if (IS_ERR(fence))
- return NULL;
+ goto out_clean_job;
if (job->base.irq_fence)
dma_fence_put(job->base.irq_fence);
@@ -385,6 +389,10 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
V3D_CORE_WRITE(0, csd_cfg0_reg, job->args.cfg[0]);
return fence;
+
+out_clean_job:
+ v3d->queue[V3D_CSD].active_job = NULL;
+ return NULL;
}
static void
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
2026-06-03 22:25 ` [PATCH v3 01/14] drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h Maíra Canal
2026-06-03 22:25 ` [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:49 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock() Maíra Canal
` (10 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Since commit 1f32f310a13c ("dma-buf: inline spinlock for fence protection
v5"), dma_fence_init() accepts a NULL external lock and will fall back to
an inline spinlock embedded in the fence itself. The embedded spinlock
allows the decoupling the lock and the fence lifetime.
Pass NULL so each v3d fence uses its own inline lock. This will allow
queue_lock to use spin_(un)lock() in all its uses instead of
spin_(un)lock_irqsave().
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_fence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/v3d/v3d_fence.c b/drivers/gpu/drm/v3d/v3d_fence.c
index c500136d0455..9b1a882a4c15 100644
--- a/drivers/gpu/drm/v3d/v3d_fence.c
+++ b/drivers/gpu/drm/v3d/v3d_fence.c
@@ -15,7 +15,7 @@ struct dma_fence *v3d_fence_create(struct v3d_dev *v3d, enum v3d_queue q)
fence->dev = &v3d->drm;
fence->queue = q;
fence->seqno = ++queue->emit_seqno;
- dma_fence_init(&fence->base, &v3d_fence_ops, &queue->queue_lock,
+ dma_fence_init(&fence->base, &v3d_fence_ops, NULL,
queue->fence_context, fence->seqno);
return &fence->base;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock()
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (2 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:51 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 05/14] drm/v3d: Extract v3d_job_add_syncobjs() helper Maíra Canal
` (9 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
queue->queue_lock is only acquired from v3d_overflow_mem_work() and
v3d_bin_job_run(), both of which run from workqueue context. The hard
IRQ handler does not take this lock, so disabling interrupts while
holding it is unnecessary.
Drop the spin_lock_irqsave() and use plain spin_(un)lock() instead.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_irq.c | 17 +++++++----------
drivers/gpu/drm/v3d/v3d_sched.c | 21 ++++++++++-----------
2 files changed, 17 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_irq.c b/drivers/gpu/drm/v3d/v3d_irq.c
index 86efaef2722c..754a969b862b 100644
--- a/drivers/gpu/drm/v3d/v3d_irq.c
+++ b/drivers/gpu/drm/v3d/v3d_irq.c
@@ -47,7 +47,6 @@ v3d_overflow_mem_work(struct work_struct *work)
struct v3d_queue_state *queue = &v3d->queue[V3D_BIN];
struct v3d_bin_job *bin_job;
struct drm_gem_object *obj;
- unsigned long irqflags;
if (IS_ERR(bo)) {
drm_err(dev, "Couldn't allocate binner overflow mem\n");
@@ -64,18 +63,16 @@ v3d_overflow_mem_work(struct work_struct *work)
* bin job got scheduled, that's fine. We'll just give them
* some binner pool anyway.
*/
- spin_lock_irqsave(&queue->queue_lock, irqflags);
- bin_job = (struct v3d_bin_job *)queue->active_job;
+ scoped_guard(spinlock, &queue->queue_lock) {
+ bin_job = (struct v3d_bin_job *)queue->active_job;
- if (!bin_job) {
- spin_unlock_irqrestore(&queue->queue_lock, irqflags);
- goto out;
+ if (!bin_job)
+ goto out;
+
+ drm_gem_object_get(obj);
+ list_add_tail(&bo->unref_head, &bin_job->render->unref_list);
}
- drm_gem_object_get(obj);
- list_add_tail(&bo->unref_head, &bin_job->render->unref_list);
- spin_unlock_irqrestore(&queue->queue_lock, irqflags);
-
v3d_mmu_flush_all(v3d);
V3D_CORE_WRITE(0, V3D_PTB_BPOA, bo->node.start << V3D_MMU_PAGE_SHIFT);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 46f4fc09c59e..04fd1a365576 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -190,7 +190,6 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
struct v3d_queue_state *queue = &v3d->queue[V3D_BIN];
struct drm_device *dev = &v3d->drm;
struct dma_fence *fence;
- unsigned long irqflags;
if (unlikely(job->base.base.s_fence->finished.error))
goto out_clean_job;
@@ -198,13 +197,13 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
/* Lock required around bin_job update vs
* v3d_overflow_mem_work().
*/
- spin_lock_irqsave(&queue->queue_lock, irqflags);
- queue->active_job = &job->base;
- /* Clear out the overflow allocation, so we don't
- * reuse the overflow attached to a previous job.
- */
- V3D_CORE_WRITE(0, V3D_PTB_BPOS, 0);
- spin_unlock_irqrestore(&queue->queue_lock, irqflags);
+ scoped_guard(spinlock, &queue->queue_lock) {
+ queue->active_job = &job->base;
+ /* Clear out the overflow allocation, so we don't
+ * reuse the overflow attached to a previous job.
+ */
+ V3D_CORE_WRITE(0, V3D_PTB_BPOS, 0);
+ }
v3d_invalidate_caches(v3d);
@@ -240,9 +239,9 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
return fence;
out_clean_job:
- spin_lock_irqsave(&queue->queue_lock, irqflags);
- queue->active_job = NULL;
- spin_unlock_irqrestore(&queue->queue_lock, irqflags);
+ scoped_guard(spinlock, &queue->queue_lock) {
+ queue->active_job = NULL;
+ }
return NULL;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 05/14] drm/v3d: Extract v3d_job_add_syncobjs() helper
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (3 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock() Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-03 22:25 ` [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls Maíra Canal
` (8 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Move the syncobj dependency setup out of v3d_job_init() into its own
v3d_job_add_syncobjs() helper and make the queue that the job was
submitted a variable in struct v3d_job, so that v3d_job_add_syncobjs()
can use it. No functional change.
This prepares for the next commit which changes the error handling, and
for a later consolidation that separates job allocation from syncobj
attachment.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_drv.h | 3 ++
drivers/gpu/drm/v3d/v3d_submit.c | 72 ++++++++++++++++++++++++----------------
2 files changed, 47 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 5a0b9da2c3aa..79672abef70d 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -294,6 +294,9 @@ struct v3d_job {
struct v3d_dev *v3d;
+ /* The queue that the job was submitted on. */
+ enum v3d_queue queue;
+
/* This is the array of BOs that were looked up at the start
* of submission.
*/
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 1db43c6a078d..8250376d104c 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -180,17 +180,56 @@ v3d_job_deallocate(void **container)
*container = NULL;
}
+static int
+v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
+ u32 in_sync, struct v3d_submit_ext *se)
+{
+ bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
+ struct v3d_dev *v3d = job->v3d;
+ int ret = 0;
+
+ if (!has_multisync) {
+ ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv,
+ in_sync, 0);
+ // TODO: Investigate why this was filtered out for the IOCTL.
+ if (ret && ret != -ENOENT)
+ return ret;
+ return 0;
+ }
+
+ if (se->in_sync_count && se->wait_stage == job->queue) {
+ struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs);
+
+ for (int i = 0; i < se->in_sync_count; i++) {
+ struct drm_v3d_sem in;
+
+ if (copy_from_user(&in, handle++, sizeof(in))) {
+ drm_dbg(&v3d->drm, "Failed to copy wait dep handle.\n");
+ return -EFAULT;
+ }
+
+ ret = drm_sched_job_add_syncobj_dependency(&job->base,
+ file_priv, in.handle, 0);
+ // TODO: Investigate why this was filtered out for the IOCTL.
+ if (ret && ret != -ENOENT)
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
static int
v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
struct v3d_job *job, void (*free)(struct kref *ref),
u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue)
{
struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
- bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
- int ret, i;
+ int ret;
job->v3d = v3d;
job->free = free;
+ job->queue = queue;
job->file_priv = v3d_priv;
ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
@@ -198,32 +237,9 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
if (ret)
return ret;
- if (has_multisync) {
- if (se->in_sync_count && se->wait_stage == queue) {
- struct drm_v3d_sem __user *handle = u64_to_user_ptr(se->in_syncs);
-
- for (i = 0; i < se->in_sync_count; i++) {
- struct drm_v3d_sem in;
-
- if (copy_from_user(&in, handle++, sizeof(in))) {
- ret = -EFAULT;
- drm_dbg(&v3d->drm, "Failed to copy wait dep handle.\n");
- goto fail_job_init;
- }
- ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in.handle, 0);
-
- // TODO: Investigate why this was filtered out for the IOCTL.
- if (ret && ret != -ENOENT)
- goto fail_job_init;
- }
- }
- } else {
- ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv, in_sync, 0);
-
- // TODO: Investigate why this was filtered out for the IOCTL.
- if (ret && ret != -ENOENT)
- goto fail_job_init;
- }
+ ret = v3d_job_add_syncobjs(job, file_priv, in_sync, se);
+ if (ret)
+ goto fail_job_init;
/* CPU jobs don't require hardware resources */
if (queue != V3D_CPU) {
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (4 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 05/14] drm/v3d: Extract v3d_job_add_syncobjs() helper Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:51 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 07/14] drm/v3d: Migrate BO reservation locking to DRM exec Maíra Canal
` (7 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
drm_sched_job_add_syncobj_dependency() returns -ENOENT both when the
handle is zero and when the handle is non-zero but does not find a
corresponding existing syncobj (userspace bug). The driver previously
ignored -ENOENT in both cases, silently accepting broken handles.
Distinguish the two: skip the call entirely when the handle is zero, as
there is no dependency, and let -ENOENT propagate for non-zero handles
that don't resolve, turning the error into a proper return to userspace.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_submit.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 8250376d104c..0babe2e67266 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -189,12 +189,11 @@ v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
int ret = 0;
if (!has_multisync) {
- ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv,
- in_sync, 0);
- // TODO: Investigate why this was filtered out for the IOCTL.
- if (ret && ret != -ENOENT)
- return ret;
- return 0;
+ /* Ignore syncobj if its handle is zero */
+ if (in_sync)
+ ret = drm_sched_job_add_syncobj_dependency(&job->base, file_priv,
+ in_sync, 0);
+ return ret;
}
if (se->in_sync_count && se->wait_stage == job->queue) {
@@ -208,11 +207,13 @@ v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
return -EFAULT;
}
- ret = drm_sched_job_add_syncobj_dependency(&job->base,
- file_priv, in.handle, 0);
- // TODO: Investigate why this was filtered out for the IOCTL.
- if (ret && ret != -ENOENT)
- return ret;
+ /* Ignore syncobj if its handle is zero */
+ if (in.handle) {
+ ret = drm_sched_job_add_syncobj_dependency(&job->base,
+ file_priv, in.handle, 0);
+ if (ret)
+ return ret;
+ }
}
}
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 07/14] drm/v3d: Migrate BO reservation locking to DRM exec
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (5 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-03 22:25 ` [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls Maíra Canal
` (6 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Replace the drm_gem_(un)lock_reservations() + ww_acquire_ctx pattern with
DRM exec across all submit ioctls. Just a straightforward conversion; no
functional change.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/Kconfig | 1 +
drivers/gpu/drm/v3d/v3d_drv.h | 3 +-
drivers/gpu/drm/v3d/v3d_submit.c | 69 +++++++++++++++++-----------------------
3 files changed, 33 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/v3d/Kconfig b/drivers/gpu/drm/v3d/Kconfig
index ce62c5908e1d..6a33e0ab30de 100644
--- a/drivers/gpu/drm/v3d/Kconfig
+++ b/drivers/gpu/drm/v3d/Kconfig
@@ -5,6 +5,7 @@ config DRM_V3D
depends on DRM
depends on COMMON_CLK
depends on MMU
+ select DRM_EXEC
select DRM_SCHED
select DRM_GEM_SHMEM_HELPER
help
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 79672abef70d..07bbdf52c33c 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -7,6 +7,7 @@
#include <linux/spinlock_types.h>
#include <linux/workqueue.h>
+#include <drm/drm_exec.h>
#include <drm/drm_gem.h>
#include <drm/drm_gem_shmem_helper.h>
#include <drm/gpu_scheduler.h>
@@ -421,7 +422,7 @@ struct v3d_indirect_csd_info {
struct drm_gem_object *indirect;
/* Context of the Indirect CSD job */
- struct ww_acquire_ctx acquire_ctx;
+ struct drm_exec exec;
};
struct v3d_timestamp_query_info {
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 0babe2e67266..d80afbf2c03b 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -20,20 +20,19 @@
* to v3d, so we don't attach dma-buf fences to them.
*/
static int
-v3d_lock_bo_reservations(struct v3d_job *job,
- struct ww_acquire_ctx *acquire_ctx)
+v3d_lock_bo_reservations(struct v3d_job *job, struct drm_exec *exec)
{
int i, ret;
- ret = drm_gem_lock_reservations(job->bo, job->bo_count, acquire_ctx);
+ drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT, job->bo_count);
+ drm_exec_until_all_locked(exec) {
+ ret = drm_exec_prepare_array(exec, job->bo, job->bo_count, 1);
+ }
+
if (ret)
- return ret;
+ goto fail;
for (i = 0; i < job->bo_count; i++) {
- ret = dma_resv_reserve_fences(job->bo[i]->resv, 1);
- if (ret)
- goto fail;
-
ret = drm_sched_job_add_implicit_dependencies(&job->base,
job->bo[i], true);
if (ret)
@@ -43,7 +42,7 @@ v3d_lock_bo_reservations(struct v3d_job *job,
return 0;
fail:
- drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
+ drm_exec_fini(exec);
return ret;
}
@@ -278,7 +277,7 @@ v3d_push_job(struct v3d_job *job)
static void
v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
struct v3d_job *job,
- struct ww_acquire_ctx *acquire_ctx,
+ struct drm_exec *exec,
u32 out_sync,
struct v3d_submit_ext *se,
struct dma_fence *done_fence)
@@ -293,7 +292,7 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
DMA_RESV_USAGE_WRITE);
}
- drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
+ drm_exec_fini(exec);
/* Update the return sync object for the job */
/* If it only supports a single signal semaphore*/
@@ -324,7 +323,7 @@ v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
struct v3d_csd_job **job,
struct v3d_job **clean_job,
struct v3d_submit_ext *se,
- struct ww_acquire_ctx *acquire_ctx)
+ struct drm_exec *exec)
{
int ret;
@@ -357,7 +356,7 @@ v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
if (ret)
return ret;
- return v3d_lock_bo_reservations(*clean_job, acquire_ctx);
+ return v3d_lock_bo_reservations(*clean_job, exec);
}
static void
@@ -515,7 +514,7 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv,
return v3d_setup_csd_jobs_and_bos(file_priv, v3d, &indirect_csd.submit,
&info->job, &info->clean_job,
- NULL, &info->acquire_ctx);
+ NULL, &info->exec);
}
/* Get data for the query timestamp job submission. */
@@ -930,7 +929,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
struct v3d_render_job *render = NULL;
struct v3d_job *clean_job = NULL;
struct v3d_job *last_job;
- struct ww_acquire_ctx acquire_ctx;
+ struct drm_exec exec;
int ret = 0;
trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end);
@@ -1010,7 +1009,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_lock_bo_reservations(last_job, &acquire_ctx);
+ ret = v3d_lock_bo_reservations(last_job, &exec);
if (ret)
goto fail;
@@ -1059,7 +1058,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
v3d_attach_fences_and_unlock_reservation(file_priv,
last_job,
- &acquire_ctx,
+ &exec,
args->out_sync,
&se,
last_job->done_fence);
@@ -1073,8 +1072,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
fail_unreserve:
mutex_unlock(&v3d->sched_lock);
fail_perfmon:
- drm_gem_unlock_reservations(last_job->bo,
- last_job->bo_count, &acquire_ctx);
+ drm_exec_fini(&exec);
fail:
v3d_job_cleanup((void *)bin);
v3d_job_cleanup((void *)render);
@@ -1101,7 +1099,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
struct drm_v3d_submit_tfu *args = data;
struct v3d_submit_ext se = {0};
struct v3d_tfu_job *job = NULL;
- struct ww_acquire_ctx acquire_ctx;
+ struct drm_exec exec;
int ret = 0;
trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia);
@@ -1157,7 +1155,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
job->base.bo[job->base.bo_count] = bo;
}
- ret = v3d_lock_bo_reservations(&job->base, &acquire_ctx);
+ ret = v3d_lock_bo_reservations(&job->base, &exec);
if (ret)
goto fail;
@@ -1166,7 +1164,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
mutex_unlock(&v3d->sched_lock);
v3d_attach_fences_and_unlock_reservation(file_priv,
- &job->base, &acquire_ctx,
+ &job->base, &exec,
args->out_sync,
&se,
job->base.done_fence);
@@ -1201,7 +1199,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
struct v3d_submit_ext se = {0};
struct v3d_csd_job *job = NULL;
struct v3d_job *clean_job = NULL;
- struct ww_acquire_ctx acquire_ctx;
+ struct drm_exec exec;
int ret;
trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]);
@@ -1228,8 +1226,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args,
- &job, &clean_job, &se,
- &acquire_ctx);
+ &job, &clean_job, &se, &exec);
if (ret)
goto fail;
@@ -1260,7 +1257,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
v3d_attach_fences_and_unlock_reservation(file_priv,
clean_job,
- &acquire_ctx,
+ &exec,
args->out_sync,
&se,
clean_job->done_fence);
@@ -1273,8 +1270,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
fail_unreserve:
mutex_unlock(&v3d->sched_lock);
fail_perfmon:
- drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count,
- &acquire_ctx);
+ drm_exec_fini(&exec);
fail:
v3d_job_cleanup((void *)job);
v3d_job_cleanup(clean_job);
@@ -1312,7 +1308,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
struct v3d_cpu_job *cpu_job = NULL;
struct v3d_csd_job *csd_job = NULL;
struct v3d_job *clean_job = NULL;
- struct ww_acquire_ctx acquire_ctx;
+ struct drm_exec exec;
int ret;
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
@@ -1363,7 +1359,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_lock_bo_reservations(&cpu_job->base, &acquire_ctx);
+ ret = v3d_lock_bo_reservations(&cpu_job->base, &exec);
if (ret)
goto fail;
}
@@ -1397,14 +1393,14 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
v3d_attach_fences_and_unlock_reservation(file_priv,
&cpu_job->base,
- &acquire_ctx, 0,
+ &exec, 0,
out_se, cpu_job->base.done_fence);
switch (cpu_job->job_type) {
case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
v3d_attach_fences_and_unlock_reservation(file_priv,
clean_job,
- &cpu_job->indirect_csd.acquire_ctx,
+ &cpu_job->indirect_csd.exec,
0, &se, clean_job->done_fence);
break;
default:
@@ -1419,13 +1415,8 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
fail_unreserve:
mutex_unlock(&v3d->sched_lock);
-
- drm_gem_unlock_reservations(cpu_job->base.bo, cpu_job->base.bo_count,
- &acquire_ctx);
-
- drm_gem_unlock_reservations(clean_job->bo, clean_job->bo_count,
- &cpu_job->indirect_csd.acquire_ctx);
-
+ drm_exec_fini(&exec);
+ drm_exec_fini(&cpu_job->indirect_csd.exec);
fail:
v3d_job_cleanup((void *)cpu_job);
v3d_job_cleanup((void *)csd_job);
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (6 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 07/14] drm/v3d: Migrate BO reservation locking to DRM exec Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:54 ` sashiko-bot
2026-06-04 8:58 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser Maíra Canal
` (5 subsequent siblings)
13 siblings, 2 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
As the V3D driver grew with time, different types of submission were added
and the submission code grew more complex, but the driver stuck with the
same abstractions.
Nowadays, the submission ioctls don't submit a single job, but a
chain of jobs:
1. v3d_submit_cl_ioctl() submits a BIN job (optional), RENDER job
(mandatory), and a CLEAN_CACHE job (optional).
2. v3d_submit_csd_ioctl() submits a CSD, and a CLEAN_CACHE job.
3. v3d_submit_tfu_ioctl() submits a TFU job.
Therefore, each ioctl submits a chain of jobs in which each job depends on
the previous one. However, this concept is not well represented in software
at the moment.
To address this, introduce a new concept: the struct v3d_submit, which
groups the submission state and represents the submission chain formed by
an ordered array of jobs.
Add new helpers to allocate, add jobs to the chain and submit jobs to
the scheduler, all based on the new struct. Convert v3d_submit_cl_ioctl(),
v3d_submit_tfu_ioctl() and v3d_submit_csd_ioctl() to the new pattern. Each
ioctl now follows the same flow: add jobs -> attach perfmon -> lookup BOs
-> lock reservations -> submit chain -> attach fences -> put jobs.
The CPU ioctl is left on the old helpers for now; its indirect CSD path
requires some restructuring that will be addressed in the next few
commits.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_drv.h | 21 +++
drivers/gpu/drm/v3d/v3d_submit.c | 386 ++++++++++++++++++++++-----------------
2 files changed, 241 insertions(+), 166 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 07bbdf52c33c..fae0c53e7aa3 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -288,6 +288,27 @@ to_v3d_fence(struct dma_fence *fence)
#define V3D_CORE_READ(core, offset) readl(v3d->core_regs[core] + offset)
#define V3D_CORE_WRITE(core, offset, val) writel(val, v3d->core_regs[core] + offset)
+#define V3D_MAX_JOBS_PER_SUBMISSION 3
+
+/* Per-ioctl submission context */
+struct v3d_submit {
+ struct v3d_dev *v3d;
+
+ struct drm_file *file_priv;
+
+ /* DRM exec context for this submission. */
+ struct drm_exec exec;
+
+ /* Ordered array of jobs forming the submission chain. Jobs are
+ * appended via v3d_submit_add_job(), then chained and pushed to
+ * the scheduler by v3d_submit_jobs().
+ */
+ struct v3d_job *jobs[V3D_MAX_JOBS_PER_SUBMISSION];
+
+ /* Number of jobs currently in @jobs. */
+ u32 job_count;
+};
+
struct v3d_job {
struct drm_sched_job base;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index d80afbf2c03b..fcaf3a6cfddc 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -261,17 +261,105 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
return ret;
}
-static void
-v3d_push_job(struct v3d_job *job)
+static const struct {
+ size_t size;
+ void (*free)(struct kref *ref);
+} v3d_job_types[] = {
+ [V3D_BIN] = { sizeof(struct v3d_bin_job), v3d_job_free },
+ [V3D_RENDER] = { sizeof(struct v3d_render_job), v3d_render_job_free },
+ [V3D_TFU] = { sizeof(struct v3d_tfu_job), v3d_job_free },
+ [V3D_CSD] = { sizeof(struct v3d_csd_job), v3d_job_free },
+ [V3D_CACHE_CLEAN] = { sizeof(struct v3d_job), v3d_job_free },
+ [V3D_CPU] = { sizeof(struct v3d_cpu_job), v3d_cpu_job_free },
+};
+
+static struct v3d_job *
+v3d_submit_add_job(struct v3d_submit *submit, enum v3d_queue queue)
{
- drm_sched_job_arm(&job->base);
+ struct v3d_file_priv *v3d_priv = submit->file_priv->driver_priv;
+ struct v3d_dev *v3d = submit->v3d;
+ struct v3d_job *job;
+ int ret;
- job->done_fence = dma_fence_get(&job->base.s_fence->finished);
+ if (queue >= V3D_MAX_QUEUES)
+ return ERR_PTR(-EINVAL);
- /* put by scheduler job completion */
- kref_get(&job->refcount);
+ job = kzalloc(v3d_job_types[queue].size, GFP_KERNEL);
+ if (!job)
+ return ERR_PTR(-ENOMEM);
- drm_sched_entity_push_job(&job->base);
+ job->v3d = v3d;
+ job->queue = queue;
+ job->file_priv = v3d_priv;
+ job->free = v3d_job_types[queue].free;
+
+ ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
+ 1, v3d_priv, submit->file_priv->client_id);
+ if (ret)
+ goto fail_free;
+
+ /* CPU jobs don't require hardware resources */
+ if (queue != V3D_CPU) {
+ ret = v3d_pm_runtime_get(v3d);
+ if (ret)
+ goto fail_sched_job;
+ job->has_pm_ref = true;
+ }
+
+ kref_init(&job->refcount);
+
+ job->client_stats = v3d_stats_get(v3d_priv->stats[queue]);
+ job->global_stats = v3d_stats_get(v3d->queue[queue].stats);
+
+ submit->jobs[submit->job_count++] = job;
+
+ return job;
+
+fail_sched_job:
+ drm_sched_job_cleanup(&job->base);
+fail_free:
+ kfree(job);
+ return ERR_PTR(ret);
+}
+
+static void
+v3d_submit_put_jobs(struct v3d_submit *submit)
+{
+ for (int i = 0; i < submit->job_count; i++)
+ v3d_job_put(submit->jobs[i]);
+}
+
+static void
+v3d_submit_cleanup_jobs(struct v3d_submit *submit)
+{
+ for (int i = 0; i < submit->job_count; i++)
+ v3d_job_cleanup(submit->jobs[i]);
+}
+
+static int
+v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
+{
+ struct v3d_file_priv *v3d_priv = submit->file_priv->driver_priv;
+ struct v3d_dev *v3d = submit->v3d;
+ struct v3d_perfmon *perfmon;
+
+ if (!perfmon_id)
+ return 0;
+
+ if (v3d->global_perfmon)
+ return -EAGAIN;
+
+ perfmon = v3d_perfmon_find(v3d_priv, perfmon_id);
+ if (!perfmon)
+ return -ENOENT;
+
+ for (int i = 0; i < submit->job_count; i++) {
+ submit->jobs[i]->perfmon = perfmon;
+ if (i != 0)
+ v3d_perfmon_get(perfmon);
+ }
+
+ return 0;
}
static void
@@ -316,6 +404,45 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
}
}
+static void
+v3d_push_job(struct v3d_job *job)
+{
+ drm_sched_job_arm(&job->base);
+
+ job->done_fence = dma_fence_get(&job->base.s_fence->finished);
+
+ /* put by scheduler job completion */
+ kref_get(&job->refcount);
+
+ drm_sched_entity_push_job(&job->base);
+}
+
+static int
+v3d_submit_jobs(struct v3d_submit *submit)
+{
+ struct v3d_dev *v3d = submit->v3d;
+ int ret = 0;
+
+ mutex_lock(&v3d->sched_lock);
+
+ for (int i = 0; i < submit->job_count; i++) {
+ struct v3d_job *job = submit->jobs[i];
+
+ v3d_push_job(job);
+
+ if (i + 1 < submit->job_count) {
+ ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
+ dma_fence_get(job->done_fence));
+ if (ret)
+ goto err;
+ }
+ }
+
+err:
+ mutex_unlock(&v3d->sched_lock);
+ return ret;
+}
+
static int
v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
struct v3d_dev *v3d,
@@ -921,18 +1048,15 @@ int
v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
- struct v3d_dev *v3d = to_v3d_dev(dev);
- struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
+ struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_cl *args = data;
struct v3d_submit_ext se = {0};
struct v3d_bin_job *bin = NULL;
struct v3d_render_job *render = NULL;
struct v3d_job *clean_job = NULL;
- struct v3d_job *last_job;
- struct drm_exec exec;
- int ret = 0;
+ int ret;
- trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end);
+ trace_v3d_submit_cl_ioctl(dev, args->rcl_start, args->rcl_end);
if (args->pad)
return -EINVAL;
@@ -952,30 +1076,10 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
}
- ret = v3d_job_allocate(v3d, (void *)&render, sizeof(*render));
- if (ret)
- return ret;
-
- ret = v3d_job_init(v3d, file_priv, &render->base,
- v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER);
- if (ret) {
- v3d_job_deallocate((void *)&render);
- goto fail;
- }
-
- render->start = args->rcl_start;
- render->end = args->rcl_end;
- INIT_LIST_HEAD(&render->unref_list);
-
if (args->bcl_start != args->bcl_end) {
- ret = v3d_job_allocate(v3d, (void *)&bin, sizeof(*bin));
- if (ret)
- goto fail;
-
- ret = v3d_job_init(v3d, file_priv, &bin->base,
- v3d_job_free, args->in_sync_bcl, &se, V3D_BIN);
- if (ret) {
- v3d_job_deallocate((void *)&bin);
+ bin = (struct v3d_bin_job *) v3d_submit_add_job(&submit, V3D_BIN);
+ if (IS_ERR(bin)) {
+ ret = PTR_ERR(bin);
goto fail;
}
@@ -984,99 +1088,71 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
bin->qma = args->qma;
bin->qms = args->qms;
bin->qts = args->qts;
- bin->render = render;
- }
- if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
- ret = v3d_job_allocate(v3d, (void *)&clean_job, sizeof(*clean_job));
+ ret = v3d_job_add_syncobjs(&bin->base, file_priv, args->in_sync_bcl,
+ &se);
if (ret)
goto fail;
-
- ret = v3d_job_init(v3d, file_priv, clean_job,
- v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
- if (ret) {
- v3d_job_deallocate((void *)&clean_job);
- goto fail;
- }
-
- last_job = clean_job;
- } else {
- last_job = &render->base;
}
- ret = v3d_lookup_bos(dev, file_priv, last_job,
+ render = (struct v3d_render_job *) v3d_submit_add_job(&submit, V3D_RENDER);
+ if (IS_ERR(render)) {
+ ret = PTR_ERR(render);
+ goto fail;
+ }
+
+ INIT_LIST_HEAD(&render->unref_list);
+ render->start = args->rcl_start;
+ render->end = args->rcl_end;
+
+ if (bin)
+ bin->render = render;
+
+ ret = v3d_job_add_syncobjs(&render->base, file_priv, args->in_sync_rcl, &se);
+ if (ret)
+ goto fail;
+
+ if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
+ clean_job = v3d_submit_add_job(&submit, V3D_CACHE_CLEAN);
+ if (IS_ERR(clean_job)) {
+ ret = PTR_ERR(clean_job);
+ goto fail;
+ }
+ }
+
+ ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
+ if (ret)
+ goto fail;
+
+ ret = v3d_lookup_bos(dev, file_priv,
+ submit.jobs[submit.job_count - 1],
args->bo_handles, args->bo_handle_count);
if (ret)
goto fail;
- ret = v3d_lock_bo_reservations(last_job, &exec);
+ ret = v3d_lock_bo_reservations(submit.jobs[submit.job_count - 1],
+ &submit.exec);
if (ret)
goto fail;
- if (args->perfmon_id) {
- if (v3d->global_perfmon) {
- ret = -EAGAIN;
- goto fail_perfmon;
- }
-
- render->base.perfmon = v3d_perfmon_find(v3d_priv,
- args->perfmon_id);
-
- if (!render->base.perfmon) {
- ret = -ENOENT;
- goto fail_perfmon;
- }
- }
-
- mutex_lock(&v3d->sched_lock);
- if (bin) {
- bin->base.perfmon = render->base.perfmon;
- v3d_perfmon_get(bin->base.perfmon);
- v3d_push_job(&bin->base);
-
- ret = drm_sched_job_add_dependency(&render->base.base,
- dma_fence_get(bin->base.done_fence));
- if (ret)
- goto fail_unreserve;
- }
-
- v3d_push_job(&render->base);
-
- if (clean_job) {
- struct dma_fence *render_fence =
- dma_fence_get(render->base.done_fence);
- ret = drm_sched_job_add_dependency(&clean_job->base,
- render_fence);
- if (ret)
- goto fail_unreserve;
- clean_job->perfmon = render->base.perfmon;
- v3d_perfmon_get(clean_job->perfmon);
- v3d_push_job(clean_job);
- }
-
- mutex_unlock(&v3d->sched_lock);
+ ret = v3d_submit_jobs(&submit);
+ if (ret)
+ goto fail_unreserve;
v3d_attach_fences_and_unlock_reservation(file_priv,
- last_job,
- &exec,
- args->out_sync,
- &se,
- last_job->done_fence);
+ submit.jobs[submit.job_count - 1],
+ &submit.exec,
+ args->out_sync, &se,
+ submit.jobs[submit.job_count - 1]->done_fence);
- v3d_job_put(&bin->base);
- v3d_job_put(&render->base);
- v3d_job_put(clean_job);
+ v3d_submit_put_jobs(&submit);
return 0;
fail_unreserve:
- mutex_unlock(&v3d->sched_lock);
-fail_perfmon:
- drm_exec_fini(&exec);
+ drm_exec_fini(&submit.exec);
fail:
- v3d_job_cleanup((void *)bin);
- v3d_job_cleanup((void *)render);
- v3d_job_cleanup(clean_job);
+ v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
return ret;
@@ -1095,14 +1171,13 @@ int
v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
- struct v3d_dev *v3d = to_v3d_dev(dev);
+ struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_tfu *args = data;
struct v3d_submit_ext se = {0};
struct v3d_tfu_job *job = NULL;
- struct drm_exec exec;
int ret = 0;
- trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia);
+ trace_v3d_submit_tfu_ioctl(dev, args->iia);
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
drm_dbg(dev, "invalid flags: %d\n", args->flags);
@@ -1117,17 +1192,16 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
}
}
- ret = v3d_job_allocate(v3d, (void *)&job, sizeof(*job));
- if (ret)
- return ret;
-
- ret = v3d_job_init(v3d, file_priv, &job->base,
- v3d_job_free, args->in_sync, &se, V3D_TFU);
- if (ret) {
- v3d_job_deallocate((void *)&job);
+ job = (struct v3d_tfu_job *) v3d_submit_add_job(&submit, V3D_TFU);
+ if (IS_ERR(job)) {
+ ret = PTR_ERR(job);
goto fail;
}
+ ret = v3d_job_add_syncobjs(&job->base, file_priv, args->in_sync, &se);
+ if (ret)
+ goto fail;
+
job->base.bo = kzalloc_objs(*job->base.bo, ARRAY_SIZE(args->bo_handles));
if (!job->base.bo) {
ret = -ENOMEM;
@@ -1155,26 +1229,27 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
job->base.bo[job->base.bo_count] = bo;
}
- ret = v3d_lock_bo_reservations(&job->base, &exec);
+ ret = v3d_lock_bo_reservations(&job->base, &submit.exec);
if (ret)
goto fail;
- mutex_lock(&v3d->sched_lock);
- v3d_push_job(&job->base);
- mutex_unlock(&v3d->sched_lock);
+ ret = v3d_submit_jobs(&submit);
+ if (ret)
+ goto fail_unreserve;
v3d_attach_fences_and_unlock_reservation(file_priv,
- &job->base, &exec,
- args->out_sync,
- &se,
+ &job->base, &submit.exec,
+ args->out_sync, &se,
job->base.done_fence);
- v3d_job_put(&job->base);
+ v3d_submit_put_jobs(&submit);
return 0;
+fail_unreserve:
+ drm_exec_fini(&submit.exec);
fail:
- v3d_job_cleanup((void *)job);
+ v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
return ret;
@@ -1193,21 +1268,19 @@ int
v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
- struct v3d_dev *v3d = to_v3d_dev(dev);
- struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
+ struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_csd *args = data;
struct v3d_submit_ext se = {0};
struct v3d_csd_job *job = NULL;
struct v3d_job *clean_job = NULL;
- struct drm_exec exec;
int ret;
- trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]);
+ trace_v3d_submit_csd_ioctl(dev, args->cfg[5], args->cfg[6]);
if (args->pad)
return -EINVAL;
- if (!v3d_has_csd(v3d)) {
+ if (!v3d_has_csd(submit.v3d)) {
drm_warn(dev, "Attempting CSD submit on non-CSD hardware\n");
return -EINVAL;
}
@@ -1225,55 +1298,36 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
}
- ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args,
- &job, &clean_job, &se, &exec);
+ ret = v3d_setup_csd_jobs_and_bos(file_priv, submit.v3d, args,
+ &job, &clean_job, &se, &submit.exec);
if (ret)
goto fail;
- if (args->perfmon_id) {
- if (v3d->global_perfmon) {
- ret = -EAGAIN;
- goto fail_perfmon;
- }
+ submit.jobs[submit.job_count++] = &job->base;
+ submit.jobs[submit.job_count++] = clean_job;
- job->base.perfmon = v3d_perfmon_find(v3d_priv,
- args->perfmon_id);
- if (!job->base.perfmon) {
- ret = -ENOENT;
- goto fail_perfmon;
- }
- }
-
- mutex_lock(&v3d->sched_lock);
- v3d_push_job(&job->base);
-
- ret = drm_sched_job_add_dependency(&clean_job->base,
- dma_fence_get(job->base.done_fence));
+ ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
if (ret)
goto fail_unreserve;
- v3d_push_job(clean_job);
- mutex_unlock(&v3d->sched_lock);
+ ret = v3d_submit_jobs(&submit);
+ if (ret)
+ goto fail_unreserve;
v3d_attach_fences_and_unlock_reservation(file_priv,
clean_job,
- &exec,
- args->out_sync,
- &se,
+ &submit.exec,
+ args->out_sync, &se,
clean_job->done_fence);
- v3d_job_put(&job->base);
- v3d_job_put(clean_job);
+ v3d_submit_put_jobs(&submit);
return 0;
fail_unreserve:
- mutex_unlock(&v3d->sched_lock);
-fail_perfmon:
- drm_exec_fini(&exec);
+ drm_exec_fini(&submit.exec);
fail:
- v3d_job_cleanup((void *)job);
- v3d_job_cleanup(clean_job);
+ v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
return ret;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (7 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:53 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit Maíra Canal
` (4 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
v3d_get_cpu_indirect_csd_params() currently does double duty: it parses
the indirect CSD extension and, while still inside the extension parser,
also creates the CSD/clean jobs and locks their BOs through a separate
DRM exec context. This nested submission deviates from the standard flow
and makes it hard to fold the indirect CSD path into the unified submit
chain.
Stash the parsed drm_v3d_submit_csd args in struct v3d_indirect_csd_info
and have the parser only fill in the parameters. Then, move job creation
(v3d_setup_csd_jobs_and_bos()) into v3d_submit_cpu_ioctl(), where is the
proper place to create jobs.
No functional change, but prepares to move the CPU ioctl into the
unified submission chain.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_drv.h | 5 +++++
drivers/gpu/drm/v3d/v3d_submit.c | 16 +++++++++++++---
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index fae0c53e7aa3..52ed0af67f5e 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -428,6 +428,11 @@ struct v3d_indirect_csd_info {
/* Clean cache job associated to the Indirect CSD job */
struct v3d_job *clean_job;
+ /* Indirect CSD args, stashed by the extension parser and later used
+ * to create the CSD job from them.
+ */
+ struct drm_v3d_submit_csd args;
+
/* Offset within the BO where the workgroup counts are stored */
u32 offset;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index fcaf3a6cfddc..64eba912dc64 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -632,6 +632,7 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv,
}
job->job_type = V3D_CPU_JOB_TYPE_INDIRECT_CSD;
+ info->args = indirect_csd.submit;
info->offset = indirect_csd.offset;
info->wg_size = indirect_csd.wg_size;
memcpy(&info->wg_uniform_offsets, &indirect_csd.wg_uniform_offsets,
@@ -639,9 +640,7 @@ v3d_get_cpu_indirect_csd_params(struct drm_file *file_priv,
info->indirect = drm_gem_object_lookup(file_priv, indirect_csd.indirect);
- return v3d_setup_csd_jobs_and_bos(file_priv, v3d, &indirect_csd.submit,
- &info->job, &info->clean_job,
- NULL, &info->exec);
+ return 0;
}
/* Get data for the query timestamp job submission. */
@@ -1404,6 +1403,17 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
goto fail;
}
+ if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
+ ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d,
+ &cpu_job->indirect_csd.args,
+ &cpu_job->indirect_csd.job,
+ &cpu_job->indirect_csd.clean_job,
+ NULL,
+ &cpu_job->indirect_csd.exec);
+ if (ret)
+ goto fail;
+ }
+
clean_job = cpu_job->indirect_csd.clean_job;
csd_job = cpu_job->indirect_csd.job;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (8 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 9:49 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain Maíra Canal
` (3 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Generalize the submission helpers so they act on a whole struct v3d_submit
(the entire job chain) rather than on individual jobs and a drm_exec. This
lets a submission of several chained jobs be locked, fenced, and finalized
as a single unit, and is the groundwork for collapsing the indirect CSD
path into one chain.
The following helpers were generalized:
- v3d_lookup_bos()
- v3d_lock_bo_reservations() (renamed to v3d_submit_lock_reservations()):
- v3d_attach_fences_and_unlock_reservation()
- v3d_setup_csd_jobs_and_bos()
Now, the locking helper now iterates over all jobs and locks the union of
their BOs under one DRM exec, using DRM_EXEC_IGNORE_DUPLICATES to tolerate
shared BO references. The fence-attach helper similarly walks every job
and attaches the chain's last fence to all touched BOs.
Also, v3d_submit_jobs() becomes the single submit-and-finalize entry
point and callers no longer need to open-code fence attachment,
reservation unlocking, etc.
Update CL/TFU/CSD/CPU ioctls to use the new helper signatures. The CPU
ioctl still uses two struct v3d_submit instances (one for the CPU job,
one for the indirect CSD jobs) and keeps its manual two-pass fence-attach
flow. Converting the indirect CSD path into the unified chain is done in
the next commit.
No functional change.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_submit.c | 340 +++++++++++++++------------------------
1 file changed, 128 insertions(+), 212 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 64eba912dc64..62c23feb8fbb 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -20,32 +20,51 @@
* to v3d, so we don't attach dma-buf fences to them.
*/
static int
-v3d_lock_bo_reservations(struct v3d_job *job, struct drm_exec *exec)
+v3d_submit_lock_reservations(struct v3d_submit *submit)
{
- int i, ret;
+ int i, j, ret;
- drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT, job->bo_count);
- drm_exec_until_all_locked(exec) {
- ret = drm_exec_prepare_array(exec, job->bo, job->bo_count, 1);
- }
+ drm_exec_init(&submit->exec,
+ DRM_EXEC_INTERRUPTIBLE_WAIT | DRM_EXEC_IGNORE_DUPLICATES, 0);
+ drm_exec_until_all_locked(&submit->exec) {
+ for (i = 0; i < submit->job_count; i++) {
+ struct v3d_job *job = submit->jobs[i];
- if (ret)
- goto fail;
-
- for (i = 0; i < job->bo_count; i++) {
- ret = drm_sched_job_add_implicit_dependencies(&job->base,
- job->bo[i], true);
+ ret = drm_exec_prepare_array(&submit->exec, job->bo,
+ job->bo_count, 1);
+ if (ret)
+ break;
+ }
+ drm_exec_retry_on_contention(&submit->exec);
if (ret)
goto fail;
}
+ for (i = 0; i < submit->job_count; i++) {
+ struct v3d_job *job = submit->jobs[i];
+
+ for (j = 0; j < job->bo_count; j++) {
+ ret = drm_sched_job_add_implicit_dependencies(&job->base,
+ job->bo[j],
+ true);
+ if (ret)
+ goto fail;
+ }
+ }
+
return 0;
fail:
- drm_exec_fini(exec);
+ drm_exec_fini(&submit->exec);
return ret;
}
+static void
+v3d_submit_unlock_reservations(struct v3d_submit *submit)
+{
+ drm_exec_fini(&submit->exec);
+}
+
/**
* v3d_lookup_bos() - Sets up job->bo[] with the GEM objects
* referenced by the job.
@@ -63,25 +82,23 @@ v3d_lock_bo_reservations(struct v3d_job *job, struct drm_exec *exec)
* failure, because that will happen at `v3d_job_free()`.
*/
static int
-v3d_lookup_bos(struct drm_device *dev,
- struct drm_file *file_priv,
- struct v3d_job *job,
- u64 bo_handles,
- u32 bo_count)
+v3d_lookup_bos(struct v3d_submit *submit, u64 bo_handles, u32 bo_count)
{
- job->bo_count = bo_count;
+ struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
- if (!job->bo_count) {
+ last_job->bo_count = bo_count;
+
+ if (!last_job->bo_count) {
/* See comment on bo_index for why we have to check
* this.
*/
- drm_warn(dev, "Rendering requires BOs\n");
+ drm_warn(&submit->v3d->drm, "Rendering requires BOs\n");
return -EINVAL;
}
- return drm_gem_objects_lookup(file_priv,
+ return drm_gem_objects_lookup(submit->file_priv,
(void __user *)(uintptr_t)bo_handles,
- job->bo_count, &job->bo);
+ last_job->bo_count, &last_job->bo);
}
static void
@@ -160,25 +177,6 @@ void v3d_job_put(struct v3d_job *job)
kref_put(&job->refcount, job->free);
}
-static int
-v3d_job_allocate(struct v3d_dev *v3d, void **container, size_t size)
-{
- *container = kcalloc(1, size, GFP_KERNEL);
- if (!*container) {
- drm_err(&v3d->drm, "Cannot allocate memory for V3D job.\n");
- return -ENOMEM;
- }
-
- return 0;
-}
-
-static void
-v3d_job_deallocate(void **container)
-{
- kfree(*container);
- *container = NULL;
-}
-
static int
v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
u32 in_sync, struct v3d_submit_ext *se)
@@ -219,48 +217,6 @@ v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
return 0;
}
-static int
-v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
- struct v3d_job *job, void (*free)(struct kref *ref),
- u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue)
-{
- struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
- int ret;
-
- job->v3d = v3d;
- job->free = free;
- job->queue = queue;
- job->file_priv = v3d_priv;
-
- ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
- 1, v3d_priv, file_priv->client_id);
- if (ret)
- return ret;
-
- ret = v3d_job_add_syncobjs(job, file_priv, in_sync, se);
- if (ret)
- goto fail_job_init;
-
- /* CPU jobs don't require hardware resources */
- if (queue != V3D_CPU) {
- ret = v3d_pm_runtime_get(v3d);
- if (ret)
- goto fail_job_init;
- job->has_pm_ref = true;
- }
-
- kref_init(&job->refcount);
-
- job->client_stats = v3d_stats_get(v3d_priv->stats[queue]);
- job->global_stats = v3d_stats_get(v3d->queue[queue].stats);
-
- return 0;
-
-fail_job_init:
- drm_sched_job_cleanup(&job->base);
- return ret;
-}
-
static const struct {
size_t size;
void (*free)(struct kref *ref);
@@ -363,31 +319,34 @@ v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
}
static void
-v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
- struct v3d_job *job,
- struct drm_exec *exec,
- u32 out_sync,
- struct v3d_submit_ext *se,
- struct dma_fence *done_fence)
+v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
+ u32 out_sync, struct v3d_submit_ext *se)
{
- struct drm_syncobj *sync_out;
bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
- int i;
+ struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
+ struct drm_syncobj *sync_out;
- for (i = 0; i < job->bo_count; i++) {
- /* XXX: Use shared fences for read-only objects. */
- dma_resv_add_fence(job->bo[i]->resv, job->done_fence,
- DMA_RESV_USAGE_WRITE);
+ /* The submission's last fence covers the entire submission. Attach it
+ * to every BO touched by any job in the submission.
+ */
+ for (int i = 0; i < submit->job_count; i++) {
+ struct v3d_job *job = submit->jobs[i];
+
+ for (int j = 0; j < job->bo_count; j++) {
+ /* XXX: Use shared fences for read-only objects. */
+ dma_resv_add_fence(job->bo[j]->resv, last_job->done_fence,
+ DMA_RESV_USAGE_WRITE);
+ }
}
- drm_exec_fini(exec);
+ v3d_submit_unlock_reservations(submit);
/* Update the return sync object for the job */
/* If it only supports a single signal semaphore*/
if (!has_multisync) {
- sync_out = drm_syncobj_find(file_priv, out_sync);
+ sync_out = drm_syncobj_find(submit->file_priv, out_sync);
if (sync_out) {
- drm_syncobj_replace_fence(sync_out, done_fence);
+ drm_syncobj_replace_fence(sync_out, last_job->done_fence);
drm_syncobj_put(sync_out);
}
return;
@@ -395,9 +354,9 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
/* If multiple semaphores extension is supported */
if (se->out_sync_count) {
- for (i = 0; i < se->out_sync_count; i++) {
+ for (int i = 0; i < se->out_sync_count; i++) {
drm_syncobj_replace_fence(se->out_syncs[i].syncobj,
- done_fence);
+ last_job->done_fence);
drm_syncobj_put(se->out_syncs[i].syncobj);
}
kvfree(se->out_syncs);
@@ -418,7 +377,8 @@ v3d_push_job(struct v3d_job *job)
}
static int
-v3d_submit_jobs(struct v3d_submit *submit)
+v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
+ struct v3d_submit_ext *se)
{
struct v3d_dev *v3d = submit->v3d;
int ret = 0;
@@ -438,52 +398,42 @@ v3d_submit_jobs(struct v3d_submit *submit)
}
}
+ mutex_unlock(&v3d->sched_lock);
+
+ v3d_attach_fences_and_unlock_reservation(submit, out_sync, se);
+ v3d_submit_put_jobs(submit);
+
+ return 0;
+
err:
mutex_unlock(&v3d->sched_lock);
return ret;
}
static int
-v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
- struct v3d_dev *v3d,
+v3d_setup_csd_jobs_and_bos(struct v3d_submit *submit,
struct drm_v3d_submit_csd *args,
- struct v3d_csd_job **job,
- struct v3d_job **clean_job,
- struct v3d_submit_ext *se,
- struct drm_exec *exec)
+ struct v3d_submit_ext *se)
{
+ struct v3d_csd_job *job;
+ struct v3d_job *clean_job;
int ret;
- ret = v3d_job_allocate(v3d, (void *)job, sizeof(**job));
+ job = (struct v3d_csd_job *) v3d_submit_add_job(submit, V3D_CSD);
+ if (IS_ERR(job))
+ return PTR_ERR(job);
+
+ ret = v3d_job_add_syncobjs(&job->base, submit->file_priv, args->in_sync, se);
if (ret)
return ret;
- ret = v3d_job_init(v3d, file_priv, &(*job)->base,
- v3d_job_free, args->in_sync, se, V3D_CSD);
- if (ret) {
- v3d_job_deallocate((void *)job);
- return ret;
- }
+ job->args = *args;
- ret = v3d_job_allocate(v3d, (void *)clean_job, sizeof(**clean_job));
- if (ret)
- return ret;
+ clean_job = v3d_submit_add_job(submit, V3D_CACHE_CLEAN);
+ if (IS_ERR(clean_job))
+ return PTR_ERR(clean_job);
- ret = v3d_job_init(v3d, file_priv, *clean_job,
- v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
- if (ret) {
- v3d_job_deallocate((void *)clean_job);
- return ret;
- }
-
- (*job)->args = *args;
-
- ret = v3d_lookup_bos(&v3d->drm, file_priv, *clean_job,
- args->bo_handles, args->bo_handle_count);
- if (ret)
- return ret;
-
- return v3d_lock_bo_reservations(*clean_job, exec);
+ return v3d_lookup_bos(submit, args->bo_handles, args->bo_handle_count);
}
static void
@@ -1123,33 +1073,22 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_lookup_bos(dev, file_priv,
- submit.jobs[submit.job_count - 1],
- args->bo_handles, args->bo_handle_count);
+ ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
if (ret)
goto fail;
- ret = v3d_lock_bo_reservations(submit.jobs[submit.job_count - 1],
- &submit.exec);
+ ret = v3d_submit_lock_reservations(&submit);
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit);
+ ret = v3d_submit_jobs(&submit, args->out_sync, &se);
if (ret)
goto fail_unreserve;
- v3d_attach_fences_and_unlock_reservation(file_priv,
- submit.jobs[submit.job_count - 1],
- &submit.exec,
- args->out_sync, &se,
- submit.jobs[submit.job_count - 1]->done_fence);
-
- v3d_submit_put_jobs(&submit);
-
return 0;
fail_unreserve:
- drm_exec_fini(&submit.exec);
+ v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
@@ -1228,25 +1167,18 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
job->base.bo[job->base.bo_count] = bo;
}
- ret = v3d_lock_bo_reservations(&job->base, &submit.exec);
+ ret = v3d_submit_lock_reservations(&submit);
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit);
+ ret = v3d_submit_jobs(&submit, args->out_sync, &se);
if (ret)
goto fail_unreserve;
- v3d_attach_fences_and_unlock_reservation(file_priv,
- &job->base, &submit.exec,
- args->out_sync, &se,
- job->base.done_fence);
-
- v3d_submit_put_jobs(&submit);
-
return 0;
fail_unreserve:
- drm_exec_fini(&submit.exec);
+ v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
@@ -1270,8 +1202,6 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_csd *args = data;
struct v3d_submit_ext se = {0};
- struct v3d_csd_job *job = NULL;
- struct v3d_job *clean_job = NULL;
int ret;
trace_v3d_submit_csd_ioctl(dev, args->cfg[5], args->cfg[6]);
@@ -1297,34 +1227,26 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
}
- ret = v3d_setup_csd_jobs_and_bos(file_priv, submit.v3d, args,
- &job, &clean_job, &se, &submit.exec);
+ ret = v3d_setup_csd_jobs_and_bos(&submit, args, &se);
if (ret)
goto fail;
- submit.jobs[submit.job_count++] = &job->base;
- submit.jobs[submit.job_count++] = clean_job;
-
ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
if (ret)
- goto fail_unreserve;
+ goto fail;
- ret = v3d_submit_jobs(&submit);
+ ret = v3d_submit_lock_reservations(&submit);
+ if (ret)
+ goto fail;
+
+ ret = v3d_submit_jobs(&submit, args->out_sync, &se);
if (ret)
goto fail_unreserve;
- v3d_attach_fences_and_unlock_reservation(file_priv,
- clean_job,
- &submit.exec,
- args->out_sync, &se,
- clean_job->done_fence);
-
- v3d_submit_put_jobs(&submit);
-
return 0;
fail_unreserve:
- drm_exec_fini(&submit.exec);
+ v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
v3d_put_multisync_post_deps(&se);
@@ -1355,13 +1277,14 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
struct v3d_dev *v3d = to_v3d_dev(dev);
+ struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
+ struct v3d_submit indirect_submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_cpu *args = data;
struct v3d_submit_ext se = {0};
struct v3d_submit_ext *out_se = NULL;
struct v3d_cpu_job *cpu_job = NULL;
struct v3d_csd_job *csd_job = NULL;
struct v3d_job *clean_job = NULL;
- struct drm_exec exec;
int ret;
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
@@ -1369,9 +1292,9 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}
- ret = v3d_job_allocate(v3d, (void *)&cpu_job, sizeof(*cpu_job));
- if (ret)
- return ret;
+ cpu_job = (struct v3d_cpu_job *) v3d_submit_add_job(&submit, V3D_CPU);
+ if (IS_ERR(cpu_job))
+ return PTR_ERR(cpu_job);
if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
ret = v3d_get_extensions(file_priv, args->extensions, &se, cpu_job);
@@ -1396,34 +1319,35 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type);
- ret = v3d_job_init(v3d, file_priv, &cpu_job->base,
- v3d_cpu_job_free, 0, &se, V3D_CPU);
- if (ret) {
- v3d_job_deallocate((void *)&cpu_job);
+ ret = v3d_job_add_syncobjs(&cpu_job->base, file_priv, 0, &se);
+ if (ret)
goto fail;
- }
if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
- ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d,
+ ret = v3d_setup_csd_jobs_and_bos(&indirect_submit,
&cpu_job->indirect_csd.args,
- &cpu_job->indirect_csd.job,
- &cpu_job->indirect_csd.clean_job,
- NULL,
- &cpu_job->indirect_csd.exec);
+ NULL);
if (ret)
goto fail;
+
+ ret = v3d_submit_lock_reservations(&indirect_submit);
+ if (ret)
+ goto fail;
+
+ cpu_job->indirect_csd.job = container_of(indirect_submit.jobs[0],
+ struct v3d_csd_job, base);
+ cpu_job->indirect_csd.clean_job = indirect_submit.jobs[1];
+
+ clean_job = cpu_job->indirect_csd.clean_job;
+ csd_job = cpu_job->indirect_csd.job;
}
- clean_job = cpu_job->indirect_csd.clean_job;
- csd_job = cpu_job->indirect_csd.job;
-
if (args->bo_handle_count) {
- ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base,
- args->bo_handles, args->bo_handle_count);
+ ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
if (ret)
goto fail;
- ret = v3d_lock_bo_reservations(&cpu_job->base, &exec);
+ ret = v3d_submit_lock_reservations(&submit);
if (ret)
goto fail;
}
@@ -1455,36 +1379,28 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se;
- v3d_attach_fences_and_unlock_reservation(file_priv,
- &cpu_job->base,
- &exec, 0,
- out_se, cpu_job->base.done_fence);
+ v3d_attach_fences_and_unlock_reservation(&submit, 0, out_se);
switch (cpu_job->job_type) {
case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
- v3d_attach_fences_and_unlock_reservation(file_priv,
- clean_job,
- &cpu_job->indirect_csd.exec,
- 0, &se, clean_job->done_fence);
+ v3d_attach_fences_and_unlock_reservation(&indirect_submit, 0, &se);
break;
default:
break;
}
- v3d_job_put(&cpu_job->base);
- v3d_job_put(&csd_job->base);
- v3d_job_put(clean_job);
+ v3d_submit_put_jobs(&submit);
+ v3d_submit_put_jobs(&indirect_submit);
return 0;
fail_unreserve:
mutex_unlock(&v3d->sched_lock);
- drm_exec_fini(&exec);
- drm_exec_fini(&cpu_job->indirect_csd.exec);
+ v3d_submit_unlock_reservations(&submit);
+ v3d_submit_unlock_reservations(&indirect_submit);
fail:
- v3d_job_cleanup((void *)cpu_job);
- v3d_job_cleanup((void *)csd_job);
- v3d_job_cleanup(clean_job);
+ v3d_submit_cleanup_jobs(&submit);
+ v3d_submit_cleanup_jobs(&indirect_submit);
v3d_put_multisync_post_deps(&se);
return ret;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (9 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:58 ` sashiko-bot
2026-06-04 9:54 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling Maíra Canal
` (2 subsequent siblings)
13 siblings, 2 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Restructure the CPU ioctl so that all job types, including indirect CSD,
use a single struct v3d_submit chain and a single DRM exec context.
Now that v3d_get_cpu_indirect_csd_params() is a pure parser and the
submit helpers operate on struct v3d_submit, fold the indirect CSD path
into the standard flow by appending the CSD and CLEAN_CACHE jobs to the
same struct v3d_submit as the CPU job and locking the union of all jobs'
BOs under one drm_exec. This eliminates the second drm_exec, the nested
submission, and the conditional two-pass fence attachment that the CPU
ioctl previously required for the indirect CSD path.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_drv.h | 6 ---
drivers/gpu/drm/v3d/v3d_submit.c | 90 ++++++++++------------------------------
2 files changed, 23 insertions(+), 73 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 52ed0af67f5e..e91b1b83dc23 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -425,9 +425,6 @@ struct v3d_indirect_csd_info {
/* Indirect CSD */
struct v3d_csd_job *job;
- /* Clean cache job associated to the Indirect CSD job */
- struct v3d_job *clean_job;
-
/* Indirect CSD args, stashed by the extension parser and later used
* to create the CSD job from them.
*/
@@ -446,9 +443,6 @@ struct v3d_indirect_csd_info {
/* Indirect BO */
struct drm_gem_object *indirect;
-
- /* Context of the Indirect CSD job */
- struct drm_exec exec;
};
struct v3d_timestamp_query_info {
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 62c23feb8fbb..1df7ab528422 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -1276,15 +1276,10 @@ int
v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
- struct v3d_dev *v3d = to_v3d_dev(dev);
struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
- struct v3d_submit indirect_submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_cpu *args = data;
struct v3d_submit_ext se = {0};
- struct v3d_submit_ext *out_se = NULL;
struct v3d_cpu_job *cpu_job = NULL;
- struct v3d_csd_job *csd_job = NULL;
- struct v3d_job *clean_job = NULL;
int ret;
if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
@@ -1317,90 +1312,51 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
goto fail;
}
- trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type);
+ trace_v3d_submit_cpu_ioctl(dev, cpu_job->job_type);
ret = v3d_job_add_syncobjs(&cpu_job->base, file_priv, 0, &se);
if (ret)
goto fail;
- if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
- ret = v3d_setup_csd_jobs_and_bos(&indirect_submit,
- &cpu_job->indirect_csd.args,
- NULL);
- if (ret)
- goto fail;
-
- ret = v3d_submit_lock_reservations(&indirect_submit);
- if (ret)
- goto fail;
-
- cpu_job->indirect_csd.job = container_of(indirect_submit.jobs[0],
- struct v3d_csd_job, base);
- cpu_job->indirect_csd.clean_job = indirect_submit.jobs[1];
-
- clean_job = cpu_job->indirect_csd.clean_job;
- csd_job = cpu_job->indirect_csd.job;
- }
-
+ /* Look up the CPU jobs' BOs before v3d_setup_csd_jobs_and_bos() appends
+ * the CSD and clean jobs in the case of indirect CSD job.
+ */
if (args->bo_handle_count) {
ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
if (ret)
goto fail;
+ }
- ret = v3d_submit_lock_reservations(&submit);
+ if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
+ ret = v3d_setup_csd_jobs_and_bos(&submit, &cpu_job->indirect_csd.args,
+ NULL);
if (ret)
goto fail;
+
+ /* The CSD job was appended at jobs[1] */
+ if (WARN_ON(submit.jobs[1]->queue != V3D_CSD)) {
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ cpu_job->indirect_csd.job = container_of(submit.jobs[1], struct v3d_csd_job,
+ base);
}
- mutex_lock(&v3d->sched_lock);
- v3d_push_job(&cpu_job->base);
+ ret = v3d_submit_lock_reservations(&submit);
+ if (ret)
+ goto fail;
- switch (cpu_job->job_type) {
- case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
- ret = drm_sched_job_add_dependency(&csd_job->base.base,
- dma_fence_get(cpu_job->base.done_fence));
- if (ret)
- goto fail_unreserve;
-
- v3d_push_job(&csd_job->base);
-
- ret = drm_sched_job_add_dependency(&clean_job->base,
- dma_fence_get(csd_job->base.done_fence));
- if (ret)
- goto fail_unreserve;
-
- v3d_push_job(clean_job);
-
- break;
- default:
- break;
- }
- mutex_unlock(&v3d->sched_lock);
-
- out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se;
-
- v3d_attach_fences_and_unlock_reservation(&submit, 0, out_se);
-
- switch (cpu_job->job_type) {
- case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
- v3d_attach_fences_and_unlock_reservation(&indirect_submit, 0, &se);
- break;
- default:
- break;
- }
-
- v3d_submit_put_jobs(&submit);
- v3d_submit_put_jobs(&indirect_submit);
+ ret = v3d_submit_jobs(&submit, 0, &se);
+ if (ret)
+ goto fail_unreserve;
return 0;
fail_unreserve:
- mutex_unlock(&v3d->sched_lock);
v3d_submit_unlock_reservations(&submit);
- v3d_submit_unlock_reservations(&indirect_submit);
fail:
v3d_submit_cleanup_jobs(&submit);
- v3d_submit_cleanup_jobs(&indirect_submit);
v3d_put_multisync_post_deps(&se);
return ret;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (10 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 9:59 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls Maíra Canal
2026-06-03 22:25 ` [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs() Maíra Canal
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
v3d_attach_fences_and_unlock_reservation() does three different things:
(1) attaches the submission's last fence to every BO, (2) releases
drm_exec, and (3) replaces the userspace out_sync syncobjs. Decouple
these three behaviors into different functions, so that each function
has a more self-contained behavior.
v3d_submit_jobs() now invokes the three steps explicitly, which makes the
submission sequence self-documenting and keeps each helper self-contained.
No functional change; just code consolidation.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_submit.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 1df7ab528422..2beb99a25104 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -319,12 +319,9 @@ v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
}
static void
-v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
- u32 out_sync, struct v3d_submit_ext *se)
+v3d_submit_attach_object_fences(struct v3d_submit *submit)
{
- bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
- struct drm_syncobj *sync_out;
/* The submission's last fence covers the entire submission. Attach it
* to every BO touched by any job in the submission.
@@ -338,8 +335,15 @@ v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
DMA_RESV_USAGE_WRITE);
}
}
+}
- v3d_submit_unlock_reservations(submit);
+static void
+v3d_submit_process_post_deps(struct v3d_submit *submit, u32 out_sync,
+ struct v3d_submit_ext *se)
+{
+ bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
+ struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
+ struct drm_syncobj *sync_out;
/* Update the return sync object for the job */
/* If it only supports a single signal semaphore*/
@@ -400,7 +404,10 @@ v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
mutex_unlock(&v3d->sched_lock);
- v3d_attach_fences_and_unlock_reservation(submit, out_sync, se);
+ v3d_submit_attach_object_fences(submit);
+ v3d_submit_unlock_reservations(submit);
+ v3d_submit_process_post_deps(submit, out_sync, se);
+
v3d_submit_put_jobs(submit);
return 0;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (11 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 2:59 ` sashiko-bot
2026-06-04 10:12 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs() Maíra Canal
13 siblings, 2 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
v3d_submit_process_post_deps() looks up the out_sync syncobj via
drm_syncobj_find(), and if userspace passes a non-zero handle that doesn't
refer to a valid syncobj, the lookup silently returns NULL and the
post-deps step skips publishing the submission's last fence to it. The
ioctl still returns success, leaving userspace to wait on a invalid
syncobj.
Instead of silently ignoring an invalid non-zero out_sync, move the syncobj
lookup to the submission and make it fail with -ENOENT up front, mirroring
the syncobj validation already done for in_sync. Now,
v3d_submit_process_post_deps() only does the fence replacement.
Note that the lookup is skipped when the multi-sync extension is in use,
since args->out_sync is unused in that case.
To keep cleanup symmetric on error paths, convert the function
v3d_put_multisync_post_deps() into a single function that releases the
references that were acquired but never published for both single-sync
and multi-sync.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_submit.c | 50 +++++++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 2beb99a25104..dc27770d85fd 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -338,17 +338,15 @@ v3d_submit_attach_object_fences(struct v3d_submit *submit)
}
static void
-v3d_submit_process_post_deps(struct v3d_submit *submit, u32 out_sync,
+v3d_submit_process_post_deps(struct v3d_submit *submit, struct drm_syncobj *sync_out,
struct v3d_submit_ext *se)
{
bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
- struct drm_syncobj *sync_out;
/* Update the return sync object for the job */
/* If it only supports a single signal semaphore*/
if (!has_multisync) {
- sync_out = drm_syncobj_find(submit->file_priv, out_sync);
if (sync_out) {
drm_syncobj_replace_fence(sync_out, last_job->done_fence);
drm_syncobj_put(sync_out);
@@ -381,7 +379,7 @@ v3d_push_job(struct v3d_job *job)
}
static int
-v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
+v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
struct v3d_submit_ext *se)
{
struct v3d_dev *v3d = submit->v3d;
@@ -406,7 +404,7 @@ v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
v3d_submit_attach_object_fences(submit);
v3d_submit_unlock_reservations(submit);
- v3d_submit_process_post_deps(submit, out_sync, se);
+ v3d_submit_process_post_deps(submit, sync_out, se);
v3d_submit_put_jobs(submit);
@@ -444,10 +442,13 @@ v3d_setup_csd_jobs_and_bos(struct v3d_submit *submit,
}
static void
-v3d_put_multisync_post_deps(struct v3d_submit_ext *se)
+v3d_submit_put_post_deps(struct drm_syncobj *sync_out, struct v3d_submit_ext *se)
{
unsigned int i;
+ if (sync_out)
+ drm_syncobj_put(sync_out);
+
if (!(se && se->out_sync_count))
return;
@@ -1006,6 +1007,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
{
struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_cl *args = data;
+ struct drm_syncobj *sync_out = NULL;
struct v3d_submit_ext se = {0};
struct v3d_bin_job *bin = NULL;
struct v3d_render_job *render = NULL;
@@ -1032,6 +1034,12 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
}
+ if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
+ sync_out = drm_syncobj_find(file_priv, args->out_sync);
+ if (!sync_out)
+ return -ENOENT;
+ }
+
if (args->bcl_start != args->bcl_end) {
bin = (struct v3d_bin_job *) v3d_submit_add_job(&submit, V3D_BIN);
if (IS_ERR(bin)) {
@@ -1088,7 +1096,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit, args->out_sync, &se);
+ ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
goto fail_unreserve;
@@ -1098,7 +1106,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
- v3d_put_multisync_post_deps(&se);
+ v3d_submit_put_post_deps(sync_out, &se);
return ret;
}
@@ -1118,6 +1126,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
{
struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_tfu *args = data;
+ struct drm_syncobj *sync_out = NULL;
struct v3d_submit_ext se = {0};
struct v3d_tfu_job *job = NULL;
int ret = 0;
@@ -1137,6 +1146,12 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
}
}
+ if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
+ sync_out = drm_syncobj_find(file_priv, args->out_sync);
+ if (!sync_out)
+ return -ENOENT;
+ }
+
job = (struct v3d_tfu_job *) v3d_submit_add_job(&submit, V3D_TFU);
if (IS_ERR(job)) {
ret = PTR_ERR(job);
@@ -1178,7 +1193,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit, args->out_sync, &se);
+ ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
goto fail_unreserve;
@@ -1188,7 +1203,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
- v3d_put_multisync_post_deps(&se);
+ v3d_submit_put_post_deps(sync_out, &se);
return ret;
}
@@ -1208,6 +1223,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
{
struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
struct drm_v3d_submit_csd *args = data;
+ struct drm_syncobj *sync_out = NULL;
struct v3d_submit_ext se = {0};
int ret;
@@ -1234,6 +1250,12 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
}
}
+ if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
+ sync_out = drm_syncobj_find(file_priv, args->out_sync);
+ if (!sync_out)
+ return -ENOENT;
+ }
+
ret = v3d_setup_csd_jobs_and_bos(&submit, args, &se);
if (ret)
goto fail;
@@ -1246,7 +1268,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit, args->out_sync, &se);
+ ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
goto fail_unreserve;
@@ -1256,7 +1278,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
- v3d_put_multisync_post_deps(&se);
+ v3d_submit_put_post_deps(sync_out, &se);
return ret;
}
@@ -1354,7 +1376,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
if (ret)
goto fail;
- ret = v3d_submit_jobs(&submit, 0, &se);
+ ret = v3d_submit_jobs(&submit, NULL, &se);
if (ret)
goto fail_unreserve;
@@ -1364,7 +1386,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
- v3d_put_multisync_post_deps(&se);
+ v3d_submit_put_post_deps(NULL, &se);
return ret;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs()
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
` (12 preceding siblings ...)
2026-06-03 22:25 ` [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls Maíra Canal
@ 2026-06-03 22:25 ` Maíra Canal
2026-06-04 10:25 ` Tvrtko Ursulin
13 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-03 22:25 UTC (permalink / raw)
To: Melissa Wen, Iago Toral, Tvrtko Ursulin, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel, Maíra Canal
Currently, v3d_submit_jobs() arms and pushes each job one at a time,
wiring dependencies between consecutive jobs after each push. If
drm_sched_job_add_dependency() fails midway, the already-pushed jobs are
scheduler-owned and will be submitted to the GPU for execution, even though
the subsequent jobs won't be submitted.
This breaks the atomicity of the submissions, as only some of the jobs
from a submission would be submitted to the hardware, while the other part
fails.
Restructure v3d_submit_jobs() into three phases: (1) arm all jobs belonging
to a given submission, (2) wire inter-job dependencies, and (3) push all
jobs to the scheduler unconditionally. Phase (2) can fail; on failure,
it marks every armed job finished fence with an error, so that run_job()
callbacks skip hardware execution.
This guarantees that every armed job is always pushed, either to run
or to be skipped, and it also ensures the atomicity of a submission.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
drivers/gpu/drm/v3d/v3d_sched.c | 6 ++++
drivers/gpu/drm/v3d/v3d_submit.c | 65 ++++++++++++++++++++--------------------
2 files changed, 39 insertions(+), 32 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 04fd1a365576..c16a9d4d41e6 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -656,6 +656,9 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
struct v3d_cpu_job *job = to_cpu_job(sched_job);
struct v3d_dev *v3d = job->base.v3d;
+ if (unlikely(job->base.base.s_fence->finished.error))
+ return NULL;
+
if (job->job_type >= ARRAY_SIZE(cpu_job_function)) {
drm_dbg(&v3d->drm, "Unknown CPU job: %d\n", job->job_type);
return NULL;
@@ -679,6 +682,9 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
struct v3d_job *job = to_v3d_job(sched_job);
struct v3d_dev *v3d = job->v3d;
+ if (unlikely(job->base.s_fence->finished.error))
+ return NULL;
+
v3d_job_start_stats(job);
v3d_clean_caches(v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index dc27770d85fd..e118ba69e308 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -365,19 +365,6 @@ v3d_submit_process_post_deps(struct v3d_submit *submit, struct drm_syncobj *sync
}
}
-static void
-v3d_push_job(struct v3d_job *job)
-{
- drm_sched_job_arm(&job->base);
-
- job->done_fence = dma_fence_get(&job->base.s_fence->finished);
-
- /* put by scheduler job completion */
- kref_get(&job->refcount);
-
- drm_sched_entity_push_job(&job->base);
-}
-
static int
v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
struct v3d_submit_ext *se)
@@ -390,16 +377,23 @@ v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
for (int i = 0; i < submit->job_count; i++) {
struct v3d_job *job = submit->jobs[i];
- v3d_push_job(job);
+ drm_sched_job_arm(&job->base);
+ job->done_fence = dma_fence_get(&job->base.s_fence->finished);
- if (i + 1 < submit->job_count) {
- ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
- dma_fence_get(job->done_fence));
- if (ret)
- goto err;
- }
+ /* put by scheduler job completion */
+ kref_get(&job->refcount);
}
+ for (int i = 0; i + 1 < submit->job_count; i++) {
+ ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
+ dma_fence_get(submit->jobs[i]->done_fence));
+ if (ret)
+ goto err;
+ }
+
+ for (int i = 0; i < submit->job_count; i++)
+ drm_sched_entity_push_job(&submit->jobs[i]->base);
+
mutex_unlock(&v3d->sched_lock);
v3d_submit_attach_object_fences(submit);
@@ -411,7 +405,18 @@ v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
return 0;
err:
+ /* Mark every armed job as failed so run_job() skips execution */
+ for (int i = 0; i < submit->job_count; i++)
+ dma_fence_set_error(&submit->jobs[i]->base.s_fence->finished, ret);
+
+ for (int i = 0; i < submit->job_count; i++)
+ drm_sched_entity_push_job(&submit->jobs[i]->base);
+
mutex_unlock(&v3d->sched_lock);
+
+ v3d_submit_unlock_reservations(submit);
+ v3d_submit_put_jobs(submit);
+
return ret;
}
@@ -1098,14 +1103,13 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
- goto fail_unreserve;
+ goto fail_submit;
return 0;
-fail_unreserve:
- v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
+fail_submit:
v3d_submit_put_post_deps(sync_out, &se);
return ret;
@@ -1195,14 +1199,13 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
- goto fail_unreserve;
+ goto fail_submit;
return 0;
-fail_unreserve:
- v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
+fail_submit:
v3d_submit_put_post_deps(sync_out, &se);
return ret;
@@ -1270,14 +1273,13 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
ret = v3d_submit_jobs(&submit, sync_out, &se);
if (ret)
- goto fail_unreserve;
+ goto fail_submit;
return 0;
-fail_unreserve:
- v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
+fail_submit:
v3d_submit_put_post_deps(sync_out, &se);
return ret;
@@ -1378,14 +1380,13 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
ret = v3d_submit_jobs(&submit, NULL, &se);
if (ret)
- goto fail_unreserve;
+ goto fail_submit;
return 0;
-fail_unreserve:
- v3d_submit_unlock_reservations(&submit);
fail:
v3d_submit_cleanup_jobs(&submit);
+fail_submit:
v3d_submit_put_post_deps(NULL, &se);
return ret;
--
2.54.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization
2026-06-03 22:25 ` [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization Maíra Canal
@ 2026-06-04 2:49 ` sashiko-bot
0 siblings, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:49 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Critical] Decoupling the DMA fence lock from the queue lock breaks implicit synchronization, causing a Use-After-Free of `bin_job` in `v3d_overflow_mem_work()`.
--
commit f6b0f3208abbe884d00ac4c3eab910833b936a7f
Author: Maíra Canal <mcanal@igalia.com>
Subject: drm/v3d: Use inline lock for dma fence initialization
This patch changes dma_fence_init() to accept a NULL external lock,
falling back to an inline spinlock embedded in the fence itself. This
decouples the lock and the fence lifetime, allowing queue_lock to use
spin_(un)lock() instead of the IRQ variants in all its uses.
> diff --git a/drivers/gpu/drm/v3d/v3d_fence.c b/drivers/gpu/drm/v3d/v3d_fence.c
> index c500136d04557..9b1a882a4c159 100644
> --- a/drivers/gpu/drm/v3d/v3d_fence.c
> +++ b/drivers/gpu/drm/v3d/v3d_fence.c
> @@ -15,7 +15,7 @@ struct dma_fence *v3d_fence_create(struct v3d_dev *v3d, enum v3d_queue q)
> fence->dev = &v3d->drm;
> fence->queue = q;
> fence->seqno = ++queue->emit_seqno;
> - dma_fence_init(&fence->base, &v3d_fence_ops, &queue->queue_lock,
> + dma_fence_init(&fence->base, &v3d_fence_ops, NULL,
[Severity: Critical]
Does decoupling the DMA fence lock from the queue lock here break implicit
synchronization and cause a use-after-free of bin_job in
v3d_overflow_mem_work()?
It looks like v3d_overflow_mem_work() runs in a workqueue and accesses
bin_job = (struct v3d_bin_job *)queue->active_job under queue_lock:
drivers/gpu/drm/v3d/v3d_irq.c:v3d_overflow_mem_work() {
...
spin_lock_irqsave(&queue->queue_lock, irq_flags);
...
list_add_tail(&bo->unref_head, &bin_job->render->unref_list);
...
}
Meanwhile, the interrupt handler v3d_irq_signal_fence() clears
queue->active_job and signals the job's fence:
drivers/gpu/drm/v3d/v3d_irq.c:v3d_irq_signal_fence() {
...
queue->active_job = NULL;
...
dma_fence_signal(&fence->base);
...
}
Before this patch, the fence's external lock was set to &queue->queue_lock,
so dma_fence_signal() would block on queue_lock, preventing the DRM scheduler
from waking up and freeing the job while v3d_overflow_mem_work() was executing.
By changing the fence's external lock to NULL, dma_fence_signal() uses its
own inline lock and returns immediately. Could this allow the DRM scheduler's
wait queue to wake up and free bin_job concurrently, leading to a
use-after-free when v3d_overflow_mem_work() dereferences bin_job->render?
> queue->fence_context, fence->seqno);
>
> return &fence->base;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=3
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails
2026-06-03 22:25 ` [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails Maíra Canal
@ 2026-06-04 2:50 ` sashiko-bot
0 siblings, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:50 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] The `run_job` callbacks return `NULL` instead of propagating the error pointer when `v3d_fence_create()` fails, masking submission failures and causing the DRM scheduler to treat the job as successfully completed.
--
commit f3ab1695e1c90559d6118895b31f3b02289aee81
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Clear queue->active_job when v3d_fence_create() fails
This commit clears queue->active_job in the run_job callbacks for BIN, RENDER,
TFU, and CSD when v3d_fence_create() fails. It creates a failure path to
prevent dangling pointers when returning from a failed job submission.
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 0e266b29317f3..46f4fc09c59ec 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
[ ... ]
> @@ -214,7 +210,7 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
>
> fence = v3d_fence_create(v3d, V3D_BIN);
> if (IS_ERR(fence))
> - return NULL;
> + goto out_clean_job;
[ ... ]
> @@ -242,6 +238,12 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
> V3D_CORE_WRITE(0, V3D_CLE_CT0QEA, job->end);
>
> return fence;
> +
> +out_clean_job:
> + spin_lock_irqsave(&queue->queue_lock, irqflags);
> + queue->active_job = NULL;
> + spin_unlock_irqrestore(&queue->queue_lock, irqflags);
> + return NULL;
[Severity: High]
This is a pre-existing issue, but does returning NULL here mask the
submission failure from the DRM scheduler?
When v3d_fence_create() fails, it returns an error pointer. If the
v3d_bin_job_run() callback (and the other run_job callbacks modified
in this patch) returns NULL instead of propagating this error pointer,
the DRM scheduler treats the job as having completed synchronously and
successfully.
Could this allow dependent jobs to execute as if the hardware successfully
processed the command lists, potentially leading to GPU page faults or
userspace crashes? Should the callbacks propagate the error pointer back
to the scheduler instead?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=2
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls
2026-06-03 22:25 ` [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls Maíra Canal
@ 2026-06-04 2:51 ` sashiko-bot
0 siblings, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:51 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Missing bounds check on userspace-controlled `in_sync_count` and `out_sync_count`, potentially leading to soft lockups or unbounded memory allocation attempts.
--
commit 99860a2dd307ceb7455992b876495c7efeeb7909
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Reject invalid syncobj handles in submit ioctls
This commit modifies v3d_job_add_syncobjs() to properly propagate error
codes when a non-zero syncobj handle doesn't resolve, while continuing to
ignore zero handles.
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 8250376d104c5..0babe2e672660 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
[ ... ]
> @@ -208,11 +207,13 @@ v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
> return -EFAULT;
> }
>
> - ret = drm_sched_job_add_syncobj_dependency(&job->base,
> - file_priv, in.handle, 0);
> - // TODO: Investigate why this was filtered out for the IOCTL.
> - if (ret && ret != -ENOENT)
> - return ret;
> + /* Ignore syncobj if its handle is zero */
> + if (in.handle) {
> + ret = drm_sched_job_add_syncobj_dependency(&job->base,
> + file_priv, in.handle, 0);
> + if (ret)
> + return ret;
> + }
> }
> }
>
[Severity: High]
This is a pre-existing issue, but is se->in_sync_count bounded by any
upper limit?
This code is executing inside a loop over the userspace-supplied
se->in_sync_count:
v3d_job_add_syncobjs() {
...
for (int i = 0; i < se->in_sync_count; i++) {
struct drm_v3d_sem in;
if (copy_from_user(&in, handle++, sizeof(in))) {
...
}
Since there is no cond_resched() in this loop, could a maliciously large
in_sync_count value allow a user to monopolize the processor and cause a
soft lockup?
Similarly, is se->out_sync_count bounded before it is used for memory
allocation in v3d_get_multisync_post_deps()?
v3d_get_multisync_post_deps() {
...
se->out_syncs = (struct v3d_submit_outsync *)
kvmalloc_objs(struct v3d_submit_outsync, count);
if (!se->out_syncs)
...
}
Could an unbounded out_sync_count lead to a massive memory allocation
request, potentially exhausting physical memory?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=6
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock()
2026-06-03 22:25 ` [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock() Maíra Canal
@ 2026-06-04 2:51 ` sashiko-bot
0 siblings, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:51 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Removing `spin_lock_irqsave` exposes a uniprocessor race and obstructs the correct fix for an existing SMP use-after-free race.
- [Low] Mixing goto-based error handling and scope-based cleanup helpers (`scoped_guard()`) in the same function violates subsystem guidelines.
--
commit 7d41ab716d6dcf438f92d8905f5e28343e3ab169
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Replace spin_lock_irqsave() with spin_lock()
This commit replaces spin_lock_irqsave() and spin_unlock_irqrestore() with
the scoped_guard() macro using plain spinlocks in v3d_overflow_mem_work()
and v3d_bin_job_run().
> diff --git a/drivers/gpu/drm/v3d/v3d_irq.c b/drivers/gpu/drm/v3d/v3d_irq.c
> index 86efaef2722c3..754a969b862b6 100644
> --- a/drivers/gpu/drm/v3d/v3d_irq.c
> +++ b/drivers/gpu/drm/v3d/v3d_irq.c
[ ... ]
> @@ -64,17 +63,15 @@ v3d_overflow_mem_work(struct work_struct *work)
> * bin job got scheduled, that's fine. We'll just give them
> * some binner pool anyway.
> */
> - spin_lock_irqsave(&queue->queue_lock, irqflags);
> - bin_job = (struct v3d_bin_job *)queue->active_job;
> + scoped_guard(spinlock, &queue->queue_lock) {
[Severity: High]
Does removing the IRQ-disabling protection here cement a broken locking design?
The commit message notes the hard IRQ handler doesn't take this lock, but
v3d_irq_signal_fence() modifies queue->active_job locklessly. Because of this
missing lock, an SMP use-after-free race already exists where the DRM
scheduler can free the job while v3d_overflow_mem_work() is dereferencing
bin_job->render.
By switching from spin_lock_irqsave() to spin_lock(), this exposes a
uniprocessor race where the IRQ can preempt the workqueue's critical section.
Furthermore, if someone fixes the SMP race by adding
spin_lock(&queue->queue_lock) to the IRQ handler, won't the system deadlock
when the IRQ interrupts this workqueue?
> + bin_job = (struct v3d_bin_job *)queue->active_job;
>
> - if (!bin_job) {
> - spin_unlock_irqrestore(&queue->queue_lock, irqflags);
> - goto out;
> - }
> + if (!bin_job)
> + goto out;
[Severity: Low]
Is it ok to mix goto-based error handling with scope-based cleanup helpers?
The kernel cleanup guidelines explicitly forbid mixing goto and scope-based
cleanup helpers in the same function because it creates confusing ownership
semantics.
While scoped_guard() is introduced here for lock management, the function
retains explicit goto statements like this one to reach the out label for
drm_gem_object_put().
>
> - drm_gem_object_get(obj);
> - list_add_tail(&bo->unref_head, &bin_job->render->unref_list);
> - spin_unlock_irqrestore(&queue->queue_lock, irqflags);
> + drm_gem_object_get(obj);
> + list_add_tail(&bo->unref_head, &bin_job->render->unref_list);
> + }
>
> v3d_mmu_flush_all(v3d);
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 46f4fc09c59ec..04fd1a365576b 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
[ ... ]
> @@ -240,9 +239,9 @@ static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
> return fence;
>
> out_clean_job:
> - spin_lock_irqsave(&queue->queue_lock, irqflags);
> - queue->active_job = NULL;
> - spin_unlock_irqrestore(&queue->queue_lock, irqflags);
> + scoped_guard(spinlock, &queue->queue_lock) {
> + queue->active_job = NULL;
> + }
> return NULL;
> }
[Severity: Low]
Does this also violate the cleanup guidelines against mixing goto and
scope-based cleanup?
Similar to v3d_overflow_mem_work(), this function introduces scoped_guard()
while still retaining the out_clean_job label that is reached via goto
statements earlier in the function.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=4
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser
2026-06-03 22:25 ` [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser Maíra Canal
@ 2026-06-04 2:53 ` sashiko-bot
0 siblings, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:53 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Error path cleans up DRM scheduler jobs that have already been pushed, causing Use-After-Free and scheduler corruption.
- [Critical] Uninitialized locking context is passed to fini on the success path for CPU jobs without BOs, leading to a kernel panic.
- [High] BO reservation locks acquired for `clean_job` are leaked if subsequent validation fails.
- [High] Userspace can trigger a kernel deadlock by specifying the same BO handle in both the CPU job and the CSD job.
- [High] GFP_KERNEL allocation while holding dma_resv locks causes an AB-BA deadlock with memory reclaim.
--
commit e7875ac22f6ebc1c0172f365953ce0df35e9465b
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser
This commit modifies v3d_get_cpu_indirect_csd_params() to only parse the
indirect CSD extension and stash the arguments. The creation of CSD and
clean jobs, along with their BO locking, is moved into
v3d_submit_cpu_ioctl() to prepare for the unified submission chain.
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index fcaf3a6cfddc6..64eba912dc646 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
[ ... ]
> @@ -1404,6 +1403,17 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> goto fail;
> }
>
> + if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> + ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d,
> + &cpu_job->indirect_csd.args,
> + &cpu_job->indirect_csd.job,
> + &cpu_job->indirect_csd.clean_job,
> + NULL,
> + &cpu_job->indirect_csd.exec);
> + if (ret)
> + goto fail;
> + }
> +
[Severity: High]
This isn't a bug introduced by this patch, but does this create a lock leak
on error paths?
If v3d_setup_csd_jobs_and_bos() successfully locks BO reservations using
cpu_job->indirect_csd.exec, and later in v3d_submit_cpu_ioctl() a failure
occurs (for example, in v3d_lookup_bos() or v3d_lock_bo_reservations()),
the error handling jumps to fail:
v3d_submit_cpu_ioctl() {
...
if (ret)
goto fail;
...
}
This appears to bypass the fail_unreserve: label, leaving the ww_mutex and
dma_resv locks permanently held. Could this cause subsequent GPU operations
attempting to use these BOs to stall indefinitely?
[Severity: High]
This is a pre-existing issue, but could this lead to a deadlock if userspace
specifies the same BO handle in both the CPU job and the CSD job?
When v3d_submit_cpu_ioctl() uses two separate locking contexts in the same
thread, the first context acquires the WW mutexes for the CSD job's BOs:
ret = v3d_setup_csd_jobs_and_bos(..., &cpu_job->indirect_csd.exec);
The second locking context then attempts to acquire WW mutexes for the CPU
job's BOs:
ret = v3d_lock_bo_reservations(&cpu_job->base, &acquire_ctx);
If the same BO is specified in both jobs, would the second context (younger
transaction) block indefinitely attempting to acquire a WW mutex already held
by the first context, leading to an unrecoverable self-deadlock?
[Severity: Critical]
This is a pre-existing issue, but is it safe to clean up DRM scheduler jobs
that have already been pushed?
In v3d_submit_cpu_ioctl(), once jobs are pushed to the DRM scheduler via
v3d_push_job(), the scheduler takes ownership of their lifecycle.
If a subsequent operation fails:
ret = drm_sched_job_add_dependency(&clean_job->base, ...);
if (ret)
goto fail_unreserve;
The error path jumps to fail_unreserve: and then fail:, calling
v3d_job_cleanup() on the already-pushed jobs.
Does calling drm_sched_job_cleanup() on a pushed job destroy its s_fence and
dependencies xarray while the background scheduler thread might still be
accessing them, potentially causing a use-after-free and scheduler state
corruption?
[Severity: Critical]
This isn't a bug introduced by this patch, but is it possible for an
uninitialized locking context to be passed to the cleanup functions?
In v3d_submit_cpu_ioctl(), the local locking context is bypassed if the CPU
job has no BOs:
if (args->bo_handle_count) {
...
ret = v3d_lock_bo_reservations(&cpu_job->base, &acquire_ctx);
}
The function then proceeds to the success path and calls:
drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
Could this result in passing uninitialized stack memory to drm_exec_fini() or
ww_acquire_fini(), causing the cleanup function to dereference and free
garbage pointers?
[Severity: High]
This is a pre-existing issue, but does this allocation trigger an AB-BA
deadlock with memory reclaim?
In v3d_submit_cpu_ioctl(), dependencies are added while BOs are locked:
ret = drm_sched_job_add_dependency(&csd_job->base.base,
dma_fence_get(cpu_job->base.done_fence));
This internally uses xa_alloc() with the GFP_KERNEL flag while the thread
holds dma_resv locks acquired by v3d_lock_bo_reservations().
If system memory is tight, could the memory allocator trigger the shrinker,
which would attempt to evict GEM objects and acquire dma_resv locks, leading
to a deadlock?
> clean_job = cpu_job->indirect_csd.clean_job;
> csd_job = cpu_job->indirect_csd.job;
>
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=9
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-03 22:25 ` [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls Maíra Canal
@ 2026-06-04 2:54 ` sashiko-bot
2026-06-04 8:58 ` Tvrtko Ursulin
1 sibling, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:54 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Earlier jobs in the submission chain (BIN and RENDER) do not wait for implicit BO dependencies, leading to concurrent hardware access.
--
commit c2a10c5b0fe7c3f03c50306d4d031d5c5d2ba3a6
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
As the V3D driver grew with time, different types of submission were added
and the submission code grew more complex, but the driver stuck with the
same abstractions.
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index d80afbf2c03be..fcaf3a6cfddc6 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
[ ... ]
> @@ -984,99 +1088,71 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
[ ... ]
> - ret = v3d_lookup_bos(dev, file_priv, last_job,
> - args->bo_handles, args->bo_handle_count);
> - if (ret)
> + ret = v3d_lookup_bos(dev, file_priv,
> + submit.jobs[submit.job_count - 1],
> + args->bo_handles, args->bo_handle_count);
> + if (ret)
> goto fail;
>
> - ret = v3d_lock_bo_reservations(last_job, &exec);
> - if (ret)
> - goto fail;
> + ret = v3d_lock_bo_reservations(submit.jobs[submit.job_count - 1],
> + &submit.exec);
> + if (ret)
> + goto fail;
[Severity: High]
This is a pre-existing issue, but since earlier jobs in the submission chain
(like BIN and RENDER) don't have these BOs associated with them, they
receive no implicit fences.
Because v3d_lookup_bos() only associates the BOs with the last job
(submit.jobs[submit.job_count - 1]), the earlier jobs have their bo_count
set to 0. Consequently, they skip adding implicit dependencies when
reservations are processed. Even if a subsequent patch updates the code
to iterate over all jobs to add dependencies, earlier jobs will still be
skipped because their job->bo_count is 0.
Could this cause the earlier jobs to be pushed to the hardware immediately,
allowing them to read BOs while they are still being written by previous
operations from other processes or hardware?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=8
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain
2026-06-03 22:25 ` [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain Maíra Canal
@ 2026-06-04 2:58 ` sashiko-bot
2026-06-04 9:54 ` Tvrtko Ursulin
1 sibling, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:58 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Missing validation of `drm_gem_object_lookup()` return value leads to a deterministic kernel crash.
- [Critical] Calling `vmap()` on imported PRIME BOs panics the kernel due to a NULL pages array.
- [High] Calling `drm_sched_job_cleanup()` on jobs that have already been pushed to the DRM scheduler causes a use-after-free and corrupts the scheduler's internal state.
- [High] Implicit BO fences are only added to the last job in a submission, causing earlier GPU jobs to execute prematurely without waiting for synchronization.
--
commit 572410b6778693137514a8800e2af096467fd0aa
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Refactor CPU ioctl into unified submission chain
Restructure the CPU ioctl so that all job types, including indirect CSD,
use a single struct v3d_submit chain and a single DRM exec context.
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 62c23feb8fbba..1df7ab528422a 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -1317,90 +1312,51 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
[ ... ]
> + /* Look up the CPU jobs' BOs before v3d_setup_csd_jobs_and_bos() appends
> + * the CSD and clean jobs in the case of indirect CSD job.
> + */
> if (args->bo_handle_count) {
> ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
[Severity: High]
This isn't a bug introduced by this patch, but it looks like BOs might be
attached exclusively to the last job in the submission.
v3d_lookup_bos() assigns the user-provided BO array and bo_count to
submit->jobs[submit->job_count - 1]. Later, v3d_submit_lock_reservations()
iterates over the jobs and calls drm_sched_job_add_implicit_dependencies()
only for jobs with bo_count > 0.
Consequently, earlier jobs in the submission, such as the BIN job in a CL
submission or the CSD job in an indirect CSD submission, receive no
implicit dependencies for the BOs they access. Could this allow them to be
scheduled and executed immediately, racing with other processes or queues
currently writing to those BOs on the GPU?
[Severity: Critical]
This is a pre-existing issue, but I am wondering if mapping these BOs properly
handles imported PRIME BOs.
v3d_get_bo_vaddr() maps GEM objects into kernel space by calling
vmap(obj->pages, ...). However, for PRIME BOs imported via
v3d_prime_import_sg_table(), the drm_gem_shmem backend does not populate the
obj->pages array.
By bypassing the drm_gem_shmem_vmap() helper and directly accessing obj->pages,
v3d_get_bo_vaddr() passes NULL as the pages array to vmap(). Can a local
user trigger a kernel panic by importing a DMA-buf, passing it as a buffer to
a CPU job, and causing v3d_get_bo_vaddr() to crash?
> if (ret)
> goto fail;
[ ... ]
> + if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> + ret = v3d_setup_csd_jobs_and_bos(&submit, &cpu_job->indirect_csd.args,
> + NULL);
[Severity: Critical]
This is a pre-existing issue, but shouldn't the return value of
drm_gem_object_lookup() be validated when processing the indirect CSD params?
In v3d_get_cpu_indirect_csd_params(), drm_gem_object_lookup() is called
with a user-provided handle and the return value is assigned directly to
info->indirect without checking for NULL.
If userspace passes an invalid handle, NULL is stored. Later, when the CPU job
executes, v3d_rewrite_csd_job_wg_counts_from_indirect() calls
v3d_get_bo_vaddr(to_v3d_bo(indirect_csd->indirect)).
Would this pass a NULL pointer to v3d_get_bo_vaddr(), which unconditionally
accesses obj->pages, resulting in a deterministic kernel panic?
[ ... ]
> + ret = v3d_submit_jobs(&submit, 0, &se);
> + if (ret)
> + goto fail_unreserve;
>
> return 0;
>
> fail_unreserve:
> v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
[Severity: High]
This isn't a bug introduced by this patch, but does calling
v3d_submit_cleanup_jobs() here clean up jobs that have already been pushed?
In v3d_submit_jobs(), jobs are pushed to the DRM scheduler via v3d_push_job().
If a subsequent call to drm_sched_job_add_dependency() fails for a later job
in the submission chain, the function returns an error without unwinding the
already-pushed jobs.
The caller then branches to this error path that invokes
v3d_submit_cleanup_jobs(), which iterates over all jobs and calls
v3d_job_cleanup() -> drm_sched_job_cleanup().
Calling drm_sched_job_cleanup() on a job that the scheduler already owns
violates the scheduler API contract. It sets job->s_fence = NULL and destroys
the dependencies XArray while the scheduler thread might be executing the job.
Can this lead to a NULL pointer dereference and memory corruption?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=11
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls
2026-06-03 22:25 ` [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls Maíra Canal
@ 2026-06-04 2:59 ` sashiko-bot
2026-06-04 10:12 ` Tvrtko Ursulin
1 sibling, 0 replies; 32+ messages in thread
From: sashiko-bot @ 2026-06-04 2:59 UTC (permalink / raw)
To: Maíra Canal; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Resource leak (memory and drm_syncobj references) when parsing chained user-space extensions.
--
commit 42106e7ee90a922dbfb05df35206e0f77fc80751
Author: Maíra Canal <mcanal@igalia.com>
drm/v3d: Move out_sync syncobj lookup to submission
The commit moves the out_sync sync object lookup to the submission phase,
failing with -ENOENT if userspace passes a non-zero handle that doesn't
refer to a valid syncobj. It also consolidates the cleanup routines on
error paths for both single-sync and multi-sync implementations.
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 2beb99a25104..dc27770d85fd 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
[ ... ]
> @@ -1032,6 +1034,12 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> }
> }
>
[Severity: High]
This isn't a bug introduced by this patch, but can the direct returns when
parsing extensions cause a resource leak?
In v3d_submit_cl_ioctl(), if userspace provides a valid
DRM_V3D_EXT_ID_MULTI_SYNC extension followed by an invalid one (e.g., an
unknown ID), v3d_get_extensions() will successfully allocate se.out_syncs
and acquire drm_syncobj references for the first extension, but then fail
on the second extension and return an error code.
Because the code directly returns here instead of jumping to a cleanup label,
it bypasses v3d_submit_put_post_deps(), leaking both the memory and the
drm_syncobj references.
Could this be triggered repeatedly by userspace chaining a valid multi-sync
extension with an invalid one, potentially exhausting kernel memory and file
descriptors?
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> if (args->bcl_start != args->bcl_end) {
> bin = (struct v3d_bin_job *) v3d_submit_add_job(&submit, V3D_BIN);
> if (IS_ERR(bin)) {
[ ... ]
> @@ -1137,6 +1146,12 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> }
> }
>
[Severity: High]
This is a pre-existing issue, but the same resource leak when
v3d_get_extensions() fails appears here in v3d_submit_tfu_ioctl() as well.
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> job = (struct v3d_tfu_job *) v3d_submit_add_job(&submit, V3D_TFU);
> if (IS_ERR(job)) {
> ret = PTR_ERR(job);
[ ... ]
> @@ -1234,6 +1250,12 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> }
> }
>
[Severity: High]
This is a pre-existing issue, but the same resource leak when
v3d_get_extensions() fails also appears here in v3d_submit_csd_ioctl().
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> ret = v3d_setup_csd_jobs_and_bos(&submit, args, &se);
> if (ret)
> goto fail;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-v3d-sched-misc-fixes-v3-0-d7114bba55a0@igalia.com?part=13
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-03 22:25 ` [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls Maíra Canal
2026-06-04 2:54 ` sashiko-bot
@ 2026-06-04 8:58 ` Tvrtko Ursulin
2026-06-04 11:52 ` Maíra Canal
1 sibling, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 8:58 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> As the V3D driver grew with time, different types of submission were added
> and the submission code grew more complex, but the driver stuck with the
> same abstractions.
>
> Nowadays, the submission ioctls don't submit a single job, but a
> chain of jobs:
>
> 1. v3d_submit_cl_ioctl() submits a BIN job (optional), RENDER job
> (mandatory), and a CLEAN_CACHE job (optional).
> 2. v3d_submit_csd_ioctl() submits a CSD, and a CLEAN_CACHE job.
> 3. v3d_submit_tfu_ioctl() submits a TFU job.
>
> Therefore, each ioctl submits a chain of jobs in which each job depends on
> the previous one. However, this concept is not well represented in software
> at the moment.
>
> To address this, introduce a new concept: the struct v3d_submit, which
> groups the submission state and represents the submission chain formed by
> an ordered array of jobs.
>
> Add new helpers to allocate, add jobs to the chain and submit jobs to
> the scheduler, all based on the new struct. Convert v3d_submit_cl_ioctl(),
> v3d_submit_tfu_ioctl() and v3d_submit_csd_ioctl() to the new pattern. Each
> ioctl now follows the same flow: add jobs -> attach perfmon -> lookup BOs
> -> lock reservations -> submit chain -> attach fences -> put jobs.
>
> The CPU ioctl is left on the old helpers for now; its indirect CSD path
> requires some restructuring that will be addressed in the next few
> commits.
>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_drv.h | 21 +++
> drivers/gpu/drm/v3d/v3d_submit.c | 386 ++++++++++++++++++++++-----------------
> 2 files changed, 241 insertions(+), 166 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
> index 07bbdf52c33c..fae0c53e7aa3 100644
> --- a/drivers/gpu/drm/v3d/v3d_drv.h
> +++ b/drivers/gpu/drm/v3d/v3d_drv.h
> @@ -288,6 +288,27 @@ to_v3d_fence(struct dma_fence *fence)
> #define V3D_CORE_READ(core, offset) readl(v3d->core_regs[core] + offset)
> #define V3D_CORE_WRITE(core, offset, val) writel(val, v3d->core_regs[core] + offset)
>
> +#define V3D_MAX_JOBS_PER_SUBMISSION 3
> +
> +/* Per-ioctl submission context */
> +struct v3d_submit {
> + struct v3d_dev *v3d;
> +
> + struct drm_file *file_priv;
> +
> + /* DRM exec context for this submission. */
> + struct drm_exec exec;
> +
> + /* Ordered array of jobs forming the submission chain. Jobs are
> + * appended via v3d_submit_add_job(), then chained and pushed to
> + * the scheduler by v3d_submit_jobs().
> + */
> + struct v3d_job *jobs[V3D_MAX_JOBS_PER_SUBMISSION];
> +
> + /* Number of jobs currently in @jobs. */
> + u32 job_count;
> +};
> +
> struct v3d_job {
> struct drm_sched_job base;
>
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index d80afbf2c03b..fcaf3a6cfddc 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -261,17 +261,105 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
> return ret;
> }
>
> -static void
> -v3d_push_job(struct v3d_job *job)
> +static const struct {
> + size_t size;
> + void (*free)(struct kref *ref);
> +} v3d_job_types[] = {
> + [V3D_BIN] = { sizeof(struct v3d_bin_job), v3d_job_free },
> + [V3D_RENDER] = { sizeof(struct v3d_render_job), v3d_render_job_free },
> + [V3D_TFU] = { sizeof(struct v3d_tfu_job), v3d_job_free },
> + [V3D_CSD] = { sizeof(struct v3d_csd_job), v3d_job_free },
> + [V3D_CACHE_CLEAN] = { sizeof(struct v3d_job), v3d_job_free },
> + [V3D_CPU] = { sizeof(struct v3d_cpu_job), v3d_cpu_job_free },
> +};
> +
> +static struct v3d_job *
> +v3d_submit_add_job(struct v3d_submit *submit, enum v3d_queue queue)
> {
> - drm_sched_job_arm(&job->base);
> + struct v3d_file_priv *v3d_priv = submit->file_priv->driver_priv;
> + struct v3d_dev *v3d = submit->v3d;
> + struct v3d_job *job;
> + int ret;
>
> - job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> + if (queue >= V3D_MAX_QUEUES)
> + return ERR_PTR(-EINVAL);
>
> - /* put by scheduler job completion */
> - kref_get(&job->refcount);
> + job = kzalloc(v3d_job_types[queue].size, GFP_KERNEL);
> + if (!job)
> + return ERR_PTR(-ENOMEM);
>
> - drm_sched_entity_push_job(&job->base);
> + job->v3d = v3d;
> + job->queue = queue;
> + job->file_priv = v3d_priv;
> + job->free = v3d_job_types[queue].free;
> +
> + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
> + 1, v3d_priv, submit->file_priv->client_id);
> + if (ret)
> + goto fail_free;
> +
> + /* CPU jobs don't require hardware resources */
> + if (queue != V3D_CPU) {
> + ret = v3d_pm_runtime_get(v3d);
> + if (ret)
> + goto fail_sched_job;
> + job->has_pm_ref = true;
> + }
> +
> + kref_init(&job->refcount);
> +
> + job->client_stats = v3d_stats_get(v3d_priv->stats[queue]);
> + job->global_stats = v3d_stats_get(v3d->queue[queue].stats);
> +
> + submit->jobs[submit->job_count++] = job;
> +
> + return job;
> +
> +fail_sched_job:
> + drm_sched_job_cleanup(&job->base);
> +fail_free:
> + kfree(job);
> + return ERR_PTR(ret);
> +}
> +
> +static void
> +v3d_submit_put_jobs(struct v3d_submit *submit)
> +{
> + for (int i = 0; i < submit->job_count; i++)
> + v3d_job_put(submit->jobs[i]);
> +}
> +
> +static void
> +v3d_submit_cleanup_jobs(struct v3d_submit *submit)
> +{
> + for (int i = 0; i < submit->job_count; i++)
> + v3d_job_cleanup(submit->jobs[i]);
> +}
> +
> +static int
> +v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
> +{
> + struct v3d_file_priv *v3d_priv = submit->file_priv->driver_priv;
> + struct v3d_dev *v3d = submit->v3d;
> + struct v3d_perfmon *perfmon;
> +
> + if (!perfmon_id)
> + return 0;
> +
> + if (v3d->global_perfmon)
> + return -EAGAIN;
> +
> + perfmon = v3d_perfmon_find(v3d_priv, perfmon_id);
> + if (!perfmon)
> + return -ENOENT;
> +
> + for (int i = 0; i < submit->job_count; i++) {
> + submit->jobs[i]->perfmon = perfmon;
> + if (i != 0)
> + v3d_perfmon_get(perfmon);
> + }
> +
> + return 0;
> }
>
> static void
> @@ -316,6 +404,45 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
> }
> }
>
> +static void
> +v3d_push_job(struct v3d_job *job)
> +{
> + drm_sched_job_arm(&job->base);
> +
> + job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> +
> + /* put by scheduler job completion */
> + kref_get(&job->refcount);
> +
> + drm_sched_entity_push_job(&job->base);
> +}
> +
> +static int
> +v3d_submit_jobs(struct v3d_submit *submit)
> +{
> + struct v3d_dev *v3d = submit->v3d;
> + int ret = 0;
> +
> + mutex_lock(&v3d->sched_lock);
> +
> + for (int i = 0; i < submit->job_count; i++) {
> + struct v3d_job *job = submit->jobs[i];
> +
> + v3d_push_job(job);
> +
> + if (i + 1 < submit->job_count) {
> + ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
> + dma_fence_get(job->done_fence));
Is there a theoretical race where job->done_fence could be completed and
last reference dropped between push and here?
> + if (ret)
> + goto err;
> + }
> + }
> +
> +err:
> + mutex_unlock(&v3d->sched_lock);
> + return ret;
> +}
> +
> static int
> v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
> struct v3d_dev *v3d,
> @@ -921,18 +1048,15 @@ int
> v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv)
> {
> - struct v3d_dev *v3d = to_v3d_dev(dev);
> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_cl *args = data;
> struct v3d_submit_ext se = {0};
> struct v3d_bin_job *bin = NULL;
> struct v3d_render_job *render = NULL;
> struct v3d_job *clean_job = NULL;
render and clean_job do not need to be initialised.
> - struct v3d_job *last_job;
> - struct drm_exec exec;
> - int ret = 0;
> + int ret;
>
> - trace_v3d_submit_cl_ioctl(&v3d->drm, args->rcl_start, args->rcl_end);
> + trace_v3d_submit_cl_ioctl(dev, args->rcl_start, args->rcl_end);
>
> if (args->pad)
> return -EINVAL;
> @@ -952,30 +1076,10 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> - ret = v3d_job_allocate(v3d, (void *)&render, sizeof(*render));
> - if (ret)
> - return ret;
> -
> - ret = v3d_job_init(v3d, file_priv, &render->base,
> - v3d_render_job_free, args->in_sync_rcl, &se, V3D_RENDER);
> - if (ret) {
> - v3d_job_deallocate((void *)&render);
> - goto fail;
> - }
> -
> - render->start = args->rcl_start;
> - render->end = args->rcl_end;
> - INIT_LIST_HEAD(&render->unref_list);
> -
> if (args->bcl_start != args->bcl_end) {
> - ret = v3d_job_allocate(v3d, (void *)&bin, sizeof(*bin));
> - if (ret)
> - goto fail;
> -
> - ret = v3d_job_init(v3d, file_priv, &bin->base,
> - v3d_job_free, args->in_sync_bcl, &se, V3D_BIN);
> - if (ret) {
> - v3d_job_deallocate((void *)&bin);
> + bin = (struct v3d_bin_job *) v3d_submit_add_job(&submit, V3D_BIN);
> + if (IS_ERR(bin)) {
> + ret = PTR_ERR(bin);
> goto fail;
> }
>
> @@ -984,99 +1088,71 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> bin->qma = args->qma;
> bin->qms = args->qms;
> bin->qts = args->qts;
> - bin->render = render;
> - }
>
> - if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
> - ret = v3d_job_allocate(v3d, (void *)&clean_job, sizeof(*clean_job));
> + ret = v3d_job_add_syncobjs(&bin->base, file_priv, args->in_sync_bcl,
> + &se);
> if (ret)
> goto fail;
> -
> - ret = v3d_job_init(v3d, file_priv, clean_job,
> - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
> - if (ret) {
> - v3d_job_deallocate((void *)&clean_job);
> - goto fail;
> - }
> -
> - last_job = clean_job;
> - } else {
> - last_job = &render->base;
> }
>
> - ret = v3d_lookup_bos(dev, file_priv, last_job,
> + render = (struct v3d_render_job *) v3d_submit_add_job(&submit, V3D_RENDER);
> + if (IS_ERR(render)) {
> + ret = PTR_ERR(render);
> + goto fail;
> + }
> +
> + INIT_LIST_HEAD(&render->unref_list);
> + render->start = args->rcl_start;
> + render->end = args->rcl_end;
> +
> + if (bin)
> + bin->render = render;
> +
> + ret = v3d_job_add_syncobjs(&render->base, file_priv, args->in_sync_rcl, &se);
> + if (ret)
> + goto fail;
> +
> + if (args->flags & DRM_V3D_SUBMIT_CL_FLUSH_CACHE) {
> + clean_job = v3d_submit_add_job(&submit, V3D_CACHE_CLEAN);
> + if (IS_ERR(clean_job)) {
> + ret = PTR_ERR(clean_job);
> + goto fail;
> + }
> + }
> +
> + ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
> + if (ret)
> + goto fail;
> +
> + ret = v3d_lookup_bos(dev, file_priv,
> + submit.jobs[submit.job_count - 1],
submit.jobs[submit.job_count - 1] is repeated four times in the function
so you could assign it to a last_job local for more readability. Up to you.
> args->bo_handles, args->bo_handle_count);
> if (ret)
> goto fail;
>
> - ret = v3d_lock_bo_reservations(last_job, &exec);
> + ret = v3d_lock_bo_reservations(submit.jobs[submit.job_count - 1],
> + &submit.exec);
> if (ret)
> goto fail;
>
> - if (args->perfmon_id) {
> - if (v3d->global_perfmon) {
> - ret = -EAGAIN;
> - goto fail_perfmon;
> - }
> -
> - render->base.perfmon = v3d_perfmon_find(v3d_priv,
> - args->perfmon_id);
> -
> - if (!render->base.perfmon) {
> - ret = -ENOENT;
> - goto fail_perfmon;
> - }
> - }
> -
> - mutex_lock(&v3d->sched_lock);
> - if (bin) {
> - bin->base.perfmon = render->base.perfmon;
> - v3d_perfmon_get(bin->base.perfmon);
> - v3d_push_job(&bin->base);
> -
> - ret = drm_sched_job_add_dependency(&render->base.base,
> - dma_fence_get(bin->base.done_fence));
> - if (ret)
> - goto fail_unreserve;
> - }
> -
> - v3d_push_job(&render->base);
> -
> - if (clean_job) {
> - struct dma_fence *render_fence =
> - dma_fence_get(render->base.done_fence);
> - ret = drm_sched_job_add_dependency(&clean_job->base,
> - render_fence);
> - if (ret)
> - goto fail_unreserve;
> - clean_job->perfmon = render->base.perfmon;
> - v3d_perfmon_get(clean_job->perfmon);
> - v3d_push_job(clean_job);
> - }
> -
> - mutex_unlock(&v3d->sched_lock);
> + ret = v3d_submit_jobs(&submit);
> + if (ret)
> + goto fail_unreserve;
>
> v3d_attach_fences_and_unlock_reservation(file_priv,
> - last_job,
> - &exec,
> - args->out_sync,
> - &se,
> - last_job->done_fence);
> + submit.jobs[submit.job_count - 1],
> + &submit.exec,
> + args->out_sync, &se,
> + submit.jobs[submit.job_count - 1]->done_fence);
>
> - v3d_job_put(&bin->base);
> - v3d_job_put(&render->base);
> - v3d_job_put(clean_job);
> + v3d_submit_put_jobs(&submit);
>
> return 0;
>
> fail_unreserve:
> - mutex_unlock(&v3d->sched_lock);
> -fail_perfmon:
> - drm_exec_fini(&exec);
> + drm_exec_fini(&submit.exec);
> fail:
> - v3d_job_cleanup((void *)bin);
> - v3d_job_cleanup((void *)render);
> - v3d_job_cleanup(clean_job);
> + v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
>
> return ret;
> @@ -1095,14 +1171,13 @@ int
> v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv)
> {
> - struct v3d_dev *v3d = to_v3d_dev(dev);
> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_tfu *args = data;
> struct v3d_submit_ext se = {0};
> struct v3d_tfu_job *job = NULL;
No need to init.
Regards,
Tvrtko
> - struct drm_exec exec;
> int ret = 0;
>
> - trace_v3d_submit_tfu_ioctl(&v3d->drm, args->iia);
> + trace_v3d_submit_tfu_ioctl(dev, args->iia);
>
> if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
> drm_dbg(dev, "invalid flags: %d\n", args->flags);
> @@ -1117,17 +1192,16 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> - ret = v3d_job_allocate(v3d, (void *)&job, sizeof(*job));
> - if (ret)
> - return ret;
> -
> - ret = v3d_job_init(v3d, file_priv, &job->base,
> - v3d_job_free, args->in_sync, &se, V3D_TFU);
> - if (ret) {
> - v3d_job_deallocate((void *)&job);
> + job = (struct v3d_tfu_job *) v3d_submit_add_job(&submit, V3D_TFU);
> + if (IS_ERR(job)) {
> + ret = PTR_ERR(job);
> goto fail;
> }
>
> + ret = v3d_job_add_syncobjs(&job->base, file_priv, args->in_sync, &se);
> + if (ret)
> + goto fail;
> +
> job->base.bo = kzalloc_objs(*job->base.bo, ARRAY_SIZE(args->bo_handles));
> if (!job->base.bo) {
> ret = -ENOMEM;
> @@ -1155,26 +1229,27 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> job->base.bo[job->base.bo_count] = bo;
> }
>
> - ret = v3d_lock_bo_reservations(&job->base, &exec);
> + ret = v3d_lock_bo_reservations(&job->base, &submit.exec);
> if (ret)
> goto fail;
>
> - mutex_lock(&v3d->sched_lock);
> - v3d_push_job(&job->base);
> - mutex_unlock(&v3d->sched_lock);
> + ret = v3d_submit_jobs(&submit);
> + if (ret)
> + goto fail_unreserve;
>
> v3d_attach_fences_and_unlock_reservation(file_priv,
> - &job->base, &exec,
> - args->out_sync,
> - &se,
> + &job->base, &submit.exec,
> + args->out_sync, &se,
> job->base.done_fence);
>
> - v3d_job_put(&job->base);
> + v3d_submit_put_jobs(&submit);
>
> return 0;
>
> +fail_unreserve:
> + drm_exec_fini(&submit.exec);
> fail:
> - v3d_job_cleanup((void *)job);
> + v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
>
> return ret;
> @@ -1193,21 +1268,19 @@ int
> v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv)
> {
> - struct v3d_dev *v3d = to_v3d_dev(dev);
> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_csd *args = data;
> struct v3d_submit_ext se = {0};
> struct v3d_csd_job *job = NULL;
> struct v3d_job *clean_job = NULL;
> - struct drm_exec exec;
> int ret;
>
> - trace_v3d_submit_csd_ioctl(&v3d->drm, args->cfg[5], args->cfg[6]);
> + trace_v3d_submit_csd_ioctl(dev, args->cfg[5], args->cfg[6]);
>
> if (args->pad)
> return -EINVAL;
>
> - if (!v3d_has_csd(v3d)) {
> + if (!v3d_has_csd(submit.v3d)) {
> drm_warn(dev, "Attempting CSD submit on non-CSD hardware\n");
> return -EINVAL;
> }
> @@ -1225,55 +1298,36 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> - ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d, args,
> - &job, &clean_job, &se, &exec);
> + ret = v3d_setup_csd_jobs_and_bos(file_priv, submit.v3d, args,
> + &job, &clean_job, &se, &submit.exec);
> if (ret)
> goto fail;
>
> - if (args->perfmon_id) {
> - if (v3d->global_perfmon) {
> - ret = -EAGAIN;
> - goto fail_perfmon;
> - }
> + submit.jobs[submit.job_count++] = &job->base;
> + submit.jobs[submit.job_count++] = clean_job;
>
> - job->base.perfmon = v3d_perfmon_find(v3d_priv,
> - args->perfmon_id);
> - if (!job->base.perfmon) {
> - ret = -ENOENT;
> - goto fail_perfmon;
> - }
> - }
> -
> - mutex_lock(&v3d->sched_lock);
> - v3d_push_job(&job->base);
> -
> - ret = drm_sched_job_add_dependency(&clean_job->base,
> - dma_fence_get(job->base.done_fence));
> + ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
> if (ret)
> goto fail_unreserve;
>
> - v3d_push_job(clean_job);
> - mutex_unlock(&v3d->sched_lock);
> + ret = v3d_submit_jobs(&submit);
> + if (ret)
> + goto fail_unreserve;
>
> v3d_attach_fences_and_unlock_reservation(file_priv,
> clean_job,
> - &exec,
> - args->out_sync,
> - &se,
> + &submit.exec,
> + args->out_sync, &se,
> clean_job->done_fence);
>
> - v3d_job_put(&job->base);
> - v3d_job_put(clean_job);
> + v3d_submit_put_jobs(&submit);
>
> return 0;
>
> fail_unreserve:
> - mutex_unlock(&v3d->sched_lock);
> -fail_perfmon:
> - drm_exec_fini(&exec);
> + drm_exec_fini(&submit.exec);
> fail:
> - v3d_job_cleanup((void *)job);
> - v3d_job_cleanup(clean_job);
> + v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
>
> return ret;
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit
2026-06-03 22:25 ` [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit Maíra Canal
@ 2026-06-04 9:49 ` Tvrtko Ursulin
0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 9:49 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> Generalize the submission helpers so they act on a whole struct v3d_submit
> (the entire job chain) rather than on individual jobs and a drm_exec. This
> lets a submission of several chained jobs be locked, fenced, and finalized
> as a single unit, and is the groundwork for collapsing the indirect CSD
> path into one chain.
>
> The following helpers were generalized:
>
> - v3d_lookup_bos()
> - v3d_lock_bo_reservations() (renamed to v3d_submit_lock_reservations()):
> - v3d_attach_fences_and_unlock_reservation()
> - v3d_setup_csd_jobs_and_bos()
>
> Now, the locking helper now iterates over all jobs and locks the union of
> their BOs under one DRM exec, using DRM_EXEC_IGNORE_DUPLICATES to tolerate
> shared BO references. The fence-attach helper similarly walks every job
> and attaches the chain's last fence to all touched BOs.
>
> Also, v3d_submit_jobs() becomes the single submit-and-finalize entry
> point and callers no longer need to open-code fence attachment,
> reservation unlocking, etc.
>
> Update CL/TFU/CSD/CPU ioctls to use the new helper signatures. The CPU
> ioctl still uses two struct v3d_submit instances (one for the CPU job,
> one for the indirect CSD jobs) and keeps its manual two-pass fence-attach
> flow. Converting the indirect CSD path into the unified chain is done in
> the next commit.
>
> No functional change.
>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_submit.c | 340 +++++++++++++++------------------------
> 1 file changed, 128 insertions(+), 212 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 64eba912dc64..62c23feb8fbb 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -20,32 +20,51 @@
> * to v3d, so we don't attach dma-buf fences to them.
> */
> static int
> -v3d_lock_bo_reservations(struct v3d_job *job, struct drm_exec *exec)
> +v3d_submit_lock_reservations(struct v3d_submit *submit)
> {
> - int i, ret;
> + int i, j, ret;
>
> - drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT, job->bo_count);
> - drm_exec_until_all_locked(exec) {
> - ret = drm_exec_prepare_array(exec, job->bo, job->bo_count, 1);
> - }
> + drm_exec_init(&submit->exec,
> + DRM_EXEC_INTERRUPTIBLE_WAIT | DRM_EXEC_IGNORE_DUPLICATES, 0);
> + drm_exec_until_all_locked(&submit->exec) {
> + for (i = 0; i < submit->job_count; i++) {
> + struct v3d_job *job = submit->jobs[i];
>
> - if (ret)
> - goto fail;
> -
> - for (i = 0; i < job->bo_count; i++) {
> - ret = drm_sched_job_add_implicit_dependencies(&job->base,
> - job->bo[i], true);
> + ret = drm_exec_prepare_array(&submit->exec, job->bo,
> + job->bo_count, 1);
> + if (ret)
> + break;
> + }
> + drm_exec_retry_on_contention(&submit->exec);
> if (ret)
> goto fail;
> }
>
> + for (i = 0; i < submit->job_count; i++) {
> + struct v3d_job *job = submit->jobs[i];
> +
> + for (j = 0; j < job->bo_count; j++) {
> + ret = drm_sched_job_add_implicit_dependencies(&job->base,
> + job->bo[j],
> + true);
> + if (ret)
> + goto fail;
> + }
> + }
> +
> return 0;
>
> fail:
> - drm_exec_fini(exec);
> + drm_exec_fini(&submit->exec);
> return ret;
> }
>
> +static void
> +v3d_submit_unlock_reservations(struct v3d_submit *submit)
> +{
> + drm_exec_fini(&submit->exec);
> +}
> +
> /**
> * v3d_lookup_bos() - Sets up job->bo[] with the GEM objects
> * referenced by the job.
> @@ -63,25 +82,23 @@ v3d_lock_bo_reservations(struct v3d_job *job, struct drm_exec *exec)
> * failure, because that will happen at `v3d_job_free()`.
> */
> static int
> -v3d_lookup_bos(struct drm_device *dev,
> - struct drm_file *file_priv,
> - struct v3d_job *job,
> - u64 bo_handles,
> - u32 bo_count)
> +v3d_lookup_bos(struct v3d_submit *submit, u64 bo_handles, u32 bo_count)
> {
> - job->bo_count = bo_count;
> + struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
>
> - if (!job->bo_count) {
> + last_job->bo_count = bo_count;
> +
> + if (!last_job->bo_count) {
> /* See comment on bo_index for why we have to check
> * this.
> */
> - drm_warn(dev, "Rendering requires BOs\n");
> + drm_warn(&submit->v3d->drm, "Rendering requires BOs\n");
> return -EINVAL;
> }
>
> - return drm_gem_objects_lookup(file_priv,
> + return drm_gem_objects_lookup(submit->file_priv,
> (void __user *)(uintptr_t)bo_handles,
> - job->bo_count, &job->bo);
> + last_job->bo_count, &last_job->bo);
> }
>
> static void
> @@ -160,25 +177,6 @@ void v3d_job_put(struct v3d_job *job)
> kref_put(&job->refcount, job->free);
> }
>
> -static int
> -v3d_job_allocate(struct v3d_dev *v3d, void **container, size_t size)
> -{
> - *container = kcalloc(1, size, GFP_KERNEL);
> - if (!*container) {
> - drm_err(&v3d->drm, "Cannot allocate memory for V3D job.\n");
> - return -ENOMEM;
> - }
> -
> - return 0;
> -}
> -
> -static void
> -v3d_job_deallocate(void **container)
> -{
> - kfree(*container);
> - *container = NULL;
> -}
> -
> static int
> v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
> u32 in_sync, struct v3d_submit_ext *se)
> @@ -219,48 +217,6 @@ v3d_job_add_syncobjs(struct v3d_job *job, struct drm_file *file_priv,
> return 0;
> }
>
> -static int
> -v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
> - struct v3d_job *job, void (*free)(struct kref *ref),
> - u32 in_sync, struct v3d_submit_ext *se, enum v3d_queue queue)
> -{
> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
> - int ret;
> -
> - job->v3d = v3d;
> - job->free = free;
> - job->queue = queue;
> - job->file_priv = v3d_priv;
> -
> - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
> - 1, v3d_priv, file_priv->client_id);
> - if (ret)
> - return ret;
> -
> - ret = v3d_job_add_syncobjs(job, file_priv, in_sync, se);
> - if (ret)
> - goto fail_job_init;
> -
> - /* CPU jobs don't require hardware resources */
> - if (queue != V3D_CPU) {
> - ret = v3d_pm_runtime_get(v3d);
> - if (ret)
> - goto fail_job_init;
> - job->has_pm_ref = true;
> - }
> -
> - kref_init(&job->refcount);
> -
> - job->client_stats = v3d_stats_get(v3d_priv->stats[queue]);
> - job->global_stats = v3d_stats_get(v3d->queue[queue].stats);
> -
> - return 0;
> -
> -fail_job_init:
> - drm_sched_job_cleanup(&job->base);
> - return ret;
> -}
> -
> static const struct {
> size_t size;
> void (*free)(struct kref *ref);
> @@ -363,31 +319,34 @@ v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
> }
>
> static void
> -v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
> - struct v3d_job *job,
> - struct drm_exec *exec,
> - u32 out_sync,
> - struct v3d_submit_ext *se,
> - struct dma_fence *done_fence)
> +v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
> + u32 out_sync, struct v3d_submit_ext *se)
> {
> - struct drm_syncobj *sync_out;
> bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
> - int i;
> + struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
> + struct drm_syncobj *sync_out;
>
> - for (i = 0; i < job->bo_count; i++) {
> - /* XXX: Use shared fences for read-only objects. */
> - dma_resv_add_fence(job->bo[i]->resv, job->done_fence,
> - DMA_RESV_USAGE_WRITE);
> + /* The submission's last fence covers the entire submission. Attach it
> + * to every BO touched by any job in the submission.
> + */
> + for (int i = 0; i < submit->job_count; i++) {
> + struct v3d_job *job = submit->jobs[i];
> +
> + for (int j = 0; j < job->bo_count; j++) {
> + /* XXX: Use shared fences for read-only objects. */
> + dma_resv_add_fence(job->bo[j]->resv, last_job->done_fence,
> + DMA_RESV_USAGE_WRITE);
> + }
> }
>
> - drm_exec_fini(exec);
> + v3d_submit_unlock_reservations(submit);
>
> /* Update the return sync object for the job */
> /* If it only supports a single signal semaphore*/
> if (!has_multisync) {
> - sync_out = drm_syncobj_find(file_priv, out_sync);
> + sync_out = drm_syncobj_find(submit->file_priv, out_sync);
> if (sync_out) {
> - drm_syncobj_replace_fence(sync_out, done_fence);
> + drm_syncobj_replace_fence(sync_out, last_job->done_fence);
> drm_syncobj_put(sync_out);
> }
> return;
> @@ -395,9 +354,9 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
>
> /* If multiple semaphores extension is supported */
> if (se->out_sync_count) {
> - for (i = 0; i < se->out_sync_count; i++) {
> + for (int i = 0; i < se->out_sync_count; i++) {
> drm_syncobj_replace_fence(se->out_syncs[i].syncobj,
> - done_fence);
> + last_job->done_fence);
> drm_syncobj_put(se->out_syncs[i].syncobj);
> }
> kvfree(se->out_syncs);
> @@ -418,7 +377,8 @@ v3d_push_job(struct v3d_job *job)
> }
>
> static int
> -v3d_submit_jobs(struct v3d_submit *submit)
> +v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
> + struct v3d_submit_ext *se)
> {
> struct v3d_dev *v3d = submit->v3d;
> int ret = 0;
> @@ -438,52 +398,42 @@ v3d_submit_jobs(struct v3d_submit *submit)
> }
> }
>
> + mutex_unlock(&v3d->sched_lock);
> +
> + v3d_attach_fences_and_unlock_reservation(submit, out_sync, se);
> + v3d_submit_put_jobs(submit);
> +
> + return 0;
> +
> err:
> mutex_unlock(&v3d->sched_lock);
> return ret;
> }
>
> static int
> -v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
> - struct v3d_dev *v3d,
> +v3d_setup_csd_jobs_and_bos(struct v3d_submit *submit,
> struct drm_v3d_submit_csd *args,
> - struct v3d_csd_job **job,
> - struct v3d_job **clean_job,
> - struct v3d_submit_ext *se,
> - struct drm_exec *exec)
> + struct v3d_submit_ext *se)
> {
> + struct v3d_csd_job *job;
> + struct v3d_job *clean_job;
> int ret;
>
> - ret = v3d_job_allocate(v3d, (void *)job, sizeof(**job));
> + job = (struct v3d_csd_job *) v3d_submit_add_job(submit, V3D_CSD);
> + if (IS_ERR(job))
> + return PTR_ERR(job);
> +
> + ret = v3d_job_add_syncobjs(&job->base, submit->file_priv, args->in_sync, se);
> if (ret)
> return ret;
>
> - ret = v3d_job_init(v3d, file_priv, &(*job)->base,
> - v3d_job_free, args->in_sync, se, V3D_CSD);
> - if (ret) {
> - v3d_job_deallocate((void *)job);
> - return ret;
> - }
> + job->args = *args;
>
> - ret = v3d_job_allocate(v3d, (void *)clean_job, sizeof(**clean_job));
> - if (ret)
> - return ret;
> + clean_job = v3d_submit_add_job(submit, V3D_CACHE_CLEAN);
> + if (IS_ERR(clean_job))
> + return PTR_ERR(clean_job);
>
> - ret = v3d_job_init(v3d, file_priv, *clean_job,
> - v3d_job_free, 0, NULL, V3D_CACHE_CLEAN);
> - if (ret) {
> - v3d_job_deallocate((void *)clean_job);
> - return ret;
> - }
> -
> - (*job)->args = *args;
> -
> - ret = v3d_lookup_bos(&v3d->drm, file_priv, *clean_job,
> - args->bo_handles, args->bo_handle_count);
> - if (ret)
> - return ret;
> -
> - return v3d_lock_bo_reservations(*clean_job, exec);
> + return v3d_lookup_bos(submit, args->bo_handles, args->bo_handle_count);
> }
>
> static void
> @@ -1123,33 +1073,22 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> if (ret)
> goto fail;
>
> - ret = v3d_lookup_bos(dev, file_priv,
> - submit.jobs[submit.job_count - 1],
> - args->bo_handles, args->bo_handle_count);
> + ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
> if (ret)
> goto fail;
>
> - ret = v3d_lock_bo_reservations(submit.jobs[submit.job_count - 1],
> - &submit.exec);
> + ret = v3d_submit_lock_reservations(&submit);
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit);
> + ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> if (ret)
> goto fail_unreserve;
>
> - v3d_attach_fences_and_unlock_reservation(file_priv,
> - submit.jobs[submit.job_count - 1],
> - &submit.exec,
> - args->out_sync, &se,
> - submit.jobs[submit.job_count - 1]->done_fence);
> -
> - v3d_submit_put_jobs(&submit);
> -
> return 0;
>
> fail_unreserve:
> - drm_exec_fini(&submit.exec);
> + v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
> @@ -1228,25 +1167,18 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> job->base.bo[job->base.bo_count] = bo;
> }
>
> - ret = v3d_lock_bo_reservations(&job->base, &submit.exec);
> + ret = v3d_submit_lock_reservations(&submit);
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit);
> + ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> if (ret)
> goto fail_unreserve;
>
> - v3d_attach_fences_and_unlock_reservation(file_priv,
> - &job->base, &submit.exec,
> - args->out_sync, &se,
> - job->base.done_fence);
> -
> - v3d_submit_put_jobs(&submit);
> -
> return 0;
>
> fail_unreserve:
> - drm_exec_fini(&submit.exec);
> + v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
> @@ -1270,8 +1202,6 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_csd *args = data;
> struct v3d_submit_ext se = {0};
> - struct v3d_csd_job *job = NULL;
> - struct v3d_job *clean_job = NULL;
> int ret;
>
> trace_v3d_submit_csd_ioctl(dev, args->cfg[5], args->cfg[6]);
> @@ -1297,34 +1227,26 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> - ret = v3d_setup_csd_jobs_and_bos(file_priv, submit.v3d, args,
> - &job, &clean_job, &se, &submit.exec);
> + ret = v3d_setup_csd_jobs_and_bos(&submit, args, &se);
> if (ret)
> goto fail;
>
> - submit.jobs[submit.job_count++] = &job->base;
> - submit.jobs[submit.job_count++] = clean_job;
> -
> ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
> if (ret)
> - goto fail_unreserve;
> + goto fail;
>
> - ret = v3d_submit_jobs(&submit);
> + ret = v3d_submit_lock_reservations(&submit);
> + if (ret)
> + goto fail;
> +
> + ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> if (ret)
> goto fail_unreserve;
>
> - v3d_attach_fences_and_unlock_reservation(file_priv,
> - clean_job,
> - &submit.exec,
> - args->out_sync, &se,
> - clean_job->done_fence);
> -
> - v3d_submit_put_jobs(&submit);
> -
> return 0;
>
> fail_unreserve:
> - drm_exec_fini(&submit.exec);
> + v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> v3d_put_multisync_post_deps(&se);
> @@ -1355,13 +1277,14 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv)
> {
> struct v3d_dev *v3d = to_v3d_dev(dev);
> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> + struct v3d_submit indirect_submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_cpu *args = data;
> struct v3d_submit_ext se = {0};
> struct v3d_submit_ext *out_se = NULL;
No need to init.
> struct v3d_cpu_job *cpu_job = NULL;
> struct v3d_csd_job *csd_job = NULL;
> struct v3d_job *clean_job = NULL;
Ditto for above three AFAICT.
> - struct drm_exec exec;
> int ret;
>
> if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
> @@ -1369,9 +1292,9 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> return -EINVAL;
> }
>
> - ret = v3d_job_allocate(v3d, (void *)&cpu_job, sizeof(*cpu_job));
> - if (ret)
> - return ret;
> + cpu_job = (struct v3d_cpu_job *) v3d_submit_add_job(&submit, V3D_CPU);
> + if (IS_ERR(cpu_job))
> + return PTR_ERR(cpu_job);
>
> if (args->flags & DRM_V3D_SUBMIT_EXTENSION) {
> ret = v3d_get_extensions(file_priv, args->extensions, &se, cpu_job);
> @@ -1396,34 +1319,35 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
>
> trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type);
>
> - ret = v3d_job_init(v3d, file_priv, &cpu_job->base,
> - v3d_cpu_job_free, 0, &se, V3D_CPU);
> - if (ret) {
> - v3d_job_deallocate((void *)&cpu_job);
> + ret = v3d_job_add_syncobjs(&cpu_job->base, file_priv, 0, &se);
> + if (ret)
> goto fail;
> - }
>
> if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> - ret = v3d_setup_csd_jobs_and_bos(file_priv, v3d,
> + ret = v3d_setup_csd_jobs_and_bos(&indirect_submit,
> &cpu_job->indirect_csd.args,
> - &cpu_job->indirect_csd.job,
> - &cpu_job->indirect_csd.clean_job,
> - NULL,
> - &cpu_job->indirect_csd.exec);
> + NULL);
> if (ret)
> goto fail;
> +
> + ret = v3d_submit_lock_reservations(&indirect_submit);
> + if (ret)
> + goto fail;
> +
> + cpu_job->indirect_csd.job = container_of(indirect_submit.jobs[0],
> + struct v3d_csd_job, base);
> + cpu_job->indirect_csd.clean_job = indirect_submit.jobs[1];
> +
> + clean_job = cpu_job->indirect_csd.clean_job;
> + csd_job = cpu_job->indirect_csd.job;
> }
>
> - clean_job = cpu_job->indirect_csd.clean_job;
> - csd_job = cpu_job->indirect_csd.job;
> -
> if (args->bo_handle_count) {
> - ret = v3d_lookup_bos(dev, file_priv, &cpu_job->base,
> - args->bo_handles, args->bo_handle_count);
> + ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
> if (ret)
> goto fail;
>
> - ret = v3d_lock_bo_reservations(&cpu_job->base, &exec);
> + ret = v3d_submit_lock_reservations(&submit);
> if (ret)
> goto fail;
> }
> @@ -1455,36 +1379,28 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
>
> out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se;
Move under one of the other two checks for this job type a bit above?
>
> - v3d_attach_fences_and_unlock_reservation(file_priv,
> - &cpu_job->base,
> - &exec, 0,
> - out_se, cpu_job->base.done_fence);
> + v3d_attach_fences_and_unlock_reservation(&submit, 0, out_se);
>
> switch (cpu_job->job_type) {
> case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
> - v3d_attach_fences_and_unlock_reservation(file_priv,
> - clean_job,
> - &cpu_job->indirect_csd.exec,
> - 0, &se, clean_job->done_fence);
> + v3d_attach_fences_and_unlock_reservation(&indirect_submit, 0, &se);
> break;
> default:
> break;
> }
Would this switch statement (and the one above it, not shown in the
diff) look more streamlined as an if block? Would be more compact so
more readable at least.
Maybe it is temporary code TBF.
Okay:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
>
> - v3d_job_put(&cpu_job->base);
> - v3d_job_put(&csd_job->base);
> - v3d_job_put(clean_job);
> + v3d_submit_put_jobs(&submit);
> + v3d_submit_put_jobs(&indirect_submit);
>
> return 0;
>
> fail_unreserve:
> mutex_unlock(&v3d->sched_lock);
> - drm_exec_fini(&exec);
> - drm_exec_fini(&cpu_job->indirect_csd.exec);
> + v3d_submit_unlock_reservations(&submit);
> + v3d_submit_unlock_reservations(&indirect_submit);
> fail:
> - v3d_job_cleanup((void *)cpu_job);
> - v3d_job_cleanup((void *)csd_job);
> - v3d_job_cleanup(clean_job);
> + v3d_submit_cleanup_jobs(&submit);
> + v3d_submit_cleanup_jobs(&indirect_submit);
> v3d_put_multisync_post_deps(&se);
>
> return ret;
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain
2026-06-03 22:25 ` [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain Maíra Canal
2026-06-04 2:58 ` sashiko-bot
@ 2026-06-04 9:54 ` Tvrtko Ursulin
1 sibling, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 9:54 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> Restructure the CPU ioctl so that all job types, including indirect CSD,
> use a single struct v3d_submit chain and a single DRM exec context.
>
> Now that v3d_get_cpu_indirect_csd_params() is a pure parser and the
> submit helpers operate on struct v3d_submit, fold the indirect CSD path
> into the standard flow by appending the CSD and CLEAN_CACHE jobs to the
> same struct v3d_submit as the CPU job and locking the union of all jobs'
> BOs under one drm_exec. This eliminates the second drm_exec, the nested
> submission, and the conditional two-pass fence attachment that the CPU
> ioctl previously required for the indirect CSD path.
>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_drv.h | 6 ---
> drivers/gpu/drm/v3d/v3d_submit.c | 90 ++++++++++------------------------------
> 2 files changed, 23 insertions(+), 73 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
> index 52ed0af67f5e..e91b1b83dc23 100644
> --- a/drivers/gpu/drm/v3d/v3d_drv.h
> +++ b/drivers/gpu/drm/v3d/v3d_drv.h
> @@ -425,9 +425,6 @@ struct v3d_indirect_csd_info {
> /* Indirect CSD */
> struct v3d_csd_job *job;
>
> - /* Clean cache job associated to the Indirect CSD job */
> - struct v3d_job *clean_job;
> -
> /* Indirect CSD args, stashed by the extension parser and later used
> * to create the CSD job from them.
> */
> @@ -446,9 +443,6 @@ struct v3d_indirect_csd_info {
>
> /* Indirect BO */
> struct drm_gem_object *indirect;
> -
> - /* Context of the Indirect CSD job */
> - struct drm_exec exec;
> };
>
> struct v3d_timestamp_query_info {
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 62c23feb8fbb..1df7ab528422 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -1276,15 +1276,10 @@ int
> v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv)
> {
> - struct v3d_dev *v3d = to_v3d_dev(dev);
> struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> - struct v3d_submit indirect_submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_cpu *args = data;
> struct v3d_submit_ext se = {0};
> - struct v3d_submit_ext *out_se = NULL;
> struct v3d_cpu_job *cpu_job = NULL;
> - struct v3d_csd_job *csd_job = NULL;
> - struct v3d_job *clean_job = NULL;
> int ret;
>
> if (args->flags && !(args->flags & DRM_V3D_SUBMIT_EXTENSION)) {
> @@ -1317,90 +1312,51 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> goto fail;
> }
>
> - trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type);
> + trace_v3d_submit_cpu_ioctl(dev, cpu_job->job_type);
>
> ret = v3d_job_add_syncobjs(&cpu_job->base, file_priv, 0, &se);
> if (ret)
> goto fail;
>
> - if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> - ret = v3d_setup_csd_jobs_and_bos(&indirect_submit,
> - &cpu_job->indirect_csd.args,
> - NULL);
> - if (ret)
> - goto fail;
> -
> - ret = v3d_submit_lock_reservations(&indirect_submit);
> - if (ret)
> - goto fail;
> -
> - cpu_job->indirect_csd.job = container_of(indirect_submit.jobs[0],
> - struct v3d_csd_job, base);
> - cpu_job->indirect_csd.clean_job = indirect_submit.jobs[1];
> -
> - clean_job = cpu_job->indirect_csd.clean_job;
> - csd_job = cpu_job->indirect_csd.job;
> - }
> -
> + /* Look up the CPU jobs' BOs before v3d_setup_csd_jobs_and_bos() appends
> + * the CSD and clean jobs in the case of indirect CSD job.
> + */
> if (args->bo_handle_count) {
> ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);
> if (ret)
> goto fail;
> + }
>
> - ret = v3d_submit_lock_reservations(&submit);
> + if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> + ret = v3d_setup_csd_jobs_and_bos(&submit, &cpu_job->indirect_csd.args,
> + NULL);
> if (ret)
> goto fail;
> +
> + /* The CSD job was appended at jobs[1] */
> + if (WARN_ON(submit.jobs[1]->queue != V3D_CSD)) {
> + ret = -EINVAL;
> + goto fail;
> + }
> +
> + cpu_job->indirect_csd.job = container_of(submit.jobs[1], struct v3d_csd_job,
> + base);
> }
>
> - mutex_lock(&v3d->sched_lock);
> - v3d_push_job(&cpu_job->base);
> + ret = v3d_submit_lock_reservations(&submit);
> + if (ret)
> + goto fail;
>
> - switch (cpu_job->job_type) {
> - case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
> - ret = drm_sched_job_add_dependency(&csd_job->base.base,
> - dma_fence_get(cpu_job->base.done_fence));
> - if (ret)
> - goto fail_unreserve;
> -
> - v3d_push_job(&csd_job->base);
> -
> - ret = drm_sched_job_add_dependency(&clean_job->base,
> - dma_fence_get(csd_job->base.done_fence));
> - if (ret)
> - goto fail_unreserve;
> -
> - v3d_push_job(clean_job);
> -
> - break;
> - default:
> - break;
> - }
> - mutex_unlock(&v3d->sched_lock);
> -
> - out_se = (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) ? NULL : &se;
> -
> - v3d_attach_fences_and_unlock_reservation(&submit, 0, out_se);
> -
> - switch (cpu_job->job_type) {
> - case V3D_CPU_JOB_TYPE_INDIRECT_CSD:
> - v3d_attach_fences_and_unlock_reservation(&indirect_submit, 0, &se);
> - break;
> - default:
> - break;
> - }
> -
> - v3d_submit_put_jobs(&submit);
> - v3d_submit_put_jobs(&indirect_submit);
> + ret = v3d_submit_jobs(&submit, 0, &se);
> + if (ret)
> + goto fail_unreserve;
>
> return 0;
>
> fail_unreserve:
> - mutex_unlock(&v3d->sched_lock);
> v3d_submit_unlock_reservations(&submit);
> - v3d_submit_unlock_reservations(&indirect_submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> - v3d_submit_cleanup_jobs(&indirect_submit);
> v3d_put_multisync_post_deps(&se);
>
> return ret;
>
It's all coming nicely together now!
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling
2026-06-03 22:25 ` [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling Maíra Canal
@ 2026-06-04 9:59 ` Tvrtko Ursulin
0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 9:59 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> v3d_attach_fences_and_unlock_reservation() does three different things:
> (1) attaches the submission's last fence to every BO, (2) releases
> drm_exec, and (3) replaces the userspace out_sync syncobjs. Decouple
> these three behaviors into different functions, so that each function
> has a more self-contained behavior.
>
> v3d_submit_jobs() now invokes the three steps explicitly, which makes the
> submission sequence self-documenting and keeps each helper self-contained.
>
> No functional change; just code consolidation.
>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_submit.c | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 1df7ab528422..2beb99a25104 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -319,12 +319,9 @@ v3d_attach_perfmon_to_jobs(struct v3d_submit *submit, u32 perfmon_id)
> }
>
> static void
> -v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
> - u32 out_sync, struct v3d_submit_ext *se)
> +v3d_submit_attach_object_fences(struct v3d_submit *submit)
> {
> - bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
> struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
> - struct drm_syncobj *sync_out;
>
> /* The submission's last fence covers the entire submission. Attach it
> * to every BO touched by any job in the submission.
> @@ -338,8 +335,15 @@ v3d_attach_fences_and_unlock_reservation(struct v3d_submit *submit,
> DMA_RESV_USAGE_WRITE);
> }
> }
> +}
>
> - v3d_submit_unlock_reservations(submit);
> +static void
> +v3d_submit_process_post_deps(struct v3d_submit *submit, u32 out_sync,
> + struct v3d_submit_ext *se)
> +{
> + bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
> + struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
> + struct drm_syncobj *sync_out;
>
> /* Update the return sync object for the job */
> /* If it only supports a single signal semaphore*/
> @@ -400,7 +404,10 @@ v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
>
> mutex_unlock(&v3d->sched_lock);
>
> - v3d_attach_fences_and_unlock_reservation(submit, out_sync, se);
> + v3d_submit_attach_object_fences(submit);
> + v3d_submit_unlock_reservations(submit);
> + v3d_submit_process_post_deps(submit, out_sync, se);
> +
> v3d_submit_put_jobs(submit);
>
> return 0;
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls
2026-06-03 22:25 ` [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls Maíra Canal
2026-06-04 2:59 ` sashiko-bot
@ 2026-06-04 10:12 ` Tvrtko Ursulin
1 sibling, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 10:12 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> v3d_submit_process_post_deps() looks up the out_sync syncobj via
> drm_syncobj_find(), and if userspace passes a non-zero handle that doesn't
> refer to a valid syncobj, the lookup silently returns NULL and the
> post-deps step skips publishing the submission's last fence to it. The
> ioctl still returns success, leaving userspace to wait on a invalid
> syncobj.
>
> Instead of silently ignoring an invalid non-zero out_sync, move the syncobj
> lookup to the submission and make it fail with -ENOENT up front, mirroring
> the syncobj validation already done for in_sync. Now,
> v3d_submit_process_post_deps() only does the fence replacement.
>
> Note that the lookup is skipped when the multi-sync extension is in use,
> since args->out_sync is unused in that case.
>
> To keep cleanup symmetric on error paths, convert the function
> v3d_put_multisync_post_deps() into a single function that releases the
> references that were acquired but never published for both single-sync
> and multi-sync.
>
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_submit.c | 50 +++++++++++++++++++++++++++++-----------
> 1 file changed, 36 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 2beb99a25104..dc27770d85fd 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -338,17 +338,15 @@ v3d_submit_attach_object_fences(struct v3d_submit *submit)
> }
>
> static void
> -v3d_submit_process_post_deps(struct v3d_submit *submit, u32 out_sync,
> +v3d_submit_process_post_deps(struct v3d_submit *submit, struct drm_syncobj *sync_out,
> struct v3d_submit_ext *se)
> {
> bool has_multisync = se && (se->flags & DRM_V3D_EXT_ID_MULTI_SYNC);
> struct v3d_job *last_job = submit->jobs[submit->job_count - 1];
> - struct drm_syncobj *sync_out;
>
> /* Update the return sync object for the job */
> /* If it only supports a single signal semaphore*/
> if (!has_multisync) {
> - sync_out = drm_syncobj_find(submit->file_priv, out_sync);
> if (sync_out) {
> drm_syncobj_replace_fence(sync_out, last_job->done_fence);
> drm_syncobj_put(sync_out);
Is it worth adding an assert that sync_obj and has_multisync are
mutually exclusive? It caught me up thinking what prevents a ref leak.
> @@ -381,7 +379,7 @@ v3d_push_job(struct v3d_job *job)
> }
>
> static int
> -v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
> +v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
> struct v3d_submit_ext *se)
> {
> struct v3d_dev *v3d = submit->v3d;
> @@ -406,7 +404,7 @@ v3d_submit_jobs(struct v3d_submit *submit, u32 out_sync,
>
> v3d_submit_attach_object_fences(submit);
> v3d_submit_unlock_reservations(submit);
> - v3d_submit_process_post_deps(submit, out_sync, se);
> + v3d_submit_process_post_deps(submit, sync_out, se);
>
> v3d_submit_put_jobs(submit);
>
> @@ -444,10 +442,13 @@ v3d_setup_csd_jobs_and_bos(struct v3d_submit *submit,
> }
>
> static void
> -v3d_put_multisync_post_deps(struct v3d_submit_ext *se)
> +v3d_submit_put_post_deps(struct drm_syncobj *sync_out, struct v3d_submit_ext *se)
> {
> unsigned int i;
>
> + if (sync_out)
> + drm_syncobj_put(sync_out);
> +
> if (!(se && se->out_sync_count))
> return;
>
> @@ -1006,6 +1007,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> {
> struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_cl *args = data;
> + struct drm_syncobj *sync_out = NULL;
> struct v3d_submit_ext se = {0};
> struct v3d_bin_job *bin = NULL;
> struct v3d_render_job *render = NULL;
> @@ -1032,6 +1034,12 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
Is it feasible to make sure out_sync is zero if multi-sync flag is set?
Or we have to allow letting garbage in for backwards compatibility?
Either way:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> if (args->bcl_start != args->bcl_end) {
> bin = (struct v3d_bin_job *) v3d_submit_add_job(&submit, V3D_BIN);
> if (IS_ERR(bin)) {
> @@ -1088,7 +1096,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> + ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> goto fail_unreserve;
>
> @@ -1098,7 +1106,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> - v3d_put_multisync_post_deps(&se);
> + v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> }
> @@ -1118,6 +1126,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> {
> struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_tfu *args = data;
> + struct drm_syncobj *sync_out = NULL;
> struct v3d_submit_ext se = {0};
> struct v3d_tfu_job *job = NULL;
> int ret = 0;
> @@ -1137,6 +1146,12 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> job = (struct v3d_tfu_job *) v3d_submit_add_job(&submit, V3D_TFU);
> if (IS_ERR(job)) {
> ret = PTR_ERR(job);
> @@ -1178,7 +1193,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> + ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> goto fail_unreserve;
>
> @@ -1188,7 +1203,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
> v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> - v3d_put_multisync_post_deps(&se);
> + v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> }
> @@ -1208,6 +1223,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> {
> struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv = file_priv };
> struct drm_v3d_submit_csd *args = data;
> + struct drm_syncobj *sync_out = NULL;
> struct v3d_submit_ext se = {0};
> int ret;
>
> @@ -1234,6 +1250,12 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> }
> }
>
> + if (args->out_sync && !(se.flags & DRM_V3D_EXT_ID_MULTI_SYNC)) {
> + sync_out = drm_syncobj_find(file_priv, args->out_sync);
> + if (!sync_out)
> + return -ENOENT;
> + }
> +
> ret = v3d_setup_csd_jobs_and_bos(&submit, args, &se);
> if (ret)
> goto fail;
> @@ -1246,7 +1268,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit, args->out_sync, &se);
> + ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> goto fail_unreserve;
>
> @@ -1256,7 +1278,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
> v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> - v3d_put_multisync_post_deps(&se);
> + v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> }
> @@ -1354,7 +1376,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> if (ret)
> goto fail;
>
> - ret = v3d_submit_jobs(&submit, 0, &se);
> + ret = v3d_submit_jobs(&submit, NULL, &se);
> if (ret)
> goto fail_unreserve;
>
> @@ -1364,7 +1386,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
> v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> - v3d_put_multisync_post_deps(&se);
> + v3d_submit_put_post_deps(NULL, &se);
>
> return ret;
> }
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs()
2026-06-03 22:25 ` [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs() Maíra Canal
@ 2026-06-04 10:25 ` Tvrtko Ursulin
0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 10:25 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 03/06/2026 23:25, Maíra Canal wrote:
> Currently, v3d_submit_jobs() arms and pushes each job one at a time,
> wiring dependencies between consecutive jobs after each push. If
> drm_sched_job_add_dependency() fails midway, the already-pushed jobs are
> scheduler-owned and will be submitted to the GPU for execution, even though
> the subsequent jobs won't be submitted.
>
> This breaks the atomicity of the submissions, as only some of the jobs
> from a submission would be submitted to the hardware, while the other part
> fails.
>
> Restructure v3d_submit_jobs() into three phases: (1) arm all jobs belonging
> to a given submission, (2) wire inter-job dependencies, and (3) push all
> jobs to the scheduler unconditionally. Phase (2) can fail; on failure,
> it marks every armed job finished fence with an error, so that run_job()
> callbacks skip hardware execution.
>
> This guarantees that every armed job is always pushed, either to run
> or to be skipped, and it also ensures the atomicity of a submission.
>
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_sched.c | 6 ++++
> drivers/gpu/drm/v3d/v3d_submit.c | 65 ++++++++++++++++++++--------------------
> 2 files changed, 39 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 04fd1a365576..c16a9d4d41e6 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -656,6 +656,9 @@ v3d_cpu_job_run(struct drm_sched_job *sched_job)
> struct v3d_cpu_job *job = to_cpu_job(sched_job);
> struct v3d_dev *v3d = job->base.v3d;
>
> + if (unlikely(job->base.base.s_fence->finished.error))
> + return NULL;
> +
> if (job->job_type >= ARRAY_SIZE(cpu_job_function)) {
> drm_dbg(&v3d->drm, "Unknown CPU job: %d\n", job->job_type);
> return NULL;
> @@ -679,6 +682,9 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job)
> struct v3d_job *job = to_v3d_job(sched_job);
> struct v3d_dev *v3d = job->v3d;
>
> + if (unlikely(job->base.s_fence->finished.error))
> + return NULL;
> +
> v3d_job_start_stats(job);
>
> v3d_clean_caches(v3d);
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index dc27770d85fd..e118ba69e308 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -365,19 +365,6 @@ v3d_submit_process_post_deps(struct v3d_submit *submit, struct drm_syncobj *sync
> }
> }
>
> -static void
> -v3d_push_job(struct v3d_job *job)
> -{
> - drm_sched_job_arm(&job->base);
> -
> - job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> -
> - /* put by scheduler job completion */
> - kref_get(&job->refcount);
> -
> - drm_sched_entity_push_job(&job->base);
> -}
> -
> static int
> v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
> struct v3d_submit_ext *se)
> @@ -390,16 +377,23 @@ v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
> for (int i = 0; i < submit->job_count; i++) {
> struct v3d_job *job = submit->jobs[i];
>
> - v3d_push_job(job);
> + drm_sched_job_arm(&job->base);
> + job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>
> - if (i + 1 < submit->job_count) {
> - ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
> - dma_fence_get(job->done_fence));
> - if (ret)
> - goto err;
> - }
> + /* put by scheduler job completion */
> + kref_get(&job->refcount);
> }
>
> + for (int i = 0; i + 1 < submit->job_count; i++) {
i + 1 < count is a pattern I dont' remember seeing but maybe it is just
me. Normally I would expect something like:
for (i = 1; i < count; i++)
add_dep(job[i], job[i - 1]);
Anyway, it looks functionally correct to me as is:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
> + ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]->base,
> + dma_fence_get(submit->jobs[i]->done_fence));
> + if (ret)
> + goto err;
> + }
> +
> + for (int i = 0; i < submit->job_count; i++)
> + drm_sched_entity_push_job(&submit->jobs[i]->base);
> +
> mutex_unlock(&v3d->sched_lock);
>
> v3d_submit_attach_object_fences(submit);
> @@ -411,7 +405,18 @@ v3d_submit_jobs(struct v3d_submit *submit, struct drm_syncobj *sync_out,
> return 0;
>
> err:
> + /* Mark every armed job as failed so run_job() skips execution */
> + for (int i = 0; i < submit->job_count; i++)
> + dma_fence_set_error(&submit->jobs[i]->base.s_fence->finished, ret);
> +
> + for (int i = 0; i < submit->job_count; i++)
> + drm_sched_entity_push_job(&submit->jobs[i]->base);
> +
> mutex_unlock(&v3d->sched_lock);
> +
> + v3d_submit_unlock_reservations(submit);
> + v3d_submit_put_jobs(submit);
> +
> return ret;
> }
>
> @@ -1098,14 +1103,13 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>
> ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> - goto fail_unreserve;
> + goto fail_submit;
>
> return 0;
>
> -fail_unreserve:
> - v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> +fail_submit:
> v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> @@ -1195,14 +1199,13 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
>
> ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> - goto fail_unreserve;
> + goto fail_submit;
>
> return 0;
>
> -fail_unreserve:
> - v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> +fail_submit:
> v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> @@ -1270,14 +1273,13 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
>
> ret = v3d_submit_jobs(&submit, sync_out, &se);
> if (ret)
> - goto fail_unreserve;
> + goto fail_submit;
>
> return 0;
>
> -fail_unreserve:
> - v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> +fail_submit:
> v3d_submit_put_post_deps(sync_out, &se);
>
> return ret;
> @@ -1378,14 +1380,13 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
>
> ret = v3d_submit_jobs(&submit, NULL, &se);
> if (ret)
> - goto fail_unreserve;
> + goto fail_submit;
>
> return 0;
>
> -fail_unreserve:
> - v3d_submit_unlock_reservations(&submit);
> fail:
> v3d_submit_cleanup_jobs(&submit);
> +fail_submit:
> v3d_submit_put_post_deps(NULL, &se);
>
> return ret;
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-04 8:58 ` Tvrtko Ursulin
@ 2026-06-04 11:52 ` Maíra Canal
2026-06-04 12:05 ` Tvrtko Ursulin
0 siblings, 1 reply; 32+ messages in thread
From: Maíra Canal @ 2026-06-04 11:52 UTC (permalink / raw)
To: Tvrtko Ursulin, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
Hi Tvrtko,
On 04/06/26 05:58, Tvrtko Ursulin wrote:
>
> On 03/06/2026 23:25, Maíra Canal wrote:
>> +static int
>> +v3d_submit_jobs(struct v3d_submit *submit)
>> +{
>> + struct v3d_dev *v3d = submit->v3d;
>> + int ret = 0;
>> +
>> + mutex_lock(&v3d->sched_lock);
>> +
>> + for (int i = 0; i < submit->job_count; i++) {
>> + struct v3d_job *job = submit->jobs[i];
>> +
>> + v3d_push_job(job);
>> +
>> + if (i + 1 < submit->job_count) {
>> + ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]-
>> >base,
>> + dma_fence_get(job->done_fence));
>
> Is there a theoretical race where job->done_fence could be completed and
> last reference dropped between push and here?
There is, but it'll be fixed in "[PATCH v3 14/14] drm/v3d: Ensure atomic
submissions in v3d_submit_jobs()".
>
>> + if (ret)
>> + goto err;
>> + }
>> + }
>> +
>> +err:
>> + mutex_unlock(&v3d->sched_lock);
>> + return ret;
>> +}
>> +
>> static int
>> v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
>> struct v3d_dev *v3d,
>> @@ -921,18 +1048,15 @@ int
>> v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>> struct drm_file *file_priv)
>> {
>> - struct v3d_dev *v3d = to_v3d_dev(dev);
>> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
>> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv =
>> file_priv };
>> struct drm_v3d_submit_cl *args = data;
>> struct v3d_submit_ext se = {0};
>> struct v3d_bin_job *bin = NULL;
>> struct v3d_render_job *render = NULL;
>> struct v3d_job *clean_job = NULL;
>
> render and clean_job do not need to be initialised.
Ack.
>> +
>> + ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
>> + if (ret)
>> + goto fail;
>> +
>> + ret = v3d_lookup_bos(dev, file_priv,
>> + submit.jobs[submit.job_count - 1],
>
> submit.jobs[submit.job_count - 1] is repeated four times in the function
> so you could assign it to a last_job local for more readability. Up to you.
This will all disappear in the next few patches.
Considering this, would you consider R-b-ing this patch and I'll address
the initializations in the next version?
Best regards,
- Maíra
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-04 11:52 ` Maíra Canal
@ 2026-06-04 12:05 ` Tvrtko Ursulin
2026-06-04 12:10 ` Maíra Canal
0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2026-06-04 12:05 UTC (permalink / raw)
To: Maíra Canal, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 04/06/2026 12:52, Maíra Canal wrote:
> Hi Tvrtko,
>
> On 04/06/26 05:58, Tvrtko Ursulin wrote:
>>
>> On 03/06/2026 23:25, Maíra Canal wrote:
>>> +static int
>>> +v3d_submit_jobs(struct v3d_submit *submit)
>>> +{
>>> + struct v3d_dev *v3d = submit->v3d;
>>> + int ret = 0;
>>> +
>>> + mutex_lock(&v3d->sched_lock);
>>> +
>>> + for (int i = 0; i < submit->job_count; i++) {
>>> + struct v3d_job *job = submit->jobs[i];
>>> +
>>> + v3d_push_job(job);
>>> +
>>> + if (i + 1 < submit->job_count) {
>>> + ret = drm_sched_job_add_dependency(&submit->jobs[i + 1]-
>>> >base,
>>> + dma_fence_get(job->done_fence));
>>
>> Is there a theoretical race where job->done_fence could be completed
>> and last reference dropped between push and here?
>
> There is, but it'll be fixed in "[PATCH v3 14/14] drm/v3d: Ensure atomic
> submissions in v3d_submit_jobs()".
>
>>
>>> + if (ret)
>>> + goto err;
>>> + }
>>> + }
>>> +
>>> +err:
>>> + mutex_unlock(&v3d->sched_lock);
>>> + return ret;
>>> +}
>>> +
>>> static int
>>> v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
>>> struct v3d_dev *v3d,
>>> @@ -921,18 +1048,15 @@ int
>>> v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>>> struct drm_file *file_priv)
>>> {
>>> - struct v3d_dev *v3d = to_v3d_dev(dev);
>>> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
>>> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv
>>> = file_priv };
>>> struct drm_v3d_submit_cl *args = data;
>>> struct v3d_submit_ext se = {0};
>>> struct v3d_bin_job *bin = NULL;
>>> struct v3d_render_job *render = NULL;
>>> struct v3d_job *clean_job = NULL;
>>
>> render and clean_job do not need to be initialised.
>
> Ack.
>
>>> +
>>> + ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
>>> + if (ret)
>>> + goto fail;
>>> +
>>> + ret = v3d_lookup_bos(dev, file_priv,
>>> + submit.jobs[submit.job_count - 1],
>>
>> submit.jobs[submit.job_count - 1] is repeated four times in the
>> function so you could assign it to a last_job local for more
>> readability. Up to you.
>
> This will all disappear in the next few patches.
>
> Considering this, would you consider R-b-ing this patch and I'll address
> the initializations in the next version?
Normally best practice would be avoid leaving a know broken stage
mid-series. But, as I don't see this series getting half-reverted, but
instead if there will be regressions you will be the only person working
to fix them on top, I think we can dispense an exception:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Regards,
Tvrtko
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls
2026-06-04 12:05 ` Tvrtko Ursulin
@ 2026-06-04 12:10 ` Maíra Canal
0 siblings, 0 replies; 32+ messages in thread
From: Maíra Canal @ 2026-06-04 12:10 UTC (permalink / raw)
To: Tvrtko Ursulin, Melissa Wen, Iago Toral, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Christian König
Cc: kernel-dev, dri-devel
On 04/06/26 09:05, Tvrtko Ursulin wrote:
>
> On 04/06/2026 12:52, Maíra Canal wrote:
>> Hi Tvrtko,
>>
>> On 04/06/26 05:58, Tvrtko Ursulin wrote:
>>>
>>> On 03/06/2026 23:25, Maíra Canal wrote:
>>>> +static int
>>>> +v3d_submit_jobs(struct v3d_submit *submit)
>>>> +{
>>>> + struct v3d_dev *v3d = submit->v3d;
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&v3d->sched_lock);
>>>> +
>>>> + for (int i = 0; i < submit->job_count; i++) {
>>>> + struct v3d_job *job = submit->jobs[i];
>>>> +
>>>> + v3d_push_job(job);
>>>> +
>>>> + if (i + 1 < submit->job_count) {
>>>> + ret = drm_sched_job_add_dependency(&submit->jobs[i +
>>>> 1]- >base,
>>>> + dma_fence_get(job->done_fence));
>>>
>>> Is there a theoretical race where job->done_fence could be completed
>>> and last reference dropped between push and here?
>>
>> There is, but it'll be fixed in "[PATCH v3 14/14] drm/v3d: Ensure atomic
>> submissions in v3d_submit_jobs()".
>>
>>>
>>>> + if (ret)
>>>> + goto err;
>>>> + }
>>>> + }
>>>> +
>>>> +err:
>>>> + mutex_unlock(&v3d->sched_lock);
>>>> + return ret;
>>>> +}
>>>> +
>>>> static int
>>>> v3d_setup_csd_jobs_and_bos(struct drm_file *file_priv,
>>>> struct v3d_dev *v3d,
>>>> @@ -921,18 +1048,15 @@ int
>>>> v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>>>> struct drm_file *file_priv)
>>>> {
>>>> - struct v3d_dev *v3d = to_v3d_dev(dev);
>>>> - struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
>>>> + struct v3d_submit submit = { .v3d = to_v3d_dev(dev), .file_priv
>>>> = file_priv };
>>>> struct drm_v3d_submit_cl *args = data;
>>>> struct v3d_submit_ext se = {0};
>>>> struct v3d_bin_job *bin = NULL;
>>>> struct v3d_render_job *render = NULL;
>>>> struct v3d_job *clean_job = NULL;
>>>
>>> render and clean_job do not need to be initialised.
>>
>> Ack.
>>
>>>> +
>>>> + ret = v3d_attach_perfmon_to_jobs(&submit, args->perfmon_id);
>>>> + if (ret)
>>>> + goto fail;
>>>> +
>>>> + ret = v3d_lookup_bos(dev, file_priv,
>>>> + submit.jobs[submit.job_count - 1],
>>>
>>> submit.jobs[submit.job_count - 1] is repeated four times in the
>>> function so you could assign it to a last_job local for more
>>> readability. Up to you.
>>
>> This will all disappear in the next few patches.
>>
>> Considering this, would you consider R-b-ing this patch and I'll address
>> the initializations in the next version?
>
> Normally best practice would be avoid leaving a know broken stage mid-
> series. But, as I don't see this series getting half-reverted, but
To be fair about drm_sched_job_add_dependency() situation, it was
already broken before this series. We are just leaving it as broken as
it was to finally fix it in the end :)
> instead if there will be regressions you will be the only person working
> to fix them on top, I think we can dispense an exception:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Thank you a lot for all the reviews!
Best regards,
- Maíra
>
> Regards,
>
> Tvrtko
>
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2026-06-04 12:10 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-03 22:25 [PATCH v3 00/14] drm/v3d: Scheduler and submission fixes and refactoring Maíra Canal
2026-06-03 22:25 ` [PATCH v3 01/14] drm/v3d: Drop unused drm_encoder.h include from v3d_drv.h Maíra Canal
2026-06-03 22:25 ` [PATCH v3 02/14] drm/v3d: Clear queue->active_job when v3d_fence_create() fails Maíra Canal
2026-06-04 2:50 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 03/14] drm/v3d: Use inline lock for dma fence initialization Maíra Canal
2026-06-04 2:49 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 04/14] drm/v3d: Replace spin_lock_irqsave() with spin_lock() Maíra Canal
2026-06-04 2:51 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 05/14] drm/v3d: Extract v3d_job_add_syncobjs() helper Maíra Canal
2026-06-03 22:25 ` [PATCH v3 06/14] drm/v3d: Reject invalid syncobj handles in submit ioctls Maíra Canal
2026-06-04 2:51 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 07/14] drm/v3d: Migrate BO reservation locking to DRM exec Maíra Canal
2026-06-03 22:25 ` [PATCH v3 08/14] drm/v3d: Introduce struct v3d_submit and convert CL/TFU/CSD ioctls Maíra Canal
2026-06-04 2:54 ` sashiko-bot
2026-06-04 8:58 ` Tvrtko Ursulin
2026-06-04 11:52 ` Maíra Canal
2026-06-04 12:05 ` Tvrtko Ursulin
2026-06-04 12:10 ` Maíra Canal
2026-06-03 22:25 ` [PATCH v3 09/14] drm/v3d: Make v3d_get_cpu_indirect_csd_params() a pure parser Maíra Canal
2026-06-04 2:53 ` sashiko-bot
2026-06-03 22:25 ` [PATCH v3 10/14] drm/v3d: Convert submit helpers to operate on struct v3d_submit Maíra Canal
2026-06-04 9:49 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 11/14] drm/v3d: Refactor CPU ioctl into unified submission chain Maíra Canal
2026-06-04 2:58 ` sashiko-bot
2026-06-04 9:54 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 12/14] drm/v3d: Split BO fence attach from syncobj output handling Maíra Canal
2026-06-04 9:59 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 13/14] drm/v3d: Reject invalid out_sync handles in submit ioctls Maíra Canal
2026-06-04 2:59 ` sashiko-bot
2026-06-04 10:12 ` Tvrtko Ursulin
2026-06-03 22:25 ` [PATCH v3 14/14] drm/v3d: Ensure atomic submissions in v3d_submit_jobs() Maíra Canal
2026-06-04 10:25 ` Tvrtko Ursulin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.