linux-kernel.vger.kernel.org archive mirror
* [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI
@ 2025-04-24  8:38 Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 01/10] drm/debugfs: output client_id in drm_clients_info Pierre-Eric Pelloux-Prayer
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  Cc: Pierre-Eric Pelloux-Prayer, Christian König,
	Maíra Canal, Thomas Hellström, Abhinav Kumar,
	Alex Deucher, Boris Brezillon, Danilo Krummrich, David Airlie,
	Dmitry Baryshkov, Felix Kuehling, Frank Binns, Jonathan Corbet,
	Liviu Dudau, Lizhi Hou, Lucas De Marchi, Lucas Stach, Lyude Paul,
	Maarten Lankhorst, Matt Coster, Matthew Brost, Maxime Ripard,
	Melissa Wen, Min Ma, Oded Gabbay, Philipp Stanner, Qiang Yu,
	Rob Clark, Rob Herring, Rodrigo Vivi, Simona Vetter, Steven Price,
	Sumit Semwal, Thomas Zimmermann, amd-gfx, dri-devel, etnaviv,
	freedreno, intel-xe, lima, linaro-mm-sig, linux-arm-msm,
	linux-doc, linux-kernel, linux-media, nouveau

Hi,

The initial goal of this series was to improve the drm and amdgpu
trace events to expose more of the inner workings of the scheduler
and drivers to developers via tools.

Then, the series evolved to focus only on gpu_scheduler. The changes
around vblank events, as well as the amdgpu ones, will be part of
separate series.

Moreover, Sima suggested making some trace events stable uAPI, so
tools can rely on them long term.

The first patches extend and clean up the gpu scheduler events,
then add a documentation entry in drm-uapi.rst.

The last two patches are new in v8. One is based on a suggestion
from Tvrtko and gets rid of drm_sched_job::id. The other is a
cleanup of the amdgpu trace events to use the fence=%llu:%llu format.

The drm_sched_job patches don't affect gpuvis: it has code to parse
the gpu_scheduler events, but these events are not enabled.
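
To experiment with the updated events, they can be enabled like any
other tracepoints (a sketch, assuming tracefs is mounted at
/sys/kernel/tracing):

  echo 1 > /sys/kernel/tracing/events/gpu_scheduler/enable
  cat /sys/kernel/tracing/trace_pipe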

Changes since v8:
* swapped patches 8 & 9
* rebased on drm-next

Changes since v7:
* uint64_t -> u64
* reworked dependencies tracing (Tvrtko)
* use common name prefix for all events (Tvrtko)
* dropped drm_sched_job::id (Tvrtko)

Useful links:
- userspace tool using the updated events:
https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
- v8:
https://lists.freedesktop.org/archives/dri-devel/2025-March/496781.html

Pierre-Eric Pelloux-Prayer (10):
  drm/debugfs: output client_id in drm_clients_info
  drm/sched: store the drm client_id in drm_sched_fence
  drm/sched: add device name to the drm_sched_process_job event
  drm/sched: cleanup gpu_scheduler trace events
  drm/sched: trace dependencies for gpu jobs
  drm/sched: add the drm_client_id to the drm_sched_run/exec_job events
  drm/sched: cleanup event names
  drm: get rid of drm_sched_job::id
  drm/doc: document some tracepoints as uAPI
  drm/amdgpu: update trace format to match gpu_scheduler_trace

 Documentation/gpu/drm-uapi.rst                |  19 ++++
 drivers/accel/amdxdna/aie2_ctx.c              |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h       |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h     |  32 +++---
 drivers/gpu/drm/drm_debugfs.c                 |  10 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
 drivers/gpu/drm/imagination/pvr_job.c         |   2 +-
 drivers/gpu/drm/imagination/pvr_queue.c       |   5 +-
 drivers/gpu/drm/imagination/pvr_queue.h       |   2 +-
 drivers/gpu/drm/lima/lima_gem.c               |   2 +-
 drivers/gpu/drm/lima/lima_sched.c             |   6 +-
 drivers/gpu/drm/lima/lima_sched.h             |   3 +-
 drivers/gpu/drm/msm/msm_gem_submit.c          |   8 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c       |   3 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |   2 +-
 drivers/gpu/drm/panthor/panthor_drv.c         |   3 +-
 drivers/gpu/drm/panthor/panthor_mmu.c         |   2 +-
 drivers/gpu/drm/panthor/panthor_sched.c       |   5 +-
 drivers/gpu/drm/panthor/panthor_sched.h       |   3 +-
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 100 +++++++++++++-----
 drivers/gpu/drm/scheduler/sched_entity.c      |  16 ++-
 drivers/gpu/drm/scheduler/sched_fence.c       |   4 +-
 drivers/gpu/drm/scheduler/sched_internal.h    |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c        |  11 +-
 .../gpu/drm/scheduler/tests/mock_scheduler.c  |   2 +-
 drivers/gpu/drm/v3d/v3d_submit.c              |   2 +-
 drivers/gpu/drm/xe/xe_sched_job.c             |   3 +-
 include/drm/gpu_scheduler.h                   |  13 ++-
 31 files changed, 184 insertions(+), 97 deletions(-)

-- 
2.43.0



* [PATCH v9 01/10] drm/debugfs: output client_id in drm_clients_info
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence Pierre-Eric Pelloux-Prayer
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, Christian König, dri-devel,
	linux-kernel

client_id is a unique id used by fdinfo. Having it listed in the
'clients' output means a userspace application can correlate the
fields, e.g. given an fdinfo id, look up the fdinfo name.

Given that client_id is a uint64_t, we use a %20llu printf format to
keep the output aligned (20 being the digit count of the largest
uint64_t).
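
As a quick illustration of the width specifier (a standalone
userspace sketch, not part of the patch):

	#include <stdio.h>

	int main(void)
	{
		/* 20 columns = digit count of UINT64_MAX (18446744073709551615) */
		printf("%20llu\n", 1ULL);
		printf("%20llu\n", 18446744073709551615ULL);
		return 0;
	}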

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/drm_debugfs.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 3dfd8b34dceb..abceb28b23fc 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -77,14 +77,15 @@ static int drm_clients_info(struct seq_file *m, void *data)
 	kuid_t uid;
 
 	seq_printf(m,
-		   "%20s %5s %3s master a %5s %10s %*s\n",
+		   "%20s %5s %3s master a %5s %10s %*s %20s\n",
 		   "command",
 		   "tgid",
 		   "dev",
 		   "uid",
 		   "magic",
 		   DRM_CLIENT_NAME_MAX_LEN,
-		   "name");
+		   "name",
+		   "id");
 
 	/* dev->filelist is sorted youngest first, but we want to present
 	 * oldest first (i.e. kernel, servers, clients), so walk backwards.
@@ -100,7 +101,7 @@ static int drm_clients_info(struct seq_file *m, void *data)
 		pid = rcu_dereference(priv->pid);
 		task = pid_task(pid, PIDTYPE_TGID);
 		uid = task ? __task_cred(task)->euid : GLOBAL_ROOT_UID;
-		seq_printf(m, "%20s %5d %3d   %c    %c %5d %10u %*s\n",
+		seq_printf(m, "%20s %5d %3d   %c    %c %5d %10u %*s %20llu\n",
 			   task ? task->comm : "<unknown>",
 			   pid_vnr(pid),
 			   priv->minor->index,
@@ -109,7 +110,8 @@ static int drm_clients_info(struct seq_file *m, void *data)
 			   from_kuid_munged(seq_user_ns(m), uid),
 			   priv->magic,
 			   DRM_CLIENT_NAME_MAX_LEN,
-			   priv->client_name ? priv->client_name : "<unset>");
+			   priv->client_name ? priv->client_name : "<unset>",
+			   priv->client_id);
 		rcu_read_unlock();
 		mutex_unlock(&priv->client_name_lock);
 	}
-- 
2.43.0



* [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 01/10] drm/debugfs: output client_id in drm_clients_info Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-05-14 12:44   ` Philipp Stanner
  2025-04-24  8:38 ` [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Min Ma, Lizhi Hou, Oded Gabbay, Felix Kuehling, Alex Deucher,
	Christian König, David Airlie, Simona Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Frank Binns, Matt Coster,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Qiang Yu,
	Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, Lyude Paul, Danilo Krummrich, Boris Brezillon,
	Rob Herring, Steven Price, Liviu Dudau, Matthew Brost,
	Philipp Stanner, Melissa Wen, Maíra Canal, Lucas De Marchi,
	Thomas Hellström, Rodrigo Vivi
  Cc: Pierre-Eric Pelloux-Prayer, Christian König, dri-devel,
	linux-kernel, amd-gfx, etnaviv, lima, linux-arm-msm, freedreno,
	nouveau, intel-xe

This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.

This requires changing all the users of drm_sched_job_init to
add an extra parameter.

The newly added drm_client_id field in drm_sched_fence is somewhat
redundant with the owner field. One suggestion I received was to
merge those two fields, but this can't be done right now, as amdgpu
uses special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; once that's done, we should be able to squash
owner/drm_client_id together.
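
For driver authors, the call-site change boils down to something like
this (a sketch with made-up names, not code from the series):

	int my_driver_submit(struct drm_file *file, struct my_job *job,
			     struct drm_sched_entity *entity)
	{
		/* the drm_file's client_id now flows into the sched fence */
		return drm_sched_job_init(&job->base, entity, 1, NULL,
					  file->client_id);
	}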

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/accel/amdxdna/aie2_ctx.c                 |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c          |  8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h          |  3 ++-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c     |  2 +-
 drivers/gpu/drm/imagination/pvr_job.c            |  2 +-
 drivers/gpu/drm/imagination/pvr_queue.c          |  5 +++--
 drivers/gpu/drm/imagination/pvr_queue.h          |  2 +-
 drivers/gpu/drm/lima/lima_gem.c                  |  2 +-
 drivers/gpu/drm/lima/lima_sched.c                |  6 ++++--
 drivers/gpu/drm/lima/lima_sched.h                |  3 ++-
 drivers/gpu/drm/msm/msm_gem_submit.c             |  8 +++++---
 drivers/gpu/drm/nouveau/nouveau_sched.c          |  3 ++-
 drivers/gpu/drm/panfrost/panfrost_drv.c          |  2 +-
 drivers/gpu/drm/panthor/panthor_drv.c            |  3 ++-
 drivers/gpu/drm/panthor/panthor_mmu.c            |  2 +-
 drivers/gpu/drm/panthor/panthor_sched.c          |  5 +++--
 drivers/gpu/drm/panthor/panthor_sched.h          |  3 ++-
 drivers/gpu/drm/scheduler/sched_fence.c          |  4 +++-
 drivers/gpu/drm/scheduler/sched_internal.h       |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c           |  6 ++++--
 drivers/gpu/drm/scheduler/tests/mock_scheduler.c |  2 +-
 drivers/gpu/drm/v3d/v3d_submit.c                 |  2 +-
 drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
 include/drm/gpu_scheduler.h                      | 10 +++++++++-
 26 files changed, 62 insertions(+), 34 deletions(-)

diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index e04549f64d69..3e38a5f637ea 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -848,7 +848,8 @@ int aie2_cmd_submit(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job,
 		goto up_sem;
 	}
 
-	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity, 1, hwctx);
+	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity, 1, hwctx,
+				 hwctx->client->filp->client_id);
 	if (ret) {
 		XDNA_ERR(xdna, "DRM job init failed, ret %d", ret);
 		goto free_chain;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 4cec3a873995..1a77ba7036c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -639,7 +639,7 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
 		goto err;
 	}
 
-	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job);
+	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job, 0);
 	if (ret)
 		goto err;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 82df06a72ee0..5a231b997d65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -293,7 +293,8 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
 
 	for (i = 0; i < p->gang_size; ++i) {
 		ret = amdgpu_job_alloc(p->adev, vm, p->entities[i], vm,
-				       num_ibs[i], &p->jobs[i]);
+				       num_ibs[i], &p->jobs[i],
+				       p->filp->client_id);
 		if (ret)
 			goto free_all_kdata;
 		p->jobs[i]->enforce_isolation = p->adev->enforce_isolation[fpriv->xcp_id];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index acb21fc8b3ce..75262ce8db27 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -204,7 +204,8 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		     struct drm_sched_entity *entity, void *owner,
-		     unsigned int num_ibs, struct amdgpu_job **job)
+		     unsigned int num_ibs, struct amdgpu_job **job,
+		     u64 drm_client_id)
 {
 	if (num_ibs == 0)
 		return -EINVAL;
@@ -222,7 +223,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (!entity)
 		return 0;
 
-	return drm_sched_job_init(&(*job)->base, entity, 1, owner);
+	return drm_sched_job_init(&(*job)->base, entity, 1, owner,
+				  drm_client_id);
 }
 
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
@@ -232,7 +234,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
 {
 	int r;
 
-	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job);
+	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job, 0);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
index ce6b9ba967ff..5a8bc6342222 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -90,7 +90,8 @@ static inline struct amdgpu_ring *amdgpu_job_ring(struct amdgpu_job *job)
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		     struct drm_sched_entity *entity, void *owner,
-		     unsigned int num_ibs, struct amdgpu_job **job);
+		     unsigned int num_ibs, struct amdgpu_job **job,
+		     u64 drm_client_id);
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
 			     struct drm_sched_entity *entity, void *owner,
 			     size_t size, enum amdgpu_ib_pool_type pool_type,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 3c0a5c3e0e3d..76c742328edb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -534,7 +534,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 
 	ret = drm_sched_job_init(&submit->sched_job,
 				 &ctx->sched_entity[args->pipe],
-				 1, submit->ctx);
+				 1, submit->ctx, file->client_id);
 	if (ret)
 		goto err_submit_put;
 
diff --git a/drivers/gpu/drm/imagination/pvr_job.c b/drivers/gpu/drm/imagination/pvr_job.c
index 59b334d094fa..7564b0f21b42 100644
--- a/drivers/gpu/drm/imagination/pvr_job.c
+++ b/drivers/gpu/drm/imagination/pvr_job.c
@@ -446,7 +446,7 @@ create_job(struct pvr_device *pvr_dev,
 	if (err)
 		goto err_put_job;
 
-	err = pvr_queue_job_init(job);
+	err = pvr_queue_job_init(job, pvr_file->file->client_id);
 	if (err)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 5e9bc0992824..5a41ee79fed6 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -1073,6 +1073,7 @@ static int pvr_queue_cleanup_fw_context(struct pvr_queue *queue)
 /**
  * pvr_queue_job_init() - Initialize queue related fields in a pvr_job object.
  * @job: The job to initialize.
+ * @drm_client_id: drm_file.client_id submitting the job
  *
  * Bind the job to a queue and allocate memory to guarantee pvr_queue_job_arm()
  * and pvr_queue_job_push() can't fail. We also make sure the context type is
@@ -1082,7 +1083,7 @@ static int pvr_queue_cleanup_fw_context(struct pvr_queue *queue)
  *  * 0 on success, or
  *  * An error code if something failed.
  */
-int pvr_queue_job_init(struct pvr_job *job)
+int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id)
 {
 	/* Fragment jobs need at least one native fence wait on the geometry job fence. */
 	u32 min_native_dep_count = job->type == DRM_PVR_JOB_TYPE_FRAGMENT ? 1 : 0;
@@ -1099,7 +1100,7 @@ int pvr_queue_job_init(struct pvr_job *job)
 	if (!pvr_cccb_cmdseq_can_fit(&queue->cccb, job_cmds_size(job, min_native_dep_count)))
 		return -E2BIG;
 
-	err = drm_sched_job_init(&job->base, &queue->entity, 1, THIS_MODULE);
+	err = drm_sched_job_init(&job->base, &queue->entity, 1, THIS_MODULE, drm_client_id);
 	if (err)
 		return err;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.h b/drivers/gpu/drm/imagination/pvr_queue.h
index 93fe9ac9f58c..fc1986d73fc8 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.h
+++ b/drivers/gpu/drm/imagination/pvr_queue.h
@@ -143,7 +143,7 @@ struct pvr_queue {
 
 bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
 
-int pvr_queue_job_init(struct pvr_job *job);
+int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
 
 void pvr_queue_job_cleanup(struct pvr_job *job);
 
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 5deec673c11e..9722b847a539 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -341,7 +341,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
 
 	err = lima_sched_task_init(
 		submit->task, submit->ctx->context + submit->pipe,
-		bos, submit->nr_bos, vm);
+		bos, submit->nr_bos, vm, file->client_id);
 	if (err)
 		goto err_out1;
 
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index 7934098e651b..954f4325b859 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -113,7 +113,8 @@ static inline struct lima_sched_pipe *to_lima_pipe(struct drm_gpu_scheduler *sch
 int lima_sched_task_init(struct lima_sched_task *task,
 			 struct lima_sched_context *context,
 			 struct lima_bo **bos, int num_bos,
-			 struct lima_vm *vm)
+			 struct lima_vm *vm,
+			 u64 drm_client_id)
 {
 	int err, i;
 
@@ -124,7 +125,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
 	for (i = 0; i < num_bos; i++)
 		drm_gem_object_get(&bos[i]->base.base);
 
-	err = drm_sched_job_init(&task->base, &context->base, 1, vm);
+	err = drm_sched_job_init(&task->base, &context->base, 1, vm,
+				 drm_client_id);
 	if (err) {
 		kfree(task->bos);
 		return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
index 85b23ba901d5..1a08faf8a529 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -88,7 +88,8 @@ struct lima_sched_pipe {
 int lima_sched_task_init(struct lima_sched_task *task,
 			 struct lima_sched_context *context,
 			 struct lima_bo **bos, int num_bos,
-			 struct lima_vm *vm);
+			 struct lima_vm *vm,
+			 u64 drm_client_id);
 void lima_sched_task_fini(struct lima_sched_task *task);
 
 int lima_sched_context_init(struct lima_sched_pipe *pipe,
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 3e9aa2cc38ef..d9be0fe3d674 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -30,7 +30,7 @@
 static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		struct msm_gpu *gpu,
 		struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
-		uint32_t nr_cmds)
+		uint32_t nr_cmds, u64 drm_client_id)
 {
 	static atomic_t ident = ATOMIC_INIT(0);
 	struct msm_gem_submit *submit;
@@ -54,7 +54,8 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		return ERR_PTR(ret);
 	}
 
-	ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue);
+	ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue,
+				 drm_client_id);
 	if (ret) {
 		kfree(submit->hw_fence);
 		kfree(submit);
@@ -693,7 +694,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds,
+			       file->client_id);
 	if (IS_ERR(submit)) {
 		ret = PTR_ERR(submit);
 		goto out_post_unlock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index d326e55d2d24..460a5fb02412 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -87,7 +87,8 @@ nouveau_job_init(struct nouveau_job *job,
 	}
 
 	ret = drm_sched_job_init(&job->base, &sched->entity,
-				 args->credits, NULL);
+				 args->credits, NULL,
+				 job->file_priv->client_id);
 	if (ret)
 		goto err_free_chains;
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b87f83e94eda..d5c2c6530ed8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -312,7 +312,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 
 	ret = drm_sched_job_init(&job->base,
 				 &file_priv->sched_entity[slot],
-				 1, NULL);
+				 1, NULL, file->client_id);
 	if (ret)
 		goto out_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index 06fe46e32073..bd8e1900c919 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -989,7 +989,8 @@ static int panthor_ioctl_group_submit(struct drm_device *ddev, void *data,
 		const struct drm_panthor_queue_submit *qsubmit = &jobs_args[i];
 		struct drm_sched_job *job;
 
-		job = panthor_job_create(pfile, args->group_handle, qsubmit);
+		job = panthor_job_create(pfile, args->group_handle, qsubmit,
+					 file->client_id);
 		if (IS_ERR(job)) {
 			ret = PTR_ERR(job);
 			goto out_cleanup_submit_ctx;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 12a02e28f50f..e0c79bd2d173 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -2516,7 +2516,7 @@ panthor_vm_bind_job_create(struct drm_file *file,
 	kref_init(&job->refcount);
 	job->vm = panthor_vm_get(vm);
 
-	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm);
+	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm, file->client_id);
 	if (ret)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 446ec780eb4a..2af860c9068a 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -3729,7 +3729,8 @@ struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job)
 struct drm_sched_job *
 panthor_job_create(struct panthor_file *pfile,
 		   u16 group_handle,
-		   const struct drm_panthor_queue_submit *qsubmit)
+		   const struct drm_panthor_queue_submit *qsubmit,
+		   u64 drm_client_id)
 {
 	struct panthor_group_pool *gpool = pfile->groups;
 	struct panthor_job *job;
@@ -3801,7 +3802,7 @@ panthor_job_create(struct panthor_file *pfile,
 
 	ret = drm_sched_job_init(&job->base,
 				 &job->group->queues[job->queue_idx]->entity,
-				 credits, job->group);
+				 credits, job->group, drm_client_id);
 	if (ret)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_sched.h b/drivers/gpu/drm/panthor/panthor_sched.h
index e650a445cf50..742b0b4ff3a3 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.h
+++ b/drivers/gpu/drm/panthor/panthor_sched.h
@@ -29,7 +29,8 @@ int panthor_group_get_state(struct panthor_file *pfile,
 struct drm_sched_job *
 panthor_job_create(struct panthor_file *pfile,
 		   u16 group_handle,
-		   const struct drm_panthor_queue_submit *qsubmit);
+		   const struct drm_panthor_queue_submit *qsubmit,
+		   u64 drm_client_id);
 struct drm_sched_job *panthor_job_get(struct drm_sched_job *job);
 struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job);
 void panthor_job_put(struct drm_sched_job *job);
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index e971528504a5..d208d384d38d 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -206,7 +206,8 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 EXPORT_SYMBOL(to_drm_sched_fence);
 
 struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
-					      void *owner)
+					      void *owner,
+					      u64 drm_client_id)
 {
 	struct drm_sched_fence *fence = NULL;
 
@@ -215,6 +216,7 @@ struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
 		return NULL;
 
 	fence->owner = owner;
+	fence->drm_client_id = drm_client_id;
 	spin_lock_init(&fence->lock);
 
 	return fence;
diff --git a/drivers/gpu/drm/scheduler/sched_internal.h b/drivers/gpu/drm/scheduler/sched_internal.h
index 599cf6e1bb74..7ea5a6736f98 100644
--- a/drivers/gpu/drm/scheduler/sched_internal.h
+++ b/drivers/gpu/drm/scheduler/sched_internal.h
@@ -24,7 +24,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity);
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity);
 
 struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *s_entity,
-					      void *owner);
+					      void *owner, u64 drm_client_id);
 void drm_sched_fence_init(struct drm_sched_fence *fence,
 			  struct drm_sched_entity *entity);
 void drm_sched_fence_free(struct drm_sched_fence *fence);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 829579c41c6b..60611618f3ab 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -764,6 +764,7 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
  * @credits: the number of credits this job contributes to the schedulers
  * credit limit
  * @owner: job owner for debugging
+ * @drm_client_id: drm_file.client_id of the owner
  *
  * Refer to drm_sched_entity_push_job() documentation
  * for locking considerations.
@@ -784,7 +785,8 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
  */
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
-		       u32 credits, void *owner)
+		       u32 credits, void *owner,
+		       uint64_t drm_client_id)
 {
 	if (!entity->rq) {
 		/* This will most likely be followed by missing frames
@@ -810,7 +812,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 
 	job->entity = entity;
 	job->credits = credits;
-	job->s_fence = drm_sched_fence_alloc(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner, drm_client_id);
 	if (!job->s_fence)
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index f999c8859cf7..09ffbdb32d76 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -35,7 +35,7 @@ drm_mock_sched_entity_new(struct kunit *test,
 	ret = drm_sched_entity_init(&entity->base,
 				    priority,
 				    &drm_sched, 1,
-				    NULL);
+				    NULL, 1);
 	KUNIT_ASSERT_EQ(test, ret, 0);
 
 	entity->test = test;
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 4ff5de46fb22..5171ffe9012d 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -169,7 +169,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	job->file = file_priv;
 
 	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-				 1, v3d_priv);
+				 1, v3d_priv, file_priv->client_id);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
index 1905ca590965..f4679cb9a56b 100644
--- a/drivers/gpu/drm/xe/xe_sched_job.c
+++ b/drivers/gpu/drm/xe/xe_sched_job.c
@@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 	kref_init(&job->refcount);
 	xe_exec_queue_get(job->q);
 
-	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
+	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
+				 q->xef->drm->client_id);
 	if (err)
 		goto err_free;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 1a7e377d4cbb..6fe3b4c0cffb 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -305,6 +305,13 @@ struct drm_sched_fence {
          * @owner: job owner for debugging
          */
 	void				*owner;
+
+	/**
+	 * @drm_client_id:
+	 *
+	 * The client_id of the drm_file which owns the job.
+	 */
+	uint64_t			drm_client_id;
 };
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
@@ -629,7 +636,8 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
 
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
-		       u32 credits, void *owner);
+		       u32 credits, void *owner,
+		       u64 drm_client_id);
 void drm_sched_job_arm(struct drm_sched_job *job);
 void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
 int drm_sched_job_add_dependency(struct drm_sched_job *job,
-- 
2.43.0



* [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 01/10] drm/debugfs: output client_id in drm_clients_info Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-05-19 15:34   ` Danilo Krummrich
  2025-04-24  8:38 ` [PATCH v9 04/10] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, Christian König, dri-devel,
	linux-kernel

Since the scheduler switched from kthreads to workqueues in
commit a6149f039369 ("drm/sched: Convert drm scheduler to use a work
queue rather than kthread"), userspace applications can no longer
determine the device from the PID of the threads sending the trace
events.

Previously, each queue had its own kthread, which kept the same PID
for its whole lifetime. So, at least for amdgpu, it was possible to
associate a PID with the hardware queues of each GPU in the system.
Then, when a drm_run_job trace event was received by userspace, the
source PID made it possible to associate it back to the correct GPU.

With workqueues this is no longer possible, so the event needs to
contain the dev_name() to identify the device.
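
With this change, a drm_run_job event line looks something like this
(illustrative values; the dev string is whatever dev_name() returns,
e.g. a PCI address for PCI devices):

  drm_run_job: dev=0000:0b:00.0, entity=ffff8d0a4c3f1c00, id=4823, fence=ffff8d0a4e51a200, ring=gfx_0.0.0, job count:0, hw job count:1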

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler_trace.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index f56e77e7f6d0..713df3516a17 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -42,6 +42,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			     __field(uint64_t, id)
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
+			     __string(dev, dev_name(sched_job->sched->dev))
 			     ),
 
 	    TP_fast_assign(
@@ -52,9 +53,10 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			   __entry->job_count = spsc_queue_count(&entity->job_queue);
 			   __entry->hw_job_count = atomic_read(
 				   &sched_job->sched->credit_count);
+			   __assign_str(dev);
 			   ),
-	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
-		      __entry->entity, __entry->id,
+	    TP_printk("dev=%s, entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
+		      __get_str(dev), __entry->entity, __entry->id,
 		      __entry->fence, __get_str(name),
 		      __entry->job_count, __entry->hw_job_count)
 );
-- 
2.43.0



* [PATCH v9 04/10] drm/sched: cleanup gpu_scheduler trace events
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (2 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal
  Cc: Pierre-Eric Pelloux-Prayer, Tvrtko Ursulin, dri-devel,
	linux-kernel, linux-media, linaro-mm-sig

A fence uniquely identifies a job, so this commit updates the places
where a kernel pointer was used as an identifier, replacing it with
the fence printed as:

   "fence=%llu:%llu"

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 44 ++++++++++---------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 713df3516a17..6f5bd05131aa 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -36,28 +36,28 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity),
 	    TP_STRUCT__entry(
-			     __field(struct drm_sched_entity *, entity)
-			     __field(struct dma_fence *, fence)
-			     __string(name, sched_job->sched->name)
 			     __field(uint64_t, id)
+			     __string(name, sched_job->sched->name)
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
 			     __string(dev, dev_name(sched_job->sched->dev))
+			     __field(u64, fence_context)
+			     __field(u64, fence_seqno)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->entity = entity;
 			   __entry->id = sched_job->id;
-			   __entry->fence = &sched_job->s_fence->finished;
 			   __assign_str(name);
 			   __entry->job_count = spsc_queue_count(&entity->job_queue);
 			   __entry->hw_job_count = atomic_read(
 				   &sched_job->sched->credit_count);
 			   __assign_str(dev);
+			   __entry->fence_context = sched_job->s_fence->finished.context;
+			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
 			   ),
-	    TP_printk("dev=%s, entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
-		      __get_str(dev), __entry->entity, __entry->id,
-		      __entry->fence, __get_str(name),
+	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d",
+		      __get_str(dev), __entry->id,
+		      __entry->fence_context, __entry->fence_seqno, __get_str(name),
 		      __entry->job_count, __entry->hw_job_count)
 );
 
@@ -75,37 +75,39 @@ TRACE_EVENT(drm_sched_process_job,
 	    TP_PROTO(struct drm_sched_fence *fence),
 	    TP_ARGS(fence),
 	    TP_STRUCT__entry(
-		    __field(struct dma_fence *, fence)
+		    __field(u64, fence_context)
+		    __field(u64, fence_seqno)
 		    ),
 
 	    TP_fast_assign(
-		    __entry->fence = &fence->finished;
+		    __entry->fence_context = fence->finished.context;
+		    __entry->fence_seqno = fence->finished.seqno;
 		    ),
-	    TP_printk("fence=%p signaled", __entry->fence)
+	    TP_printk("fence=%llu:%llu signaled",
+		      __entry->fence_context, __entry->fence_seqno)
 );
 
 TRACE_EVENT(drm_sched_job_wait_dep,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct dma_fence *fence),
 	    TP_ARGS(sched_job, fence),
 	    TP_STRUCT__entry(
-			     __string(name, sched_job->sched->name)
+			     __field(u64, fence_context)
+			     __field(u64, fence_seqno)
 			     __field(uint64_t, id)
-			     __field(struct dma_fence *, fence)
-			     __field(uint64_t, ctx)
-			     __field(unsigned, seqno)
+			     __field(u64, ctx)
+			     __field(u64, seqno)
 			     ),
 
 	    TP_fast_assign(
-			   __assign_str(name);
+			   __entry->fence_context = sched_job->s_fence->finished.context;
+			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
 			   __entry->id = sched_job->id;
-			   __entry->fence = fence;
 			   __entry->ctx = fence->context;
 			   __entry->seqno = fence->seqno;
 			   ),
-	    TP_printk("job ring=%s, id=%llu, depends fence=%p, context=%llu, seq=%u",
-		      __get_str(name), __entry->id,
-		      __entry->fence, __entry->ctx,
-		      __entry->seqno)
+	    TP_printk("fence=%llu:%llu, id=%llu depends on unsignalled fence=%llu:%llu",
+		      __entry->fence_context, __entry->fence_seqno, __entry->id,
+		      __entry->ctx, __entry->seqno)
 );
 
 #endif /* _GPU_SCHED_TRACE_H_ */
-- 
2.43.0



* [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (3 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 04/10] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-05-14 12:46   ` Philipp Stanner
  2025-04-24  8:38 ` [PATCH v9 06/10] drm/sched: add the drm_client_id to the drm_sched_run/exec_job events Pierre-Eric Pelloux-Prayer
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal
  Cc: Pierre-Eric Pelloux-Prayer, Tvrtko Ursulin, dri-devel,
	linux-kernel, linux-media, linaro-mm-sig

We can't trace dependencies from drm_sched_job_add_dependency
because, when it's called, the job's fence is not available yet.

So instead, each dependency is traced individually when
drm_sched_entity_push_job() is called.

Tracing the dependencies allows tools to analyze the dependencies
between jobs (previously this was only possible for fences traced
by drm_sched_job_wait_dep).
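
For example, a job with two dependencies now emits two events like
these when it's pushed (illustrative values):

  drm_sched_job_add_dep: fence=100:7, id=42 depends on fence=90:3
  drm_sched_job_add_dep: fence=100:7, id=42 depends on fence=95:1

Note that the xa_for_each() walk in drm_sched_entity_push_job() is
guarded by trace_drm_sched_job_add_dep_enabled(), so it only runs
when the event is actually enabled.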

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 23 +++++++++++++++++++
 drivers/gpu/drm/scheduler/sched_entity.c      |  8 +++++++
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 6f5bd05131aa..5d9992ad47d3 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -87,6 +87,29 @@ TRACE_EVENT(drm_sched_process_job,
 		      __entry->fence_context, __entry->fence_seqno)
 );
 
+TRACE_EVENT(drm_sched_job_add_dep,
+	TP_PROTO(struct drm_sched_job *sched_job, struct dma_fence *fence),
+	TP_ARGS(sched_job, fence),
+	TP_STRUCT__entry(
+		    __field(u64, fence_context)
+		    __field(u64, fence_seqno)
+		    __field(u64, id)
+		    __field(u64, ctx)
+		    __field(u64, seqno)
+		    ),
+
+	TP_fast_assign(
+		    __entry->fence_context = sched_job->s_fence->finished.context;
+		    __entry->fence_seqno = sched_job->s_fence->finished.seqno;
+		    __entry->id = sched_job->id;
+		    __entry->ctx = fence->context;
+		    __entry->seqno = fence->seqno;
+		    ),
+	TP_printk("fence=%llu:%llu, id=%llu depends on fence=%llu:%llu",
+		  __entry->fence_context, __entry->fence_seqno, __entry->id,
+		  __entry->ctx, __entry->seqno)
+);
+
 TRACE_EVENT(drm_sched_job_wait_dep,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct dma_fence *fence),
 	    TP_ARGS(sched_job, fence),
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index bd39db7bb240..be579e132711 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -587,6 +587,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 	ktime_t submit_ts;
 
 	trace_drm_sched_job(sched_job, entity);
+
+	if (trace_drm_sched_job_add_dep_enabled()) {
+		struct dma_fence *entry;
+		unsigned long index;
+
+		xa_for_each(&sched_job->dependencies, index, entry)
+			trace_drm_sched_job_add_dep(sched_job, entry);
+	}
 	atomic_inc(entity->rq->sched->score);
 	WRITE_ONCE(entity->last_user, current->group_leader);
 
-- 
2.43.0



* [PATCH v9 06/10] drm/sched: add the drm_client_id to the drm_sched_run/exec_job events
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (4 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 07/10] drm/sched: cleanup event names Pierre-Eric Pelloux-Prayer
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, Christian König, dri-devel,
	linux-kernel

For processes with multiple drm_file instances, the drm_client_id is
the only way to map jobs back to their unique owner.

It's even more useful if the drm client_name is set, because a tool
can then map jobs to the client name instead of only having access
to the process name.
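
With the field added, an event line looks something like this
(illustrative values):

  drm_sched_job: dev=0000:0b:00.0, id=4823, fence=100:7, ring=gfx_0.0.0, job count:0, hw job count:1, client_id:42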

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler_trace.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 5d9992ad47d3..38cdd659a286 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -43,6 +43,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			     __string(dev, dev_name(sched_job->sched->dev))
 			     __field(u64, fence_context)
 			     __field(u64, fence_seqno)
+			     __field(u64, client_id)
 			     ),
 
 	    TP_fast_assign(
@@ -54,11 +55,12 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			   __assign_str(dev);
 			   __entry->fence_context = sched_job->s_fence->finished.context;
 			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
+			   __entry->client_id = sched_job->s_fence->drm_client_id;
 			   ),
-	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d",
+	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d, client_id:%llu",
 		      __get_str(dev), __entry->id,
 		      __entry->fence_context, __entry->fence_seqno, __get_str(name),
-		      __entry->job_count, __entry->hw_job_count)
+		      __entry->job_count, __entry->hw_job_count, __entry->client_id)
 );
 
 DEFINE_EVENT(drm_sched_job, drm_sched_job,
-- 
2.43.0



* [PATCH v9 07/10] drm/sched: cleanup event names
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (5 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 06/10] drm/sched: add the drm_client_id to the drm_sched_run/exec_job events Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-24  8:38 ` [PATCH v9 08/10] drm: get rid of drm_sched_job::id Pierre-Eric Pelloux-Prayer
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, Tvrtko Ursulin, dri-devel,
	linux-kernel

All events now start with the same prefix (drm_sched_job_).

drm_sched_job_wait_dep was misleading because it didn't actually
wait for anything. It's now replaced by drm_sched_job_unschedulable,
which is only traced if the job cannot be scheduled yet. For
dependencies that are already signaled, nothing is traced.
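
For reference, the renames are:

  drm_sched_job          -> drm_sched_job_queue
  drm_run_job            -> drm_sched_job_run
  drm_sched_process_job  -> drm_sched_job_done
  drm_sched_job_wait_dep -> drm_sched_job_unschedulable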

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler_trace.h | 8 ++++----
 drivers/gpu/drm/scheduler/sched_entity.c        | 8 ++++----
 drivers/gpu/drm/scheduler/sched_main.c          | 4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 38cdd659a286..4ce53e493fef 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -63,17 +63,17 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 		      __entry->job_count, __entry->hw_job_count, __entry->client_id)
 );
 
-DEFINE_EVENT(drm_sched_job, drm_sched_job,
+DEFINE_EVENT(drm_sched_job, drm_sched_job_queue,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity)
 );
 
-DEFINE_EVENT(drm_sched_job, drm_run_job,
+DEFINE_EVENT(drm_sched_job, drm_sched_job_run,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity)
 );
 
-TRACE_EVENT(drm_sched_process_job,
+TRACE_EVENT(drm_sched_job_done,
 	    TP_PROTO(struct drm_sched_fence *fence),
 	    TP_ARGS(fence),
 	    TP_STRUCT__entry(
@@ -112,7 +112,7 @@ TRACE_EVENT(drm_sched_job_add_dep,
 		  __entry->ctx, __entry->seqno)
 );
 
-TRACE_EVENT(drm_sched_job_wait_dep,
+TRACE_EVENT(drm_sched_job_unschedulable,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct dma_fence *fence),
 	    TP_ARGS(sched_job, fence),
 	    TP_STRUCT__entry(
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index be579e132711..59162cb81c4e 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -477,10 +477,10 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 
 	while ((entity->dependency =
 			drm_sched_job_dependency(sched_job, entity))) {
-		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
-
-		if (drm_sched_entity_add_dependency_cb(entity))
+		if (drm_sched_entity_add_dependency_cb(entity)) {
+			trace_drm_sched_job_unschedulable(sched_job, entity->dependency);
 			return NULL;
+		}
 	}
 
 	/* skip jobs from entity that marked guilty */
@@ -586,7 +586,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 	bool first;
 	ktime_t submit_ts;
 
-	trace_drm_sched_job(sched_job, entity);
+	trace_drm_sched_job_queue(sched_job, entity);
 
 	if (trace_drm_sched_job_add_dep_enabled()) {
 		struct dma_fence *entry;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 60611618f3ab..195b5f891068 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -401,7 +401,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
 	atomic_sub(s_job->credits, &sched->credit_count);
 	atomic_dec(sched->score);
 
-	trace_drm_sched_process_job(s_fence);
+	trace_drm_sched_job_done(s_fence);
 
 	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence, result);
@@ -1231,7 +1231,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
 	atomic_add(sched_job->credits, &sched->credit_count);
 	drm_sched_job_begin(sched_job);
 
-	trace_drm_run_job(sched_job, entity);
+	trace_drm_sched_job_run(sched_job, entity);
 	/*
 	 * The run_job() callback must by definition return a fence whose
 	 * refcount has been incremented for the scheduler already.
-- 
2.43.0



* [PATCH v9 08/10] drm: get rid of drm_sched_job::id
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (6 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 07/10] drm/sched: cleanup event names Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-25  5:26   ` Yadav, Arvind
  2025-05-14 12:50   ` Philipp Stanner
  2025-04-24  8:38 ` [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI Pierre-Eric Pelloux-Prayer
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC
  To: Alex Deucher, Christian König, David Airlie, Simona Vetter,
	Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: Pierre-Eric Pelloux-Prayer, Tvrtko Ursulin, Christian König,
	amd-gfx, dri-devel, linux-kernel

Its only purpose was for trace events, but jobs can already be
uniquely identified using their fence.

The downside of using the fence is that it's only available after
drm_sched_job_arm() has been called. This is true for all trace
events that used job.id, so they can safely switch to using the
fence.
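
A tool that previously keyed its bookkeeping on job.id can key it on
the fence identity instead, e.g. (userspace sketch, names are
illustrative):

	/* built from the fence=%llu:%llu field of the trace events */
	struct job_key {
		uint64_t fence_context;
		uint64_t fence_seqno;
	};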

Suggested-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h      | 18 ++++++------------
 .../gpu/drm/scheduler/gpu_scheduler_trace.h    | 18 ++++++------------
 drivers/gpu/drm/scheduler/sched_main.c         |  1 -
 include/drm/gpu_scheduler.h                    |  3 ---
 4 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 11dd2e0f7979..4fd810cb5387 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -167,7 +167,6 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 	    TP_PROTO(struct amdgpu_job *job),
 	    TP_ARGS(job),
 	    TP_STRUCT__entry(
-			     __field(uint64_t, sched_job_id)
 			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
 			     __field(unsigned int, context)
 			     __field(unsigned int, seqno)
@@ -177,15 +176,14 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 			     ),
 
 	    TP_fast_assign(
-			   __entry->sched_job_id = job->base.id;
 			   __assign_str(timeline);
 			   __entry->context = job->base.s_fence->finished.context;
 			   __entry->seqno = job->base.s_fence->finished.seqno;
 			   __assign_str(ring);
 			   __entry->num_ibs = job->num_ibs;
 			   ),
-	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
-		      __entry->sched_job_id, __get_str(timeline), __entry->context,
+	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
+		      __get_str(timeline), __entry->context,
 		      __entry->seqno, __get_str(ring), __entry->num_ibs)
 );
 
@@ -193,7 +191,6 @@ TRACE_EVENT(amdgpu_sched_run_job,
 	    TP_PROTO(struct amdgpu_job *job),
 	    TP_ARGS(job),
 	    TP_STRUCT__entry(
-			     __field(uint64_t, sched_job_id)
 			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
 			     __field(unsigned int, context)
 			     __field(unsigned int, seqno)
@@ -202,15 +199,14 @@ TRACE_EVENT(amdgpu_sched_run_job,
 			     ),
 
 	    TP_fast_assign(
-			   __entry->sched_job_id = job->base.id;
 			   __assign_str(timeline);
 			   __entry->context = job->base.s_fence->finished.context;
 			   __entry->seqno = job->base.s_fence->finished.seqno;
 			   __assign_str(ring);
 			   __entry->num_ibs = job->num_ibs;
 			   ),
-	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
-		      __entry->sched_job_id, __get_str(timeline), __entry->context,
+	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
+		      __get_str(timeline), __entry->context,
 		      __entry->seqno, __get_str(ring), __entry->num_ibs)
 );
 
@@ -551,7 +547,6 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 	    TP_ARGS(sched_job, fence),
 	    TP_STRUCT__entry(
 			     __string(ring, sched_job->base.sched->name)
-			     __field(uint64_t, id)
 			     __field(struct dma_fence *, fence)
 			     __field(uint64_t, ctx)
 			     __field(unsigned, seqno)
@@ -559,13 +554,12 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 
 	    TP_fast_assign(
 			   __assign_str(ring);
-			   __entry->id = sched_job->base.id;
 			   __entry->fence = fence;
 			   __entry->ctx = fence->context;
 			   __entry->seqno = fence->seqno;
 			   ),
-	    TP_printk("job ring=%s, id=%llu, need pipe sync to fence=%p, context=%llu, seq=%u",
-		      __get_str(ring), __entry->id,
+	    TP_printk("job ring=%s need pipe sync to fence=%p, context=%llu, seq=%u",
+		      __get_str(ring),
 		      __entry->fence, __entry->ctx,
 		      __entry->seqno)
 );
diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 4ce53e493fef..781b20349389 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -36,7 +36,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity),
 	    TP_STRUCT__entry(
-			     __field(uint64_t, id)
 			     __string(name, sched_job->sched->name)
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
@@ -47,7 +46,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			     ),
 
 	    TP_fast_assign(
-			   __entry->id = sched_job->id;
 			   __assign_str(name);
 			   __entry->job_count = spsc_queue_count(&entity->job_queue);
 			   __entry->hw_job_count = atomic_read(
@@ -57,8 +55,8 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
 			   __entry->client_id = sched_job->s_fence->drm_client_id;
 			   ),
-	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d, client_id:%llu",
-		      __get_str(dev), __entry->id,
+	    TP_printk("dev=%s, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d, client_id:%llu",
+		      __get_str(dev),
 		      __entry->fence_context, __entry->fence_seqno, __get_str(name),
 		      __entry->job_count, __entry->hw_job_count, __entry->client_id)
 );
@@ -95,7 +93,6 @@ TRACE_EVENT(drm_sched_job_add_dep,
 	TP_STRUCT__entry(
 		    __field(u64, fence_context)
 		    __field(u64, fence_seqno)
-		    __field(u64, id)
 		    __field(u64, ctx)
 		    __field(u64, seqno)
 		    ),
@@ -103,12 +100,11 @@ TRACE_EVENT(drm_sched_job_add_dep,
 	TP_fast_assign(
 		    __entry->fence_context = sched_job->s_fence->finished.context;
 		    __entry->fence_seqno = sched_job->s_fence->finished.seqno;
-		    __entry->id = sched_job->id;
 		    __entry->ctx = fence->context;
 		    __entry->seqno = fence->seqno;
 		    ),
-	TP_printk("fence=%llu:%llu, id=%llu depends on fence=%llu:%llu",
-		  __entry->fence_context, __entry->fence_seqno, __entry->id,
+	TP_printk("fence=%llu:%llu depends on fence=%llu:%llu",
+		  __entry->fence_context, __entry->fence_seqno,
 		  __entry->ctx, __entry->seqno)
 );
 
@@ -118,7 +114,6 @@ TRACE_EVENT(drm_sched_job_unschedulable,
 	    TP_STRUCT__entry(
 			     __field(u64, fence_context)
 			     __field(u64, fence_seqno)
-			     __field(uint64_t, id)
 			     __field(u64, ctx)
 			     __field(u64, seqno)
 			     ),
@@ -126,12 +121,11 @@ TRACE_EVENT(drm_sched_job_unschedulable,
 	    TP_fast_assign(
 			   __entry->fence_context = sched_job->s_fence->finished.context;
 			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
-			   __entry->id = sched_job->id;
 			   __entry->ctx = fence->context;
 			   __entry->seqno = fence->seqno;
 			   ),
-	    TP_printk("fence=%llu:%llu, id=%llu depends on unsignalled fence=%llu:%llu",
-		      __entry->fence_context, __entry->fence_seqno, __entry->id,
+	    TP_printk("fence=%llu:%llu depends on unsignalled fence=%llu:%llu",
+		      __entry->fence_context, __entry->fence_seqno,
 		      __entry->ctx, __entry->seqno)
 );
 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 195b5f891068..dafda1803c7c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -852,7 +852,6 @@ void drm_sched_job_arm(struct drm_sched_job *job)
 
 	job->sched = sched;
 	job->s_priority = entity->priority;
-	job->id = atomic64_inc_return(&sched->job_id_count);
 
 	drm_sched_fence_init(job->s_fence, job->entity);
 }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 6fe3b4c0cffb..48190fdf661a 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -326,7 +326,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  * @finish_cb: the callback for the finished fence.
  * @credits: the number of credits this job contributes to the scheduler
  * @work: Helper to reschedule job kill to different context.
- * @id: a unique id assigned to each job scheduled on the scheduler.
  * @karma: increment on every hang caused by this job. If this exceeds the hang
  *         limit of the scheduler then the job is marked guilty and will not
  *         be scheduled further.
@@ -339,8 +338,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  * to schedule the job.
  */
 struct drm_sched_job {
-	u64				id;
-
 	/**
 	 * @submit_ts:
 	 *
-- 
2.43.0
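
With job.id gone, and made-up values, an event based on the
drm_sched_job class above prints e.g.

	dev=0000:03:00.0, fence=123:45, ring=gfx, job count:0, hw job count:1, client_id:42

and the fence pair 123:45 alone is enough to follow a job across
events.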


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (7 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 08/10] drm: get rid of drm_sched_job::id Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-05-14 12:53   ` Philipp Stanner
  2025-04-24  8:38 ` [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace Pierre-Eric Pelloux-Prayer
  2025-05-14 12:25 ` [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
  10 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC (permalink / raw)
  To: David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Jonathan Corbet, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Christian König,
	Sumit Semwal
  Cc: Pierre-Eric Pelloux-Prayer, Lucas Stach, Maíra Canal,
	Christian König, dri-devel, linux-doc, linux-kernel,
	linux-media, linaro-mm-sig

This commit adds a documentation section in drm-uapi.rst about
tracepoints, and marks the events in gpu_scheduler_trace.h as stable uAPI.

The goal is to explicitly state that tools can rely on the fields,
formats and semantics of these events.
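
For illustration (not part of this patch), here is a minimal sketch of
a userspace consumer relying on the stable "fence=%llu:%llu" field. It
assumes tracefs is mounted at /sys/kernel/tracing and that the
gpu_scheduler events are enabled:

	#include <stdio.h>
	#include <string.h>
	#include <inttypes.h>

	int main(void)
	{
		/* trace_pipe streams formatted event lines as they fire */
		FILE *f = fopen("/sys/kernel/tracing/trace_pipe", "r");
		char line[1024];
		uint64_t ctx, seqno;
		char *p;

		if (!f)
			return 1;

		while (fgets(line, sizeof(line), f)) {
			/* the finished fence pair uniquely identifies a job */
			p = strstr(line, "fence=");
			if (p && sscanf(p, "fence=%" SCNu64 ":%" SCNu64,
					&ctx, &seqno) == 2)
				printf("job %" PRIu64 ":%" PRIu64 "\n",
				       ctx, seqno);
		}

		fclose(f);
		return 0;
	}

Because the fields and formats are stable, such a parser keeps working
across kernel releases.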

Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 Documentation/gpu/drm-uapi.rst                | 19 +++++++++++++++++++
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 19 +++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 69f72e71a96e..4863a4deb0ee 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -693,3 +693,22 @@ dma-buf interoperability
 
 Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst for
 information on how dma-buf is integrated and exposed within DRM.
+
+
+Trace events
+============
+
+See Documentation/trace/tracepoints.rst for information about using
+Linux Kernel Tracepoints.
+In the DRM subsystem, some events are considered stable uAPI to avoid
+breaking tools (e.g.: GPUVis, umr) relying on them. Stable means that fields
+cannot be removed, nor their formatting updated. Adding new fields is
+possible, under the normal uAPI requirements.
+
+Stable uAPI events
+------------------
+
+From ``drivers/gpu/drm/scheduler/gpu_scheduler_trace.h``
+
+.. kernel-doc::  drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+   :doc: uAPI trace events
\ No newline at end of file
diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 781b20349389..7e840d08ef39 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -32,6 +32,25 @@
 #define TRACE_SYSTEM gpu_scheduler
 #define TRACE_INCLUDE_FILE gpu_scheduler_trace
 
+/**
+ * DOC: uAPI trace events
+ *
+ * ``drm_sched_job_queue``, ``drm_sched_job_run``, ``drm_sched_job_add_dep``,
+ * ``drm_sched_job_done`` and ``drm_sched_job_unschedulable`` are considered
+ * stable uAPI.
+ *
+ * Common trace events attributes:
+ *
+ * * ``dev``   - the dev_name() of the device running the job.
+ *
+ * * ``ring``  - the hardware ring running the job. Together with ``dev`` it
+ *   uniquely identifies where the job is going to be executed.
+ *
+ * * ``fence`` - the &dma_fence.context and the &dma_fence.seqno of
+ *   &drm_sched_fence.finished
+ *
+ */
+
 DECLARE_EVENT_CLASS(drm_sched_job,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity),
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (8 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI Pierre-Eric Pelloux-Prayer
@ 2025-04-24  8:38 ` Pierre-Eric Pelloux-Prayer
  2025-04-25  5:31   ` Yadav, Arvind
  2025-05-14 12:25 ` [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
  10 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-04-24  8:38 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, amd-gfx, dri-devel, linux-kernel

Log fences using the same format for consistency.
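
With made-up values, an amdgpu_cs_ioctl line changes from

	timeline=gfx, context=123, seqno=45, ring_name=gfx, num_ibs=1

to

	timeline=gfx, fence=123:45, ring_name=gfx, num_ibs=1

matching the fence=%llu:%llu format used by the gpu_scheduler events.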

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 4fd810cb5387..d13e64a69e25 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -168,8 +168,8 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 	    TP_ARGS(job),
 	    TP_STRUCT__entry(
 			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
-			     __field(unsigned int, context)
-			     __field(unsigned int, seqno)
+			     __field(u64, context)
+			     __field(u64, seqno)
 			     __field(struct dma_fence *, fence)
 			     __string(ring, to_amdgpu_ring(job->base.sched)->name)
 			     __field(u32, num_ibs)
@@ -182,7 +182,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 			   __assign_str(ring);
 			   __entry->num_ibs = job->num_ibs;
 			   ),
-	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
+	    TP_printk("timeline=%s, fence=%llu:%llu, ring_name=%s, num_ibs=%u",
 		      __get_str(timeline), __entry->context,
 		      __entry->seqno, __get_str(ring), __entry->num_ibs)
 );
@@ -192,8 +192,8 @@ TRACE_EVENT(amdgpu_sched_run_job,
 	    TP_ARGS(job),
 	    TP_STRUCT__entry(
 			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
-			     __field(unsigned int, context)
-			     __field(unsigned int, seqno)
+			     __field(u64, context)
+			     __field(u64, seqno)
 			     __string(ring, to_amdgpu_ring(job->base.sched)->name)
 			     __field(u32, num_ibs)
 			     ),
@@ -205,7 +205,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
 			   __assign_str(ring);
 			   __entry->num_ibs = job->num_ibs;
 			   ),
-	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
+	    TP_printk("timeline=%s, fence=%llu:%llu, ring_name=%s, num_ibs=%u",
 		      __get_str(timeline), __entry->context,
 		      __entry->seqno, __get_str(ring), __entry->num_ibs)
 );
@@ -548,8 +548,8 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 	    TP_STRUCT__entry(
 			     __string(ring, sched_job->base.sched->name)
 			     __field(struct dma_fence *, fence)
-			     __field(uint64_t, ctx)
-			     __field(unsigned, seqno)
+			     __field(u64, ctx)
+			     __field(u64, seqno)
 			     ),
 
 	    TP_fast_assign(
@@ -558,10 +558,8 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 			   __entry->ctx = fence->context;
 			   __entry->seqno = fence->seqno;
 			   ),
-	    TP_printk("job ring=%s need pipe sync to fence=%p, context=%llu, seq=%u",
-		      __get_str(ring),
-		      __entry->fence, __entry->ctx,
-		      __entry->seqno)
+	    TP_printk("job ring=%s need pipe sync to fence=%llu:%llu",
+		      __get_str(ring), __entry->ctx, __entry->seqno)
 );
 
 TRACE_EVENT(amdgpu_reset_reg_dumps,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 08/10] drm: get rid of drm_sched_job::id
  2025-04-24  8:38 ` [PATCH v9 08/10] drm: get rid of drm_sched_job::id Pierre-Eric Pelloux-Prayer
@ 2025-04-25  5:26   ` Yadav, Arvind
  2025-05-14 12:50   ` Philipp Stanner
  1 sibling, 0 replies; 24+ messages in thread
From: Yadav, Arvind @ 2025-04-25  5:26 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Alex Deucher, Christian König,
	David Airlie, Simona Vetter, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann
  Cc: Tvrtko Ursulin, Christian König, amd-gfx, dri-devel,
	linux-kernel

Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>

On 4/24/2025 2:08 PM, Pierre-Eric Pelloux-Prayer wrote:
> Its only purpose was for trace events, but jobs can already be
> uniquely identified using their fence.
>
> The downside of using the fence is that it only becomes available
> once 'drm_sched_job_arm' has been called. This is true for all trace
> events that used job.id, so they can safely switch to the fence.
>
> Suggested-by: Tvrtko Ursulin <tursulin@igalia.com>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h      | 18 ++++++------------
>   .../gpu/drm/scheduler/gpu_scheduler_trace.h    | 18 ++++++------------
>   drivers/gpu/drm/scheduler/sched_main.c         |  1 -
>   include/drm/gpu_scheduler.h                    |  3 ---
>   4 files changed, 12 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> index 11dd2e0f7979..4fd810cb5387 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> @@ -167,7 +167,6 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>   	    TP_PROTO(struct amdgpu_job *job),
>   	    TP_ARGS(job),
>   	    TP_STRUCT__entry(
> -			     __field(uint64_t, sched_job_id)
>   			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
>   			     __field(unsigned int, context)
>   			     __field(unsigned int, seqno)
> @@ -177,15 +176,14 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>   			     ),
>   
>   	    TP_fast_assign(
> -			   __entry->sched_job_id = job->base.id;
>   			   __assign_str(timeline);
>   			   __entry->context = job->base.s_fence->finished.context;
>   			   __entry->seqno = job->base.s_fence->finished.seqno;
>   			   __assign_str(ring);
>   			   __entry->num_ibs = job->num_ibs;
>   			   ),
> -	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> -		      __entry->sched_job_id, __get_str(timeline), __entry->context,
> +	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> +		      __get_str(timeline), __entry->context,
>   		      __entry->seqno, __get_str(ring), __entry->num_ibs)
>   );
>   
> @@ -193,7 +191,6 @@ TRACE_EVENT(amdgpu_sched_run_job,
>   	    TP_PROTO(struct amdgpu_job *job),
>   	    TP_ARGS(job),
>   	    TP_STRUCT__entry(
> -			     __field(uint64_t, sched_job_id)
>   			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
>   			     __field(unsigned int, context)
>   			     __field(unsigned int, seqno)
> @@ -202,15 +199,14 @@ TRACE_EVENT(amdgpu_sched_run_job,
>   			     ),
>   
>   	    TP_fast_assign(
> -			   __entry->sched_job_id = job->base.id;
>   			   __assign_str(timeline);
>   			   __entry->context = job->base.s_fence->finished.context;
>   			   __entry->seqno = job->base.s_fence->finished.seqno;
>   			   __assign_str(ring);
>   			   __entry->num_ibs = job->num_ibs;
>   			   ),
> -	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> -		      __entry->sched_job_id, __get_str(timeline), __entry->context,
> +	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> +		      __get_str(timeline), __entry->context,
>   		      __entry->seqno, __get_str(ring), __entry->num_ibs)
>   );
>   
> @@ -551,7 +547,6 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>   	    TP_ARGS(sched_job, fence),
>   	    TP_STRUCT__entry(
>   			     __string(ring, sched_job->base.sched->name)
> -			     __field(uint64_t, id)
>   			     __field(struct dma_fence *, fence)
>   			     __field(uint64_t, ctx)
>   			     __field(unsigned, seqno)
> @@ -559,13 +554,12 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>   
>   	    TP_fast_assign(
>   			   __assign_str(ring);
> -			   __entry->id = sched_job->base.id;
>   			   __entry->fence = fence;
>   			   __entry->ctx = fence->context;
>   			   __entry->seqno = fence->seqno;
>   			   ),
> -	    TP_printk("job ring=%s, id=%llu, need pipe sync to fence=%p, context=%llu, seq=%u",
> -		      __get_str(ring), __entry->id,
> +	    TP_printk("job ring=%s need pipe sync to fence=%p, context=%llu, seq=%u",
> +		      __get_str(ring),
>   		      __entry->fence, __entry->ctx,
>   		      __entry->seqno)
>   );
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index 4ce53e493fef..781b20349389 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -36,7 +36,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>   	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
>   	    TP_ARGS(sched_job, entity),
>   	    TP_STRUCT__entry(
> -			     __field(uint64_t, id)
>   			     __string(name, sched_job->sched->name)
>   			     __field(u32, job_count)
>   			     __field(int, hw_job_count)
> @@ -47,7 +46,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>   			     ),
>   
>   	    TP_fast_assign(
> -			   __entry->id = sched_job->id;
>   			   __assign_str(name);
>   			   __entry->job_count = spsc_queue_count(&entity->job_queue);
>   			   __entry->hw_job_count = atomic_read(
> @@ -57,8 +55,8 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>   			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
>   			   __entry->client_id = sched_job->s_fence->drm_client_id;
>   			   ),
> -	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d, client_id:%llu",
> -		      __get_str(dev), __entry->id,
> +	    TP_printk("dev=%s, fence=%llu:%llu, ring=%s, job count:%u, hw job count:%d, client_id:%llu",
> +		      __get_str(dev),
>   		      __entry->fence_context, __entry->fence_seqno, __get_str(name),
>   		      __entry->job_count, __entry->hw_job_count, __entry->client_id)
>   );
> @@ -95,7 +93,6 @@ TRACE_EVENT(drm_sched_job_add_dep,
>   	TP_STRUCT__entry(
>   		    __field(u64, fence_context)
>   		    __field(u64, fence_seqno)
> -		    __field(u64, id)
>   		    __field(u64, ctx)
>   		    __field(u64, seqno)
>   		    ),
> @@ -103,12 +100,11 @@ TRACE_EVENT(drm_sched_job_add_dep,
>   	TP_fast_assign(
>   		    __entry->fence_context = sched_job->s_fence->finished.context;
>   		    __entry->fence_seqno = sched_job->s_fence->finished.seqno;
> -		    __entry->id = sched_job->id;
>   		    __entry->ctx = fence->context;
>   		    __entry->seqno = fence->seqno;
>   		    ),
> -	TP_printk("fence=%llu:%llu, id=%llu depends on fence=%llu:%llu",
> -		  __entry->fence_context, __entry->fence_seqno, __entry->id,
> +	TP_printk("fence=%llu:%llu depends on fence=%llu:%llu",
> +		  __entry->fence_context, __entry->fence_seqno,
>   		  __entry->ctx, __entry->seqno)
>   );
>   
> @@ -118,7 +114,6 @@ TRACE_EVENT(drm_sched_job_unschedulable,
>   	    TP_STRUCT__entry(
>   			     __field(u64, fence_context)
>   			     __field(u64, fence_seqno)
> -			     __field(uint64_t, id)
>   			     __field(u64, ctx)
>   			     __field(u64, seqno)
>   			     ),
> @@ -126,12 +121,11 @@ TRACE_EVENT(drm_sched_job_unschedulable,
>   	    TP_fast_assign(
>   			   __entry->fence_context = sched_job->s_fence->finished.context;
>   			   __entry->fence_seqno = sched_job->s_fence->finished.seqno;
> -			   __entry->id = sched_job->id;
>   			   __entry->ctx = fence->context;
>   			   __entry->seqno = fence->seqno;
>   			   ),
> -	    TP_printk("fence=%llu:%llu, id=%llu depends on unsignalled fence=%llu:%llu",
> -		      __entry->fence_context, __entry->fence_seqno, __entry->id,
> +	    TP_printk("fence=%llu:%llu depends on unsignalled fence=%llu:%llu",
> +		      __entry->fence_context, __entry->fence_seqno,
>   		      __entry->ctx, __entry->seqno)
>   );
>   
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 195b5f891068..dafda1803c7c 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -852,7 +852,6 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>   
>   	job->sched = sched;
>   	job->s_priority = entity->priority;
> -	job->id = atomic64_inc_return(&sched->job_id_count);
>   
>   	drm_sched_fence_init(job->s_fence, job->entity);
>   }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 6fe3b4c0cffb..48190fdf661a 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -326,7 +326,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    * @finish_cb: the callback for the finished fence.
>    * @credits: the number of credits this job contributes to the scheduler
>    * @work: Helper to reschedule job kill to different context.
> - * @id: a unique id assigned to each job scheduled on the scheduler.
>    * @karma: increment on every hang caused by this job. If this exceeds the hang
>    *         limit of the scheduler then the job is marked guilty and will not
>    *         be scheduled further.
> @@ -339,8 +338,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    * to schedule the job.
>    */
>   struct drm_sched_job {
> -	u64				id;
> -
>   	/**
>   	 * @submit_ts:
>   	 *

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace
  2025-04-24  8:38 ` [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace Pierre-Eric Pelloux-Prayer
@ 2025-04-25  5:31   ` Yadav, Arvind
  0 siblings, 0 replies; 24+ messages in thread
From: Yadav, Arvind @ 2025-04-25  5:31 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Alex Deucher, Christian König,
	David Airlie, Simona Vetter
  Cc: amd-gfx, dri-devel, linux-kernel

Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>

On 4/24/2025 2:08 PM, Pierre-Eric Pelloux-Prayer wrote:
> Log fences using the same format for consistency.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 22 ++++++++++------------
>   1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> index 4fd810cb5387..d13e64a69e25 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> @@ -168,8 +168,8 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>   	    TP_ARGS(job),
>   	    TP_STRUCT__entry(
>   			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
> -			     __field(unsigned int, context)
> -			     __field(unsigned int, seqno)
> +			     __field(u64, context)
> +			     __field(u64, seqno)
>   			     __field(struct dma_fence *, fence)
>   			     __string(ring, to_amdgpu_ring(job->base.sched)->name)
>   			     __field(u32, num_ibs)
> @@ -182,7 +182,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>   			   __assign_str(ring);
>   			   __entry->num_ibs = job->num_ibs;
>   			   ),
> -	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> +	    TP_printk("timeline=%s, fence=%llu:%llu, ring_name=%s, num_ibs=%u",
>   		      __get_str(timeline), __entry->context,
>   		      __entry->seqno, __get_str(ring), __entry->num_ibs)
>   );
> @@ -192,8 +192,8 @@ TRACE_EVENT(amdgpu_sched_run_job,
>   	    TP_ARGS(job),
>   	    TP_STRUCT__entry(
>   			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
> -			     __field(unsigned int, context)
> -			     __field(unsigned int, seqno)
> +			     __field(u64, context)
> +			     __field(u64, seqno)
>   			     __string(ring, to_amdgpu_ring(job->base.sched)->name)
>   			     __field(u32, num_ibs)
>   			     ),
> @@ -205,7 +205,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
>   			   __assign_str(ring);
>   			   __entry->num_ibs = job->num_ibs;
>   			   ),
> -	    TP_printk("timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
> +	    TP_printk("timeline=%s, fence=%llu:%llu, ring_name=%s, num_ibs=%u",
>   		      __get_str(timeline), __entry->context,
>   		      __entry->seqno, __get_str(ring), __entry->num_ibs)
>   );
> @@ -548,8 +548,8 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>   	    TP_STRUCT__entry(
>   			     __string(ring, sched_job->base.sched->name)
>   			     __field(struct dma_fence *, fence)
> -			     __field(uint64_t, ctx)
> -			     __field(unsigned, seqno)
> +			     __field(u64, ctx)
> +			     __field(u64, seqno)
>   			     ),
>   
>   	    TP_fast_assign(
> @@ -558,10 +558,8 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>   			   __entry->ctx = fence->context;
>   			   __entry->seqno = fence->seqno;
>   			   ),
> -	    TP_printk("job ring=%s need pipe sync to fence=%p, context=%llu, seq=%u",
> -		      __get_str(ring),
> -		      __entry->fence, __entry->ctx,
> -		      __entry->seqno)
> +	    TP_printk("job ring=%s need pipe sync to fence=%llu:%llu",
> +		      __get_str(ring), __entry->ctx, __entry->seqno)
>   );
>   
>   TRACE_EVENT(amdgpu_reset_reg_dumps,

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI
  2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
                   ` (9 preceding siblings ...)
  2025-04-24  8:38 ` [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace Pierre-Eric Pelloux-Prayer
@ 2025-05-14 12:25 ` Pierre-Eric Pelloux-Prayer
  10 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-05-14 12:25 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Philipp Stanner
  Cc: Christian König, Maíra Canal, Thomas Hellström,
	Abhinav Kumar, Alex Deucher, Boris Brezillon, Danilo Krummrich,
	David Airlie, Dmitry Baryshkov, Felix Kuehling, Frank Binns,
	Jonathan Corbet, Liviu Dudau, Lizhi Hou, Lucas De Marchi,
	Lucas Stach, Lyude Paul, Maarten Lankhorst, Matt Coster,
	Matthew Brost, Maxime Ripard, Melissa Wen, Min Ma, Oded Gabbay,
	Philipp Stanner, Qiang Yu, Rob Clark, Rob Herring, Rodrigo Vivi,
	Simona Vetter, Steven Price, Sumit Semwal, Thomas Zimmermann,
	amd-gfx, dri-devel, etnaviv, freedreno, intel-xe, lima,
	linaro-mm-sig, linux-arm-msm, linux-doc, linux-kernel,
	linux-media, nouveau

Hi Philipp,

Did you get a chance to take a look at the latest revision of this series?

Thanks,
Pierre-Eric

On 24/04/2025 at 10:38, Pierre-Eric Pelloux-Prayer wrote:
> Hi,
> 
> The initial goal of this series was to improve the drm and amdgpu
> trace events to be able to expose more of the inner workings of
> the scheduler and drivers to developers via tools.
> 
> Then, the series evolved to become focused only on gpu_scheduler.
> The changes around vblank events will be part of a different
> series, as well as the amdgpu ones.
> 
> Moreover Sima suggested to make some trace events stable uAPI,
> so tools can rely on them long term.
> 
> The first patches extend and cleanup the gpu scheduler events,
> then add a documentation entry in drm-uapi.rst.
> 
> The last 2 patches are new in v8. One is based on a suggestion
> from Tvrtko and gets rid of drm_sched_job::id. The other is a
> cleanup of amdgpu trace events to use the fence=%llu:%llu format.
> 
> The drm_sched_job patches don't affect gpuvis which has code to parse
> the gpu_scheduler events but these events are not enabled.
> 
> Changes since v8:
> * swapped patches 8 & 9
> * rebased on drm-next
> 
> Changes since v7:
> * uint64_t -> u64
> * reworked dependencies tracing (Tvrtko)
> * use common name prefix for all events (Tvrtko)
> * dropped drm_sched_job::id (Tvrtko)
> 
> Useful links:
> - userspace tool using the updated events:
> https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
> - v8:
> https://lists.freedesktop.org/archives/dri-devel/2025-March/496781.html
> 
> Pierre-Eric Pelloux-Prayer (10):
>    drm/debugfs: output client_id in in drm_clients_info
>    drm/sched: store the drm client_id in drm_sched_fence
>    drm/sched: add device name to the drm_sched_process_job event
>    drm/sched: cleanup gpu_scheduler trace events
>    drm/sched: trace dependencies for gpu jobs
>    drm/sched: add the drm_client_id to the drm_sched_run/exec_job events
>    drm/sched: cleanup event names
>    drm: get rid of drm_sched_job::id
>    drm/doc: document some tracepoints as uAPI
>    drm/amdgpu: update trace format to match gpu_scheduler_trace
> 
>   Documentation/gpu/drm-uapi.rst                |  19 ++++
>   drivers/accel/amdxdna/aie2_ctx.c              |   3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h       |   3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h     |  32 +++---
>   drivers/gpu/drm/drm_debugfs.c                 |  10 +-
>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
>   drivers/gpu/drm/imagination/pvr_job.c         |   2 +-
>   drivers/gpu/drm/imagination/pvr_queue.c       |   5 +-
>   drivers/gpu/drm/imagination/pvr_queue.h       |   2 +-
>   drivers/gpu/drm/lima/lima_gem.c               |   2 +-
>   drivers/gpu/drm/lima/lima_sched.c             |   6 +-
>   drivers/gpu/drm/lima/lima_sched.h             |   3 +-
>   drivers/gpu/drm/msm/msm_gem_submit.c          |   8 +-
>   drivers/gpu/drm/nouveau/nouveau_sched.c       |   3 +-
>   drivers/gpu/drm/panfrost/panfrost_drv.c       |   2 +-
>   drivers/gpu/drm/panthor/panthor_drv.c         |   3 +-
>   drivers/gpu/drm/panthor/panthor_mmu.c         |   2 +-
>   drivers/gpu/drm/panthor/panthor_sched.c       |   5 +-
>   drivers/gpu/drm/panthor/panthor_sched.h       |   3 +-
>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 100 +++++++++++++-----
>   drivers/gpu/drm/scheduler/sched_entity.c      |  16 ++-
>   drivers/gpu/drm/scheduler/sched_fence.c       |   4 +-
>   drivers/gpu/drm/scheduler/sched_internal.h    |   2 +-
>   drivers/gpu/drm/scheduler/sched_main.c        |  11 +-
>   .../gpu/drm/scheduler/tests/mock_scheduler.c  |   2 +-
>   drivers/gpu/drm/v3d/v3d_submit.c              |   2 +-
>   drivers/gpu/drm/xe/xe_sched_job.c             |   3 +-
>   include/drm/gpu_scheduler.h                   |  13 ++-
>   31 files changed, 184 insertions(+), 97 deletions(-)
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence
  2025-04-24  8:38 ` [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence Pierre-Eric Pelloux-Prayer
@ 2025-05-14 12:44   ` Philipp Stanner
  2025-05-15  6:53     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 1 reply; 24+ messages in thread
From: Philipp Stanner @ 2025-05-14 12:44 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Min Ma, Lizhi Hou, Oded Gabbay,
	Felix Kuehling, Alex Deucher, Christian König, David Airlie,
	Simona Vetter, Lucas Stach, Russell King, Christian Gmeiner,
	Frank Binns, Matt Coster, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Qiang Yu, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, Marijn Suijten, Lyude Paul,
	Danilo Krummrich, Boris Brezillon, Rob Herring, Steven Price,
	Liviu Dudau, Matthew Brost, Philipp Stanner, Melissa Wen,
	Maíra Canal, Lucas De Marchi, Thomas Hellström,
	Rodrigo Vivi
  Cc: Christian König, dri-devel, linux-kernel, amd-gfx, etnaviv,
	lima, linux-arm-msm, freedreno, nouveau, intel-xe

On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
> This will be used in a later commit to trace the drm client_id in
> some of the gpu_scheduler trace events.
> 
> This requires changing all the users of drm_sched_job_init to
> add an extra parameter.
> 
> The newly added drm_client_id field in the drm_sched_fence is a bit
> of a duplicate of the owner one. One suggestion I received was to
> merge those 2 fields - this can't be done right now as amdgpu uses
> some special values (AMDGPU_FENCE_OWNER_*) that can't really be
> translated into a client id. Christian is working on getting rid of
> those; when it's done we should be able to squash owner/drm_client_id
> together.
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: Pierre-Eric Pelloux-Prayer
> <pierre-eric.pelloux-prayer@amd.com>
> ---
>  drivers/accel/amdxdna/aie2_ctx.c                 |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c       |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c          |  8 +++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h          |  3 ++-
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c     |  2 +-
>  drivers/gpu/drm/imagination/pvr_job.c            |  2 +-
>  drivers/gpu/drm/imagination/pvr_queue.c          |  5 +++--
>  drivers/gpu/drm/imagination/pvr_queue.h          |  2 +-
>  drivers/gpu/drm/lima/lima_gem.c                  |  2 +-
>  drivers/gpu/drm/lima/lima_sched.c                |  6 ++++--
>  drivers/gpu/drm/lima/lima_sched.h                |  3 ++-
>  drivers/gpu/drm/msm/msm_gem_submit.c             |  8 +++++---
>  drivers/gpu/drm/nouveau/nouveau_sched.c          |  3 ++-
>  drivers/gpu/drm/panfrost/panfrost_drv.c          |  2 +-
>  drivers/gpu/drm/panthor/panthor_drv.c            |  3 ++-
>  drivers/gpu/drm/panthor/panthor_mmu.c            |  2 +-
>  drivers/gpu/drm/panthor/panthor_sched.c          |  5 +++--
>  drivers/gpu/drm/panthor/panthor_sched.h          |  3 ++-
>  drivers/gpu/drm/scheduler/sched_fence.c          |  4 +++-
>  drivers/gpu/drm/scheduler/sched_internal.h       |  2 +-
>  drivers/gpu/drm/scheduler/sched_main.c           |  6 ++++--
>  drivers/gpu/drm/scheduler/tests/mock_scheduler.c |  2 +-
>  drivers/gpu/drm/v3d/v3d_submit.c                 |  2 +-
>  drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
>  include/drm/gpu_scheduler.h                      | 10 +++++++++-
>  26 files changed, 62 insertions(+), 34 deletions(-)

I think last time I asked what your merge plan for this is, since
it touches so many drivers. Should I take it?

Besides one comment below, scheduler bits look fine.

> 
> diff --git a/drivers/accel/amdxdna/aie2_ctx.c
> b/drivers/accel/amdxdna/aie2_ctx.c
> index e04549f64d69..3e38a5f637ea 100644
> --- a/drivers/accel/amdxdna/aie2_ctx.c
> +++ b/drivers/accel/amdxdna/aie2_ctx.c
> @@ -848,7 +848,8 @@ int aie2_cmd_submit(struct amdxdna_hwctx *hwctx,
> struct amdxdna_sched_job *job,
>  		goto up_sem;
>  	}
>  
> -	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
> 1, hwctx);
> +	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
> 1, hwctx,
> +				 hwctx->client->filp->client_id);
>  	if (ret) {
>  		XDNA_ERR(xdna, "DRM job init failed, ret %d", ret);
>  		goto free_chain;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 4cec3a873995..1a77ba7036c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -639,7 +639,7 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device
> *adev,
>  		goto err;
>  	}
>  
> -	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job);
> +	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job, 0);
>  	if (ret)
>  		goto err;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 82df06a72ee0..5a231b997d65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -293,7 +293,8 @@ static int amdgpu_cs_pass1(struct
> amdgpu_cs_parser *p,
>  
>  	for (i = 0; i < p->gang_size; ++i) {
>  		ret = amdgpu_job_alloc(p->adev, vm, p->entities[i],
> vm,
> -				       num_ibs[i], &p->jobs[i]);
> +				       num_ibs[i], &p->jobs[i],
> +				       p->filp->client_id);
>  		if (ret)
>  			goto free_all_kdata;
>  		p->jobs[i]->enforce_isolation = p->adev-
> >enforce_isolation[fpriv->xcp_id];
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index acb21fc8b3ce..75262ce8db27 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -204,7 +204,8 @@ static enum drm_gpu_sched_stat
> amdgpu_job_timedout(struct drm_sched_job *s_job)
>  
>  int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
> *vm,
>  		     struct drm_sched_entity *entity, void *owner,
> -		     unsigned int num_ibs, struct amdgpu_job **job)
> +		     unsigned int num_ibs, struct amdgpu_job **job,
> +		     u64 drm_client_id)
>  {
>  	if (num_ibs == 0)
>  		return -EINVAL;
> @@ -222,7 +223,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev,
> struct amdgpu_vm *vm,
>  	if (!entity)
>  		return 0;
>  
> -	return drm_sched_job_init(&(*job)->base, entity, 1, owner);
> +	return drm_sched_job_init(&(*job)->base, entity, 1, owner,
> +				  drm_client_id);
>  }
>  
>  int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
> @@ -232,7 +234,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device
> *adev,
>  {
>  	int r;
>  
> -	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job);
> +	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job, 0);
>  	if (r)
>  		return r;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> index ce6b9ba967ff..5a8bc6342222 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> @@ -90,7 +90,8 @@ static inline struct amdgpu_ring
> *amdgpu_job_ring(struct amdgpu_job *job)
>  
>  int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
> *vm,
>  		     struct drm_sched_entity *entity, void *owner,
> -		     unsigned int num_ibs, struct amdgpu_job **job);
> +		     unsigned int num_ibs, struct amdgpu_job **job,
> +		     u64 drm_client_id);
>  int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>  			     struct drm_sched_entity *entity, void
> *owner,
>  			     size_t size, enum amdgpu_ib_pool_type
> pool_type,
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 3c0a5c3e0e3d..76c742328edb 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -534,7 +534,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device
> *dev, void *data,
>  
>  	ret = drm_sched_job_init(&submit->sched_job,
>  				 &ctx->sched_entity[args->pipe],
> -				 1, submit->ctx);
> +				 1, submit->ctx, file->client_id);
>  	if (ret)
>  		goto err_submit_put;
>  
> diff --git a/drivers/gpu/drm/imagination/pvr_job.c
> b/drivers/gpu/drm/imagination/pvr_job.c
> index 59b334d094fa..7564b0f21b42 100644
> --- a/drivers/gpu/drm/imagination/pvr_job.c
> +++ b/drivers/gpu/drm/imagination/pvr_job.c
> @@ -446,7 +446,7 @@ create_job(struct pvr_device *pvr_dev,
>  	if (err)
>  		goto err_put_job;
>  
> -	err = pvr_queue_job_init(job);
> +	err = pvr_queue_job_init(job, pvr_file->file->client_id);
>  	if (err)
>  		goto err_put_job;
>  
> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c
> b/drivers/gpu/drm/imagination/pvr_queue.c
> index 5e9bc0992824..5a41ee79fed6 100644
> --- a/drivers/gpu/drm/imagination/pvr_queue.c
> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
> @@ -1073,6 +1073,7 @@ static int pvr_queue_cleanup_fw_context(struct
> pvr_queue *queue)
>  /**
>   * pvr_queue_job_init() - Initialize queue related fields in a
> pvr_job object.
>   * @job: The job to initialize.
> + * @drm_client_id: drm_file.client_id submitting the job
>   *
>   * Bind the job to a queue and allocate memory to guarantee
> pvr_queue_job_arm()
>   * and pvr_queue_job_push() can't fail. We also make sure the
> context type is
> @@ -1082,7 +1083,7 @@ static int pvr_queue_cleanup_fw_context(struct
> pvr_queue *queue)
>   *  * 0 on success, or
>   *  * An error code if something failed.
>   */
> -int pvr_queue_job_init(struct pvr_job *job)
> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id)
>  {
>  	/* Fragment jobs need at least one native fence wait on the
> geometry job fence. */
>  	u32 min_native_dep_count = job->type ==
> DRM_PVR_JOB_TYPE_FRAGMENT ? 1 : 0;
> @@ -1099,7 +1100,7 @@ int pvr_queue_job_init(struct pvr_job *job)
>  	if (!pvr_cccb_cmdseq_can_fit(&queue->cccb,
> job_cmds_size(job, min_native_dep_count)))
>  		return -E2BIG;
>  
> -	err = drm_sched_job_init(&job->base, &queue->entity, 1,
> THIS_MODULE);
> +	err = drm_sched_job_init(&job->base, &queue->entity, 1,
> THIS_MODULE, drm_client_id);
>  	if (err)
>  		return err;
>  
> diff --git a/drivers/gpu/drm/imagination/pvr_queue.h
> b/drivers/gpu/drm/imagination/pvr_queue.h
> index 93fe9ac9f58c..fc1986d73fc8 100644
> --- a/drivers/gpu/drm/imagination/pvr_queue.h
> +++ b/drivers/gpu/drm/imagination/pvr_queue.h
> @@ -143,7 +143,7 @@ struct pvr_queue {
>  
>  bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
>  
> -int pvr_queue_job_init(struct pvr_job *job);
> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
>  
>  void pvr_queue_job_cleanup(struct pvr_job *job);
>  
> diff --git a/drivers/gpu/drm/lima/lima_gem.c
> b/drivers/gpu/drm/lima/lima_gem.c
> index 5deec673c11e..9722b847a539 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -341,7 +341,7 @@ int lima_gem_submit(struct drm_file *file, struct
> lima_submit *submit)
>  
>  	err = lima_sched_task_init(
>  		submit->task, submit->ctx->context + submit->pipe,
> -		bos, submit->nr_bos, vm);
> +		bos, submit->nr_bos, vm, file->client_id);
>  	if (err)
>  		goto err_out1;
>  
> diff --git a/drivers/gpu/drm/lima/lima_sched.c
> b/drivers/gpu/drm/lima/lima_sched.c
> index 7934098e651b..954f4325b859 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -113,7 +113,8 @@ static inline struct lima_sched_pipe
> *to_lima_pipe(struct drm_gpu_scheduler *sch
>  int lima_sched_task_init(struct lima_sched_task *task,
>  			 struct lima_sched_context *context,
>  			 struct lima_bo **bos, int num_bos,
> -			 struct lima_vm *vm)
> +			 struct lima_vm *vm,
> +			 u64 drm_client_id)
>  {
>  	int err, i;
>  
> @@ -124,7 +125,8 @@ int lima_sched_task_init(struct lima_sched_task
> *task,
>  	for (i = 0; i < num_bos; i++)
>  		drm_gem_object_get(&bos[i]->base.base);
>  
> -	err = drm_sched_job_init(&task->base, &context->base, 1,
> vm);
> +	err = drm_sched_job_init(&task->base, &context->base, 1, vm,
> +				 drm_client_id);
>  	if (err) {
>  		kfree(task->bos);
>  		return err;
> diff --git a/drivers/gpu/drm/lima/lima_sched.h
> b/drivers/gpu/drm/lima/lima_sched.h
> index 85b23ba901d5..1a08faf8a529 100644
> --- a/drivers/gpu/drm/lima/lima_sched.h
> +++ b/drivers/gpu/drm/lima/lima_sched.h
> @@ -88,7 +88,8 @@ struct lima_sched_pipe {
>  int lima_sched_task_init(struct lima_sched_task *task,
>  			 struct lima_sched_context *context,
>  			 struct lima_bo **bos, int num_bos,
> -			 struct lima_vm *vm);
> +			 struct lima_vm *vm,
> +			 u64 drm_client_id);
>  void lima_sched_task_fini(struct lima_sched_task *task);
>  
>  int lima_sched_context_init(struct lima_sched_pipe *pipe,
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 3e9aa2cc38ef..d9be0fe3d674 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -30,7 +30,7 @@
>  static struct msm_gem_submit *submit_create(struct drm_device *dev,
>  		struct msm_gpu *gpu,
>  		struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
> -		uint32_t nr_cmds)
> +		uint32_t nr_cmds, u64 drm_client_id)
>  {
>  	static atomic_t ident = ATOMIC_INIT(0);
>  	struct msm_gem_submit *submit;
> @@ -54,7 +54,8 @@ static struct msm_gem_submit *submit_create(struct
> drm_device *dev,
>  		return ERR_PTR(ret);
>  	}
>  
> -	ret = drm_sched_job_init(&submit->base, queue->entity, 1,
> queue);
> +	ret = drm_sched_job_init(&submit->base, queue->entity, 1,
> queue,
> +				 drm_client_id);
>  	if (ret) {
>  		kfree(submit->hw_fence);
>  		kfree(submit);
> @@ -693,7 +694,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev,
> void *data,
>  		}
>  	}
>  
> -	submit = submit_create(dev, gpu, queue, args->nr_bos, args-
> >nr_cmds);
> +	submit = submit_create(dev, gpu, queue, args->nr_bos, args-
> >nr_cmds,
> +			       file->client_id);
>  	if (IS_ERR(submit)) {
>  		ret = PTR_ERR(submit);
>  		goto out_post_unlock;
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c
> b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index d326e55d2d24..460a5fb02412 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -87,7 +87,8 @@ nouveau_job_init(struct nouveau_job *job,
>  	}
>  
>  	ret = drm_sched_job_init(&job->base, &sched->entity,
> -				 args->credits, NULL);
> +				 args->credits, NULL,
> +				 job->file_priv->client_id);
>  	if (ret)
>  		goto err_free_chains;
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index b87f83e94eda..d5c2c6530ed8 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -312,7 +312,7 @@ static int panfrost_ioctl_submit(struct
> drm_device *dev, void *data,
>  
>  	ret = drm_sched_job_init(&job->base,
>  				 &file_priv->sched_entity[slot],
> -				 1, NULL);
> +				 1, NULL, file->client_id);
>  	if (ret)
>  		goto out_put_job;
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c
> b/drivers/gpu/drm/panthor/panthor_drv.c
> index 06fe46e32073..bd8e1900c919 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -989,7 +989,8 @@ static int panthor_ioctl_group_submit(struct
> drm_device *ddev, void *data,
>  		const struct drm_panthor_queue_submit *qsubmit =
> &jobs_args[i];
>  		struct drm_sched_job *job;
>  
> -		job = panthor_job_create(pfile, args->group_handle,
> qsubmit);
> +		job = panthor_job_create(pfile, args->group_handle,
> qsubmit,
> +					 file->client_id);
>  		if (IS_ERR(job)) {
>  			ret = PTR_ERR(job);
>  			goto out_cleanup_submit_ctx;
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c
> b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 12a02e28f50f..e0c79bd2d173 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -2516,7 +2516,7 @@ panthor_vm_bind_job_create(struct drm_file
> *file,
>  	kref_init(&job->refcount);
>  	job->vm = panthor_vm_get(vm);
>  
> -	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm);
> +	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm,
> file->client_id);
>  	if (ret)
>  		goto err_put_job;
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c
> b/drivers/gpu/drm/panthor/panthor_sched.c
> index 446ec780eb4a..2af860c9068a 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3729,7 +3729,8 @@ struct panthor_vm *panthor_job_vm(struct
> drm_sched_job *sched_job)
>  struct drm_sched_job *
>  panthor_job_create(struct panthor_file *pfile,
>  		   u16 group_handle,
> -		   const struct drm_panthor_queue_submit *qsubmit)
> +		   const struct drm_panthor_queue_submit *qsubmit,
> +		   u64 drm_client_id)
>  {
>  	struct panthor_group_pool *gpool = pfile->groups;
>  	struct panthor_job *job;
> @@ -3801,7 +3802,7 @@ panthor_job_create(struct panthor_file *pfile,
>  
>  	ret = drm_sched_job_init(&job->base,
>  				 &job->group->queues[job-
> >queue_idx]->entity,
> -				 credits, job->group);
> +				 credits, job->group,
> drm_client_id);
>  	if (ret)
>  		goto err_put_job;
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.h
> b/drivers/gpu/drm/panthor/panthor_sched.h
> index e650a445cf50..742b0b4ff3a3 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.h
> +++ b/drivers/gpu/drm/panthor/panthor_sched.h
> @@ -29,7 +29,8 @@ int panthor_group_get_state(struct panthor_file
> *pfile,
>  struct drm_sched_job *
>  panthor_job_create(struct panthor_file *pfile,
>  		   u16 group_handle,
> -		   const struct drm_panthor_queue_submit *qsubmit);
> +		   const struct drm_panthor_queue_submit *qsubmit,
> +		   u64 drm_client_id);
>  struct drm_sched_job *panthor_job_get(struct drm_sched_job *job);
>  struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job);
>  void panthor_job_put(struct drm_sched_job *job);
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
> b/drivers/gpu/drm/scheduler/sched_fence.c
> index e971528504a5..d208d384d38d 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -206,7 +206,8 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> dma_fence *f)
>  EXPORT_SYMBOL(to_drm_sched_fence);
>  
>  struct drm_sched_fence *drm_sched_fence_alloc(struct
> drm_sched_entity *entity,
> -					      void *owner)
> +					      void *owner,
> +					      u64 drm_client_id)
>  {
>  	struct drm_sched_fence *fence = NULL;
>  
> @@ -215,6 +216,7 @@ struct drm_sched_fence
> *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>  		return NULL;
>  
>  	fence->owner = owner;
> +	fence->drm_client_id = drm_client_id;
>  	spin_lock_init(&fence->lock);
>  
>  	return fence;
> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h
> b/drivers/gpu/drm/scheduler/sched_internal.h
> index 599cf6e1bb74..7ea5a6736f98 100644
> --- a/drivers/gpu/drm/scheduler/sched_internal.h
> +++ b/drivers/gpu/drm/scheduler/sched_internal.h
> @@ -24,7 +24,7 @@ void drm_sched_entity_select_rq(struct
> drm_sched_entity *entity);
>  struct drm_sched_job *drm_sched_entity_pop_job(struct
> drm_sched_entity *entity);
>  
>  struct drm_sched_fence *drm_sched_fence_alloc(struct
> drm_sched_entity *s_entity,
> -					      void *owner);
> +					      void *owner, u64
> drm_client_id);
>  void drm_sched_fence_init(struct drm_sched_fence *fence,
>  			  struct drm_sched_entity *entity);
>  void drm_sched_fence_free(struct drm_sched_fence *fence);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 829579c41c6b..60611618f3ab 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -764,6 +764,7 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>   * @credits: the number of credits this job contributes to the
> schedulers
>   * credit limit
>   * @owner: job owner for debugging
> + * @drm_client_id: drm_file.client_id of the owner

For the docu generation to link that properly it must be written as

&struct drm_file.client_id
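
e.g. the full kernel-doc line could read:

 * @drm_client_id: the &struct drm_file.client_id of the owner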

Besides, if this were an optional parameter, one should document it.
I'm not sure if it is; I haven't used these client_ids before.

P.

>   *
>   * Refer to drm_sched_entity_push_job() documentation
>   * for locking considerations.
> @@ -784,7 +785,8 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>   */
>  int drm_sched_job_init(struct drm_sched_job *job,
>  		       struct drm_sched_entity *entity,
> -		       u32 credits, void *owner)
> +		       u32 credits, void *owner,
> +		       uint64_t drm_client_id)
>  {
>  	if (!entity->rq) {
>  		/* This will most likely be followed by missing
> frames
> @@ -810,7 +812,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  
>  	job->entity = entity;
>  	job->credits = credits;
> -	job->s_fence = drm_sched_fence_alloc(entity, owner);
> +	job->s_fence = drm_sched_fence_alloc(entity, owner,
> drm_client_id);
>  	if (!job->s_fence)
>  		return -ENOMEM;
>  
> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> index f999c8859cf7..09ffbdb32d76 100644
> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> @@ -35,7 +35,7 @@ drm_mock_sched_entity_new(struct kunit *test,
>  	ret = drm_sched_entity_init(&entity->base,
>  				    priority,
>  				    &drm_sched, 1,
> -				    NULL);
> +				    NULL, 1);
>  	KUNIT_ASSERT_EQ(test, ret, 0);
>  
>  	entity->test = test;
> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c
> b/drivers/gpu/drm/v3d/v3d_submit.c
> index 4ff5de46fb22..5171ffe9012d 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -169,7 +169,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file
> *file_priv,
>  	job->file = file_priv;
>  
>  	ret = drm_sched_job_init(&job->base, &v3d_priv-
> >sched_entity[queue],
> -				 1, v3d_priv);
> +				 1, v3d_priv, file_priv->client_id);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c
> b/drivers/gpu/drm/xe/xe_sched_job.c
> index 1905ca590965..f4679cb9a56b 100644
> --- a/drivers/gpu/drm/xe/xe_sched_job.c
> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
> @@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct
> xe_exec_queue *q,
>  	kref_init(&job->refcount);
>  	xe_exec_queue_get(job->q);
>  
> -	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
> +	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
> +				 q->xef->drm->client_id);
>  	if (err)
>  		goto err_free;
>  
> diff --git a/include/drm/gpu_scheduler.h
> b/include/drm/gpu_scheduler.h
> index 1a7e377d4cbb..6fe3b4c0cffb 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -305,6 +305,13 @@ struct drm_sched_fence {
>           * @owner: job owner for debugging
>           */
>  	void				*owner;
> +
> +	/**
> +	 * @drm_client_id:
> +	 *
> +	 * The client_id of the drm_file which owns the job.
> +	 */
> +	uint64_t			drm_client_id;
>  };
>  
>  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
> @@ -629,7 +636,8 @@ drm_sched_pick_best(struct drm_gpu_scheduler
> **sched_list,
>  
>  int drm_sched_job_init(struct drm_sched_job *job,
>  		       struct drm_sched_entity *entity,
> -		       u32 credits, void *owner);
> +		       u32 credits, void *owner,
> +		       u64 drm_client_id);
>  void drm_sched_job_arm(struct drm_sched_job *job);
>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>  int drm_sched_job_add_dependency(struct drm_sched_job *job,


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs
  2025-04-24  8:38 ` [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
@ 2025-05-14 12:46   ` Philipp Stanner
  0 siblings, 0 replies; 24+ messages in thread
From: Philipp Stanner @ 2025-05-14 12:46 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Christian König, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal
  Cc: Tvrtko Ursulin, dri-devel, linux-kernel, linux-media,
	linaro-mm-sig

nit: title: s/gpu/GPU

We also mostly start with an upper case letter after the :, but JFYI,
it's not a big deal.
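
With both applied, the title would read e.g.:

drm/sched: Trace dependencies for GPU jobs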


P.

On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
> We can't trace dependencies from drm_sched_job_add_dependency
> because when it's called the job's fence is not available yet.
> 
> So instead each dependency is traced individually when
> drm_sched_entity_push_job is used.
> 
> Tracing the dependencies allows tools to analyze the dependencies
> between the jobs (previously it was only possible for fences
> traced by drm_sched_job_wait_dep).
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer
> <pierre-eric.pelloux-prayer@amd.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> ---
>  .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 23
> +++++++++++++++++++
>  drivers/gpu/drm/scheduler/sched_entity.c      |  8 +++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index 6f5bd05131aa..5d9992ad47d3 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -87,6 +87,29 @@ TRACE_EVENT(drm_sched_process_job,
>  		      __entry->fence_context, __entry->fence_seqno)
>  );
>  
> +TRACE_EVENT(drm_sched_job_add_dep,
> +	TP_PROTO(struct drm_sched_job *sched_job, struct dma_fence
> *fence),
> +	TP_ARGS(sched_job, fence),
> +	TP_STRUCT__entry(
> +		    __field(u64, fence_context)
> +		    __field(u64, fence_seqno)
> +		    __field(u64, id)
> +		    __field(u64, ctx)
> +		    __field(u64, seqno)
> +		    ),
> +
> +	TP_fast_assign(
> +		    __entry->fence_context = sched_job->s_fence-
> >finished.context;
> +		    __entry->fence_seqno = sched_job->s_fence-
> >finished.seqno;
> +		    __entry->id = sched_job->id;
> +		    __entry->ctx = fence->context;
> +		    __entry->seqno = fence->seqno;
> +		    ),
> +	TP_printk("fence=%llu:%llu, id=%llu depends on
> fence=%llu:%llu",
> +		  __entry->fence_context, __entry->fence_seqno,
> __entry->id,
> +		  __entry->ctx, __entry->seqno)
> +);
> +
>  TRACE_EVENT(drm_sched_job_wait_dep,
>  	    TP_PROTO(struct drm_sched_job *sched_job, struct
> dma_fence *fence),
>  	    TP_ARGS(sched_job, fence),
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index bd39db7bb240..be579e132711 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -587,6 +587,14 @@ void drm_sched_entity_push_job(struct
> drm_sched_job *sched_job)
>  	ktime_t submit_ts;
>  
>  	trace_drm_sched_job(sched_job, entity);
> +
> +	if (trace_drm_sched_job_add_dep_enabled()) {
> +		struct dma_fence *entry;
> +		unsigned long index;
> +
> +		xa_for_each(&sched_job->dependencies, index, entry)
> +			trace_drm_sched_job_add_dep(sched_job,
> entry);
> +	}
>  	atomic_inc(entity->rq->sched->score);
>  	WRITE_ONCE(entity->last_user, current->group_leader);
>  


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 08/10] drm: get rid of drm_sched_job::id
  2025-04-24  8:38 ` [PATCH v9 08/10] drm: get rid of drm_sched_job::id Pierre-Eric Pelloux-Prayer
  2025-04-25  5:26   ` Yadav, Arvind
@ 2025-05-14 12:50   ` Philipp Stanner
  1 sibling, 0 replies; 24+ messages in thread
From: Philipp Stanner @ 2025-05-14 12:50 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Alex Deucher, Christian König,
	David Airlie, Simona Vetter, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann
  Cc: Tvrtko Ursulin, Christian König, amd-gfx, dri-devel,
	linux-kernel

On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
> Its only purpose was for trace events, but jobs can already be
> uniquely identified using their fence.
> 
> The downside of using the fence is that it's only available
> after 'drm_sched_job_arm' has been called, which is true for all trace
> events that used job.id, so they can safely switch to using it.

nit: in the title you use the double colon :: as a namespace separator.
In the kernel documentation style, only . (as used above) or -> are
used.

The commit title should be consistent with that.

Other than that, nice clean up.

P.

> 
> Suggested-by: Tvrtko Ursulin <tursulin@igalia.com>
> Signed-off-by: Pierre-Eric Pelloux-Prayer
> <pierre-eric.pelloux-prayer@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h      | 18 ++++++----------
> --
>  .../gpu/drm/scheduler/gpu_scheduler_trace.h    | 18 ++++++----------
> --
>  drivers/gpu/drm/scheduler/sched_main.c         |  1 -
>  include/drm/gpu_scheduler.h                    |  3 ---
>  4 files changed, 12 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> index 11dd2e0f7979..4fd810cb5387 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> @@ -167,7 +167,6 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>  	    TP_PROTO(struct amdgpu_job *job),
>  	    TP_ARGS(job),
>  	    TP_STRUCT__entry(
> -			     __field(uint64_t, sched_job_id)
>  			     __string(timeline,
> AMDGPU_JOB_GET_TIMELINE_NAME(job))
>  			     __field(unsigned int, context)
>  			     __field(unsigned int, seqno)
> @@ -177,15 +176,14 @@ TRACE_EVENT(amdgpu_cs_ioctl,
>  			     ),
>  
>  	    TP_fast_assign(
> -			   __entry->sched_job_id = job->base.id;
>  			   __assign_str(timeline);
>  			   __entry->context = job->base.s_fence-
> >finished.context;
>  			   __entry->seqno = job->base.s_fence-
> >finished.seqno;
>  			   __assign_str(ring);
>  			   __entry->num_ibs = job->num_ibs;
>  			   ),
> -	    TP_printk("sched_job=%llu, timeline=%s, context=%u,
> seqno=%u, ring_name=%s, num_ibs=%u",
> -		      __entry->sched_job_id, __get_str(timeline),
> __entry->context,
> +	    TP_printk("timeline=%s, context=%u, seqno=%u,
> ring_name=%s, num_ibs=%u",
> +		      __get_str(timeline), __entry->context,
>  		      __entry->seqno, __get_str(ring), __entry-
> >num_ibs)
>  );
>  
> @@ -193,7 +191,6 @@ TRACE_EVENT(amdgpu_sched_run_job,
>  	    TP_PROTO(struct amdgpu_job *job),
>  	    TP_ARGS(job),
>  	    TP_STRUCT__entry(
> -			     __field(uint64_t, sched_job_id)
>  			     __string(timeline,
> AMDGPU_JOB_GET_TIMELINE_NAME(job))
>  			     __field(unsigned int, context)
>  			     __field(unsigned int, seqno)
> @@ -202,15 +199,14 @@ TRACE_EVENT(amdgpu_sched_run_job,
>  			     ),
>  
>  	    TP_fast_assign(
> -			   __entry->sched_job_id = job->base.id;
>  			   __assign_str(timeline);
>  			   __entry->context = job->base.s_fence-
> >finished.context;
>  			   __entry->seqno = job->base.s_fence-
> >finished.seqno;
>  			   __assign_str(ring);
>  			   __entry->num_ibs = job->num_ibs;
>  			   ),
> -	    TP_printk("sched_job=%llu, timeline=%s, context=%u,
> seqno=%u, ring_name=%s, num_ibs=%u",
> -		      __entry->sched_job_id, __get_str(timeline),
> __entry->context,
> +	    TP_printk("timeline=%s, context=%u, seqno=%u,
> ring_name=%s, num_ibs=%u",
> +		      __get_str(timeline), __entry->context,
>  		      __entry->seqno, __get_str(ring), __entry-
> >num_ibs)
>  );
>  
> @@ -551,7 +547,6 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>  	    TP_ARGS(sched_job, fence),
>  	    TP_STRUCT__entry(
>  			     __string(ring, sched_job->base.sched-
> >name)
> -			     __field(uint64_t, id)
>  			     __field(struct dma_fence *, fence)
>  			     __field(uint64_t, ctx)
>  			     __field(unsigned, seqno)
> @@ -559,13 +554,12 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>  
>  	    TP_fast_assign(
>  			   __assign_str(ring);
> -			   __entry->id = sched_job->base.id;
>  			   __entry->fence = fence;
>  			   __entry->ctx = fence->context;
>  			   __entry->seqno = fence->seqno;
>  			   ),
> -	    TP_printk("job ring=%s, id=%llu, need pipe sync to
> fence=%p, context=%llu, seq=%u",
> -		      __get_str(ring), __entry->id,
> +	    TP_printk("job ring=%s need pipe sync to fence=%p,
> context=%llu, seq=%u",
> +		      __get_str(ring),
>  		      __entry->fence, __entry->ctx,
>  		      __entry->seqno)
>  );
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index 4ce53e493fef..781b20349389 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -36,7 +36,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>  	    TP_PROTO(struct drm_sched_job *sched_job, struct
> drm_sched_entity *entity),
>  	    TP_ARGS(sched_job, entity),
>  	    TP_STRUCT__entry(
> -			     __field(uint64_t, id)
>  			     __string(name, sched_job->sched->name)
>  			     __field(u32, job_count)
>  			     __field(int, hw_job_count)
> @@ -47,7 +46,6 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>  			     ),
>  
>  	    TP_fast_assign(
> -			   __entry->id = sched_job->id;
>  			   __assign_str(name);
>  			   __entry->job_count =
> spsc_queue_count(&entity->job_queue);
>  			   __entry->hw_job_count = atomic_read(
> @@ -57,8 +55,8 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>  			   __entry->fence_seqno = sched_job-
> >s_fence->finished.seqno;
>  			   __entry->client_id = sched_job->s_fence-
> >drm_client_id;
>  			   ),
> -	    TP_printk("dev=%s, id=%llu, fence=%llu:%llu, ring=%s,
> job count:%u, hw job count:%d, client_id:%llu",
> -		      __get_str(dev), __entry->id,
> +	    TP_printk("dev=%s, fence=%llu:%llu, ring=%s, job
> count:%u, hw job count:%d, client_id:%llu",
> +		      __get_str(dev),
>  		      __entry->fence_context, __entry->fence_seqno,
> __get_str(name),
>  		      __entry->job_count, __entry->hw_job_count,
> __entry->client_id)
>  );
> @@ -95,7 +93,6 @@ TRACE_EVENT(drm_sched_job_add_dep,
>  	TP_STRUCT__entry(
>  		    __field(u64, fence_context)
>  		    __field(u64, fence_seqno)
> -		    __field(u64, id)
>  		    __field(u64, ctx)
>  		    __field(u64, seqno)
>  		    ),
> @@ -103,12 +100,11 @@ TRACE_EVENT(drm_sched_job_add_dep,
>  	TP_fast_assign(
>  		    __entry->fence_context = sched_job->s_fence-
> >finished.context;
>  		    __entry->fence_seqno = sched_job->s_fence-
> >finished.seqno;
> -		    __entry->id = sched_job->id;
>  		    __entry->ctx = fence->context;
>  		    __entry->seqno = fence->seqno;
>  		    ),
> -	TP_printk("fence=%llu:%llu, id=%llu depends on
> fence=%llu:%llu",
> -		  __entry->fence_context, __entry->fence_seqno,
> __entry->id,
> +	TP_printk("fence=%llu:%llu depends on fence=%llu:%llu",
> +		  __entry->fence_context, __entry->fence_seqno,
>  		  __entry->ctx, __entry->seqno)
>  );
>  
> @@ -118,7 +114,6 @@ TRACE_EVENT(drm_sched_job_unschedulable,
>  	    TP_STRUCT__entry(
>  			     __field(u64, fence_context)
>  			     __field(u64, fence_seqno)
> -			     __field(uint64_t, id)
>  			     __field(u64, ctx)
>  			     __field(u64, seqno)
>  			     ),
> @@ -126,12 +121,11 @@ TRACE_EVENT(drm_sched_job_unschedulable,
>  	    TP_fast_assign(
>  			   __entry->fence_context = sched_job-
> >s_fence->finished.context;
>  			   __entry->fence_seqno = sched_job-
> >s_fence->finished.seqno;
> -			   __entry->id = sched_job->id;
>  			   __entry->ctx = fence->context;
>  			   __entry->seqno = fence->seqno;
>  			   ),
> -	    TP_printk("fence=%llu:%llu, id=%llu depends on
> unsignalled fence=%llu:%llu",
> -		      __entry->fence_context, __entry->fence_seqno,
> __entry->id,
> +	    TP_printk("fence=%llu:%llu depends on unsignalled
> fence=%llu:%llu",
> +		      __entry->fence_context, __entry->fence_seqno,
>  		      __entry->ctx, __entry->seqno)
>  );
>  
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 195b5f891068..dafda1803c7c 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -852,7 +852,6 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>  
>  	job->sched = sched;
>  	job->s_priority = entity->priority;
> -	job->id = atomic64_inc_return(&sched->job_id_count);
>  
>  	drm_sched_fence_init(job->s_fence, job->entity);
>  }
> diff --git a/include/drm/gpu_scheduler.h
> b/include/drm/gpu_scheduler.h
> index 6fe3b4c0cffb..48190fdf661a 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -326,7 +326,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> dma_fence *f);
>   * @finish_cb: the callback for the finished fence.
>   * @credits: the number of credits this job contributes to the
> scheduler
>   * @work: Helper to reschedule job kill to different context.
> - * @id: a unique id assigned to each job scheduled on the scheduler.
>   * @karma: increment on every hang caused by this job. If this
> exceeds the hang
>   *         limit of the scheduler then the job is marked guilty and
> will not
>   *         be scheduled further.
> @@ -339,8 +338,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct
> dma_fence *f);
>   * to schedule the job.
>   */
>  struct drm_sched_job {
> -	u64				id;
> -
>  	/**
>  	 * @submit_ts:
>  	 *


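On the tooling side, with job->id gone the (context, seqno) pair of the
finished fence becomes the unique job key. A sketch, assuming trace
lines follow the TP_printk formats above — struct job_key and
parse_job_key() are made up:

#include <stdio.h>
#include <stdint.h>

struct job_key {
	uint64_t context;
	uint64_t seqno;
};

/*
 * Assumes `line` points at the "fence=%llu:%llu" part of a record
 * emitted by the events above.
 */
static int parse_job_key(const char *line, struct job_key *key)
{
	unsigned long long ctx, seqno;

	if (sscanf(line, "fence=%llu:%llu", &ctx, &seqno) != 2)
		return -1;
	key->context = ctx;
	key->seqno = seqno;
	return 0;
}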

* Re: [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI
  2025-04-24  8:38 ` [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI Pierre-Eric Pelloux-Prayer
@ 2025-05-14 12:53   ` Philipp Stanner
  2025-05-16  7:56     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 1 reply; 24+ messages in thread
From: Philipp Stanner @ 2025-05-14 12:53 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Sumit Semwal
  Cc: Lucas Stach, Maíra Canal, Christian König, dri-devel,
	linux-doc, linux-kernel, linux-media, linaro-mm-sig

On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
> This commit adds a document section in drm-uapi.rst about
> tracepoints,
> and mark the events gpu_scheduler_trace.h as stable uAPI.
> 
> The goal is to explicitly state that tools can rely on the fields,
> formats and semantics of these events.
> 
> Acked-by: Lucas Stach <l.stach@pengutronix.de>
> Acked-by: Maíra Canal <mcanal@igalia.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Signed-off-by: Pierre-Eric Pelloux-Prayer
> <pierre-eric.pelloux-prayer@amd.com>
> ---
>  Documentation/gpu/drm-uapi.rst                | 19
> +++++++++++++++++++
>  .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 19
> +++++++++++++++++++
>  2 files changed, 38 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-
> uapi.rst
> index 69f72e71a96e..4863a4deb0ee 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -693,3 +693,22 @@ dma-buf interoperability
>  
>  Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst
> for
>  information on how dma-buf is integrated and exposed within DRM.
> +
> +
> +Trace events
> +============
> +
> +See Documentation/trace/tracepoints.rst for information about using
> +Linux Kernel Tracepoints.
> +In the DRM subsystem, some events are considered stable uAPI to
> avoid
> +breaking tools (e.g.: GPUVis, umr) relying on them. Stable means
> that fields
> +cannot be removed, nor their formatting updated. Adding new fields
> is
> +possible, under the normal uAPI requirements.
> +
> +Stable uAPI events
> +------------------
> +
> +From ``drivers/gpu/drm/scheduler/gpu_scheduler_trace.h``
> +
> +.. kernel-doc::  drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +   :doc: uAPI trace events
> \ No newline at end of file
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index 781b20349389..7e840d08ef39 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -32,6 +32,25 @@
>  #define TRACE_SYSTEM gpu_scheduler
>  #define TRACE_INCLUDE_FILE gpu_scheduler_trace
>  
> +/**
> + * DOC: uAPI trace events
> + *
> + * ``drm_sched_job_queue``, ``drm_sched_job_run``,
> ``drm_sched_job_add_dep``,
> + * ``drm_sched_job_done`` and ``drm_sched_job_unschedulable`` are
> considered
> + * stable uAPI.
> + *
> + * Common trace events attributes:
> + *
> + * * ``dev``   - the dev_name() of the device running the job.
> + *
> + * * ``ring``  - the hardware ring running the job. Together with
> ``dev`` it
> + *   uniquely identifies where the job is going to be executed.
> + *
> + * * ``fence`` - the &dma_fence.context and the &dma_fence.seqno of
> + *   &drm_sched_fence.finished
> + *
> + */

For my understanding, why do you use the double apostrophes here?

Also, AFAIR the linking for the docs here requires you to write

&struct dma_fence.seqno

If I am not mistaken

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#highlights-and-cross-references


P.

> +
>  DECLARE_EVENT_CLASS(drm_sched_job,
>  	    TP_PROTO(struct drm_sched_job *sched_job, struct
> drm_sched_entity *entity),
>  	    TP_ARGS(sched_job, entity),


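To make the quoted stability rules concrete, here is a hypothetical
extension they would still allow — the event and the new field are made
up; the existing fields and their formatting stay untouched, and the
new field is appended:

TRACE_EVENT(drm_sched_job_example,
	TP_PROTO(struct drm_sched_job *sched_job),
	TP_ARGS(sched_job),
	TP_STRUCT__entry(
		    __field(u64, fence_context)
		    __field(u64, fence_seqno)
		    __field(u32, flags)	/* new field: appending is allowed */
		    ),
	TP_fast_assign(
		    __entry->fence_context = sched_job->s_fence->finished.context;
		    __entry->fence_seqno = sched_job->s_fence->finished.seqno;
		    __entry->flags = 0;
		    ),
	/* the existing "fence=%llu:%llu" part must be kept verbatim */
	TP_printk("fence=%llu:%llu, flags=%u",
		  __entry->fence_context, __entry->fence_seqno,
		  __entry->flags)
);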

* Re: [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence
  2025-05-14 12:44   ` Philipp Stanner
@ 2025-05-15  6:53     ` Pierre-Eric Pelloux-Prayer
  2025-05-19 11:02       ` Pierre-Eric Pelloux-Prayer
  0 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-05-15  6:53 UTC (permalink / raw)
  To: phasta
  Cc: Christian König, dri-devel, linux-kernel, amd-gfx, etnaviv,
	lima, linux-arm-msm, freedreno, nouveau, intel-xe,
	Pierre-Eric Pelloux-Prayer, Min Ma, Lizhi Hou, Oded Gabbay,
	Felix Kuehling, Alex Deucher, Christian König, David Airlie,
	Simona Vetter, Lucas Stach, Russell King, Christian Gmeiner,
	Frank Binns, Matt Coster, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Qiang Yu, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, Marijn Suijten, Lyude Paul,
	Danilo Krummrich, Boris Brezillon, Rob Herring, Steven Price,
	Liviu Dudau, Matthew Brost, Melissa Wen, Maíra Canal,
	Lucas De Marchi, Thomas Hellström, Rodrigo Vivi

Hi,

On 14/05/2025 at 14:44, Philipp Stanner wrote:
> On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
>> This will be used in a later commit to trace the drm client_id in
>> some of the gpu_scheduler trace events.
>>
>> This requires changing all the users of drm_sched_job_init to
>> add an extra parameter.
>>
>> The newly added drm_client_id field in the drm_sched_fence is a bit
>> of a duplicate of the owner one. One suggestion I received was to
>> merge those 2 fields - this can't be done right now as amdgpu uses
>> some special values (AMDGPU_FENCE_OWNER_*) that can't really be
>> translated into a client id. Christian is working on getting rid of
>> those; when it's done we should be able to squash owner/drm_client_id
>> together.
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>> <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/accel/amdxdna/aie2_ctx.c                 |  3 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c       |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  3 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c          |  8 +++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h          |  3 ++-
>>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c     |  2 +-
>>   drivers/gpu/drm/imagination/pvr_job.c            |  2 +-
>>   drivers/gpu/drm/imagination/pvr_queue.c          |  5 +++--
>>   drivers/gpu/drm/imagination/pvr_queue.h          |  2 +-
>>   drivers/gpu/drm/lima/lima_gem.c                  |  2 +-
>>   drivers/gpu/drm/lima/lima_sched.c                |  6 ++++--
>>   drivers/gpu/drm/lima/lima_sched.h                |  3 ++-
>>   drivers/gpu/drm/msm/msm_gem_submit.c             |  8 +++++---
>>   drivers/gpu/drm/nouveau/nouveau_sched.c          |  3 ++-
>>   drivers/gpu/drm/panfrost/panfrost_drv.c          |  2 +-
>>   drivers/gpu/drm/panthor/panthor_drv.c            |  3 ++-
>>   drivers/gpu/drm/panthor/panthor_mmu.c            |  2 +-
>>   drivers/gpu/drm/panthor/panthor_sched.c          |  5 +++--
>>   drivers/gpu/drm/panthor/panthor_sched.h          |  3 ++-
>>   drivers/gpu/drm/scheduler/sched_fence.c          |  4 +++-
>>   drivers/gpu/drm/scheduler/sched_internal.h       |  2 +-
>>   drivers/gpu/drm/scheduler/sched_main.c           |  6 ++++--
>>   drivers/gpu/drm/scheduler/tests/mock_scheduler.c |  2 +-
>>   drivers/gpu/drm/v3d/v3d_submit.c                 |  2 +-
>>   drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
>>   include/drm/gpu_scheduler.h                      | 10 +++++++++-
>>   26 files changed, 62 insertions(+), 34 deletions(-)
> 
> I think I asked last time what your merge plan for this is, since
> it touches so many drivers. Should I take that?

Based on:

https://drm.pages.freedesktop.org/maintainer-tools/committer/committer-drm-misc.html

"drm-misc is for drm core (non-driver) patches, subsystem-wide refactorings,
and small trivial patches all over (including drivers)."

I assume it should go through drm-misc.


> 
> Besides one comment below, scheduler bits look fine.
> 
>>
>> diff --git a/drivers/accel/amdxdna/aie2_ctx.c
>> b/drivers/accel/amdxdna/aie2_ctx.c
>> index e04549f64d69..3e38a5f637ea 100644
>> --- a/drivers/accel/amdxdna/aie2_ctx.c
>> +++ b/drivers/accel/amdxdna/aie2_ctx.c
>> @@ -848,7 +848,8 @@ int aie2_cmd_submit(struct amdxdna_hwctx *hwctx,
>> struct amdxdna_sched_job *job,
>>   		goto up_sem;
>>   	}
>>   
>> -	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
>> 1, hwctx);
>> +	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
>> 1, hwctx,
>> +				 hwctx->client->filp->client_id);
>>   	if (ret) {
>>   		XDNA_ERR(xdna, "DRM job init failed, ret %d", ret);
>>   		goto free_chain;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 4cec3a873995..1a77ba7036c9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -639,7 +639,7 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device
>> *adev,
>>   		goto err;
>>   	}
>>   
>> -	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job);
>> +	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job, 0);
>>   	if (ret)
>>   		goto err;
>>   
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 82df06a72ee0..5a231b997d65 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -293,7 +293,8 @@ static int amdgpu_cs_pass1(struct
>> amdgpu_cs_parser *p,
>>   
>>   	for (i = 0; i < p->gang_size; ++i) {
>>   		ret = amdgpu_job_alloc(p->adev, vm, p->entities[i],
>> vm,
>> -				       num_ibs[i], &p->jobs[i]);
>> +				       num_ibs[i], &p->jobs[i],
>> +				       p->filp->client_id);
>>   		if (ret)
>>   			goto free_all_kdata;
>>   		p->jobs[i]->enforce_isolation = p->adev-
>>> enforce_isolation[fpriv->xcp_id];
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index acb21fc8b3ce..75262ce8db27 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -204,7 +204,8 @@ static enum drm_gpu_sched_stat
>> amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   
>>   int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
>> *vm,
>>   		     struct drm_sched_entity *entity, void *owner,
>> -		     unsigned int num_ibs, struct amdgpu_job **job)
>> +		     unsigned int num_ibs, struct amdgpu_job **job,
>> +		     u64 drm_client_id)
>>   {
>>   	if (num_ibs == 0)
>>   		return -EINVAL;
>> @@ -222,7 +223,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev,
>> struct amdgpu_vm *vm,
>>   	if (!entity)
>>   		return 0;
>>   
>> -	return drm_sched_job_init(&(*job)->base, entity, 1, owner);
>> +	return drm_sched_job_init(&(*job)->base, entity, 1, owner,
>> +				  drm_client_id);
>>   }
>>   
>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>> @@ -232,7 +234,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device
>> *adev,
>>   {
>>   	int r;
>>   
>> -	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job);
>> +	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job, 0);
>>   	if (r)
>>   		return r;
>>   
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> index ce6b9ba967ff..5a8bc6342222 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> @@ -90,7 +90,8 @@ static inline struct amdgpu_ring
>> *amdgpu_job_ring(struct amdgpu_job *job)
>>   
>>   int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
>> *vm,
>>   		     struct drm_sched_entity *entity, void *owner,
>> -		     unsigned int num_ibs, struct amdgpu_job **job);
>> +		     unsigned int num_ibs, struct amdgpu_job **job,
>> +		     u64 drm_client_id);
>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>>   			     struct drm_sched_entity *entity, void
>> *owner,
>>   			     size_t size, enum amdgpu_ib_pool_type
>> pool_type,
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> index 3c0a5c3e0e3d..76c742328edb 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>> @@ -534,7 +534,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device
>> *dev, void *data,
>>   
>>   	ret = drm_sched_job_init(&submit->sched_job,
>>   				 &ctx->sched_entity[args->pipe],
>> -				 1, submit->ctx);
>> +				 1, submit->ctx, file->client_id);
>>   	if (ret)
>>   		goto err_submit_put;
>>   
>> diff --git a/drivers/gpu/drm/imagination/pvr_job.c
>> b/drivers/gpu/drm/imagination/pvr_job.c
>> index 59b334d094fa..7564b0f21b42 100644
>> --- a/drivers/gpu/drm/imagination/pvr_job.c
>> +++ b/drivers/gpu/drm/imagination/pvr_job.c
>> @@ -446,7 +446,7 @@ create_job(struct pvr_device *pvr_dev,
>>   	if (err)
>>   		goto err_put_job;
>>   
>> -	err = pvr_queue_job_init(job);
>> +	err = pvr_queue_job_init(job, pvr_file->file->client_id);
>>   	if (err)
>>   		goto err_put_job;
>>   
>> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c
>> b/drivers/gpu/drm/imagination/pvr_queue.c
>> index 5e9bc0992824..5a41ee79fed6 100644
>> --- a/drivers/gpu/drm/imagination/pvr_queue.c
>> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
>> @@ -1073,6 +1073,7 @@ static int pvr_queue_cleanup_fw_context(struct
>> pvr_queue *queue)
>>   /**
>>    * pvr_queue_job_init() - Initialize queue related fields in a
>> pvr_job object.
>>    * @job: The job to initialize.
>> + * @drm_client_id: drm_file.client_id submitting the job
>>    *
>>    * Bind the job to a queue and allocate memory to guarantee
>> pvr_queue_job_arm()
>>    * and pvr_queue_job_push() can't fail. We also make sure the
>> context type is
>> @@ -1082,7 +1083,7 @@ static int pvr_queue_cleanup_fw_context(struct
>> pvr_queue *queue)
>>    *  * 0 on success, or
>>    *  * An error code if something failed.
>>    */
>> -int pvr_queue_job_init(struct pvr_job *job)
>> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id)
>>   {
>>   	/* Fragment jobs need at least one native fence wait on the
>> geometry job fence. */
>>   	u32 min_native_dep_count = job->type ==
>> DRM_PVR_JOB_TYPE_FRAGMENT ? 1 : 0;
>> @@ -1099,7 +1100,7 @@ int pvr_queue_job_init(struct pvr_job *job)
>>   	if (!pvr_cccb_cmdseq_can_fit(&queue->cccb,
>> job_cmds_size(job, min_native_dep_count)))
>>   		return -E2BIG;
>>   
>> -	err = drm_sched_job_init(&job->base, &queue->entity, 1,
>> THIS_MODULE);
>> +	err = drm_sched_job_init(&job->base, &queue->entity, 1,
>> THIS_MODULE, drm_client_id);
>>   	if (err)
>>   		return err;
>>   
>> diff --git a/drivers/gpu/drm/imagination/pvr_queue.h
>> b/drivers/gpu/drm/imagination/pvr_queue.h
>> index 93fe9ac9f58c..fc1986d73fc8 100644
>> --- a/drivers/gpu/drm/imagination/pvr_queue.h
>> +++ b/drivers/gpu/drm/imagination/pvr_queue.h
>> @@ -143,7 +143,7 @@ struct pvr_queue {
>>   
>>   bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
>>   
>> -int pvr_queue_job_init(struct pvr_job *job);
>> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
>>   
>>   void pvr_queue_job_cleanup(struct pvr_job *job);
>>   
>> diff --git a/drivers/gpu/drm/lima/lima_gem.c
>> b/drivers/gpu/drm/lima/lima_gem.c
>> index 5deec673c11e..9722b847a539 100644
>> --- a/drivers/gpu/drm/lima/lima_gem.c
>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>> @@ -341,7 +341,7 @@ int lima_gem_submit(struct drm_file *file, struct
>> lima_submit *submit)
>>   
>>   	err = lima_sched_task_init(
>>   		submit->task, submit->ctx->context + submit->pipe,
>> -		bos, submit->nr_bos, vm);
>> +		bos, submit->nr_bos, vm, file->client_id);
>>   	if (err)
>>   		goto err_out1;
>>   
>> diff --git a/drivers/gpu/drm/lima/lima_sched.c
>> b/drivers/gpu/drm/lima/lima_sched.c
>> index 7934098e651b..954f4325b859 100644
>> --- a/drivers/gpu/drm/lima/lima_sched.c
>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>> @@ -113,7 +113,8 @@ static inline struct lima_sched_pipe
>> *to_lima_pipe(struct drm_gpu_scheduler *sch
>>   int lima_sched_task_init(struct lima_sched_task *task,
>>   			 struct lima_sched_context *context,
>>   			 struct lima_bo **bos, int num_bos,
>> -			 struct lima_vm *vm)
>> +			 struct lima_vm *vm,
>> +			 u64 drm_client_id)
>>   {
>>   	int err, i;
>>   
>> @@ -124,7 +125,8 @@ int lima_sched_task_init(struct lima_sched_task
>> *task,
>>   	for (i = 0; i < num_bos; i++)
>>   		drm_gem_object_get(&bos[i]->base.base);
>>   
>> -	err = drm_sched_job_init(&task->base, &context->base, 1,
>> vm);
>> +	err = drm_sched_job_init(&task->base, &context->base, 1, vm,
>> +				 drm_client_id);
>>   	if (err) {
>>   		kfree(task->bos);
>>   		return err;
>> diff --git a/drivers/gpu/drm/lima/lima_sched.h
>> b/drivers/gpu/drm/lima/lima_sched.h
>> index 85b23ba901d5..1a08faf8a529 100644
>> --- a/drivers/gpu/drm/lima/lima_sched.h
>> +++ b/drivers/gpu/drm/lima/lima_sched.h
>> @@ -88,7 +88,8 @@ struct lima_sched_pipe {
>>   int lima_sched_task_init(struct lima_sched_task *task,
>>   			 struct lima_sched_context *context,
>>   			 struct lima_bo **bos, int num_bos,
>> -			 struct lima_vm *vm);
>> +			 struct lima_vm *vm,
>> +			 u64 drm_client_id);
>>   void lima_sched_task_fini(struct lima_sched_task *task);
>>   
>>   int lima_sched_context_init(struct lima_sched_pipe *pipe,
>> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c
>> b/drivers/gpu/drm/msm/msm_gem_submit.c
>> index 3e9aa2cc38ef..d9be0fe3d674 100644
>> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
>> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
>> @@ -30,7 +30,7 @@
>>   static struct msm_gem_submit *submit_create(struct drm_device *dev,
>>   		struct msm_gpu *gpu,
>>   		struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
>> -		uint32_t nr_cmds)
>> +		uint32_t nr_cmds, u64 drm_client_id)
>>   {
>>   	static atomic_t ident = ATOMIC_INIT(0);
>>   	struct msm_gem_submit *submit;
>> @@ -54,7 +54,8 @@ static struct msm_gem_submit *submit_create(struct
>> drm_device *dev,
>>   		return ERR_PTR(ret);
>>   	}
>>   
>> -	ret = drm_sched_job_init(&submit->base, queue->entity, 1,
>> queue);
>> +	ret = drm_sched_job_init(&submit->base, queue->entity, 1,
>> queue,
>> +				 drm_client_id);
>>   	if (ret) {
>>   		kfree(submit->hw_fence);
>>   		kfree(submit);
>> @@ -693,7 +694,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev,
>> void *data,
>>   		}
>>   	}
>>   
>> -	submit = submit_create(dev, gpu, queue, args->nr_bos, args-
>>> nr_cmds);
>> +	submit = submit_create(dev, gpu, queue, args->nr_bos, args-
>>> nr_cmds,
>> +			       file->client_id);
>>   	if (IS_ERR(submit)) {
>>   		ret = PTR_ERR(submit);
>>   		goto out_post_unlock;
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c
>> b/drivers/gpu/drm/nouveau/nouveau_sched.c
>> index d326e55d2d24..460a5fb02412 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>> @@ -87,7 +87,8 @@ nouveau_job_init(struct nouveau_job *job,
>>   	}
>>   
>>   	ret = drm_sched_job_init(&job->base, &sched->entity,
>> -				 args->credits, NULL);
>> +				 args->credits, NULL,
>> +				 job->file_priv->client_id);
>>   	if (ret)
>>   		goto err_free_chains;
>>   
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
>> b/drivers/gpu/drm/panfrost/panfrost_drv.c
>> index b87f83e94eda..d5c2c6530ed8 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
>> @@ -312,7 +312,7 @@ static int panfrost_ioctl_submit(struct
>> drm_device *dev, void *data,
>>   
>>   	ret = drm_sched_job_init(&job->base,
>>   				 &file_priv->sched_entity[slot],
>> -				 1, NULL);
>> +				 1, NULL, file->client_id);
>>   	if (ret)
>>   		goto out_put_job;
>>   
>> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c
>> b/drivers/gpu/drm/panthor/panthor_drv.c
>> index 06fe46e32073..bd8e1900c919 100644
>> --- a/drivers/gpu/drm/panthor/panthor_drv.c
>> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
>> @@ -989,7 +989,8 @@ static int panthor_ioctl_group_submit(struct
>> drm_device *ddev, void *data,
>>   		const struct drm_panthor_queue_submit *qsubmit =
>> &jobs_args[i];
>>   		struct drm_sched_job *job;
>>   
>> -		job = panthor_job_create(pfile, args->group_handle,
>> qsubmit);
>> +		job = panthor_job_create(pfile, args->group_handle,
>> qsubmit,
>> +					 file->client_id);
>>   		if (IS_ERR(job)) {
>>   			ret = PTR_ERR(job);
>>   			goto out_cleanup_submit_ctx;
>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c
>> b/drivers/gpu/drm/panthor/panthor_mmu.c
>> index 12a02e28f50f..e0c79bd2d173 100644
>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>> @@ -2516,7 +2516,7 @@ panthor_vm_bind_job_create(struct drm_file
>> *file,
>>   	kref_init(&job->refcount);
>>   	job->vm = panthor_vm_get(vm);
>>   
>> -	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm);
>> +	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm,
>> file->client_id);
>>   	if (ret)
>>   		goto err_put_job;
>>   
>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c
>> b/drivers/gpu/drm/panthor/panthor_sched.c
>> index 446ec780eb4a..2af860c9068a 100644
>> --- a/drivers/gpu/drm/panthor/panthor_sched.c
>> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
>> @@ -3729,7 +3729,8 @@ struct panthor_vm *panthor_job_vm(struct
>> drm_sched_job *sched_job)
>>   struct drm_sched_job *
>>   panthor_job_create(struct panthor_file *pfile,
>>   		   u16 group_handle,
>> -		   const struct drm_panthor_queue_submit *qsubmit)
>> +		   const struct drm_panthor_queue_submit *qsubmit,
>> +		   u64 drm_client_id)
>>   {
>>   	struct panthor_group_pool *gpool = pfile->groups;
>>   	struct panthor_job *job;
>> @@ -3801,7 +3802,7 @@ panthor_job_create(struct panthor_file *pfile,
>>   
>>   	ret = drm_sched_job_init(&job->base,
>>   				 &job->group->queues[job-
>>> queue_idx]->entity,
>> -				 credits, job->group);
>> +				 credits, job->group,
>> drm_client_id);
>>   	if (ret)
>>   		goto err_put_job;
>>   
>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.h
>> b/drivers/gpu/drm/panthor/panthor_sched.h
>> index e650a445cf50..742b0b4ff3a3 100644
>> --- a/drivers/gpu/drm/panthor/panthor_sched.h
>> +++ b/drivers/gpu/drm/panthor/panthor_sched.h
>> @@ -29,7 +29,8 @@ int panthor_group_get_state(struct panthor_file
>> *pfile,
>>   struct drm_sched_job *
>>   panthor_job_create(struct panthor_file *pfile,
>>   		   u16 group_handle,
>> -		   const struct drm_panthor_queue_submit *qsubmit);
>> +		   const struct drm_panthor_queue_submit *qsubmit,
>> +		   u64 drm_client_id);
>>   struct drm_sched_job *panthor_job_get(struct drm_sched_job *job);
>>   struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job);
>>   void panthor_job_put(struct drm_sched_job *job);
>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
>> b/drivers/gpu/drm/scheduler/sched_fence.c
>> index e971528504a5..d208d384d38d 100644
>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>> @@ -206,7 +206,8 @@ struct drm_sched_fence *to_drm_sched_fence(struct
>> dma_fence *f)
>>   EXPORT_SYMBOL(to_drm_sched_fence);
>>   
>>   struct drm_sched_fence *drm_sched_fence_alloc(struct
>> drm_sched_entity *entity,
>> -					      void *owner)
>> +					      void *owner,
>> +					      u64 drm_client_id)
>>   {
>>   	struct drm_sched_fence *fence = NULL;
>>   
>> @@ -215,6 +216,7 @@ struct drm_sched_fence
>> *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>   		return NULL;
>>   
>>   	fence->owner = owner;
>> +	fence->drm_client_id = drm_client_id;
>>   	spin_lock_init(&fence->lock);
>>   
>>   	return fence;
>> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h
>> b/drivers/gpu/drm/scheduler/sched_internal.h
>> index 599cf6e1bb74..7ea5a6736f98 100644
>> --- a/drivers/gpu/drm/scheduler/sched_internal.h
>> +++ b/drivers/gpu/drm/scheduler/sched_internal.h
>> @@ -24,7 +24,7 @@ void drm_sched_entity_select_rq(struct
>> drm_sched_entity *entity);
>>   struct drm_sched_job *drm_sched_entity_pop_job(struct
>> drm_sched_entity *entity);
>>   
>>   struct drm_sched_fence *drm_sched_fence_alloc(struct
>> drm_sched_entity *s_entity,
>> -					      void *owner);
>> +					      void *owner, u64
>> drm_client_id);
>>   void drm_sched_fence_init(struct drm_sched_fence *fence,
>>   			  struct drm_sched_entity *entity);
>>   void drm_sched_fence_free(struct drm_sched_fence *fence);
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 829579c41c6b..60611618f3ab 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -764,6 +764,7 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>    * @credits: the number of credits this job contributes to the
>> schedulers
>>    * credit limit
>>    * @owner: job owner for debugging
>> + * @drm_client_id: drm_file.client_id of the owner
> 
> For the docu generation to link that properly it must be written as
> 
> &struct drm_file.client_id

Noted.

> 
> Besides, if this were an optional parameter, one should document it.
> I'm not sure if it is; I haven't used these client_ids before.

Passing an invalid client_id would only cause the trace events to print the invalid client_id.

Thanks,
Pierre-Eric


> 
> P.
> 
>>    *
>>    * Refer to drm_sched_entity_push_job() documentation
>>    * for locking considerations.
>> @@ -784,7 +785,8 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>    */
>>   int drm_sched_job_init(struct drm_sched_job *job,
>>   		       struct drm_sched_entity *entity,
>> -		       u32 credits, void *owner)
>> +		       u32 credits, void *owner,
>> +		       uint64_t drm_client_id)
>>   {
>>   	if (!entity->rq) {
>>   		/* This will most likely be followed by missing
>> frames
>> @@ -810,7 +812,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   
>>   	job->entity = entity;
>>   	job->credits = credits;
>> -	job->s_fence = drm_sched_fence_alloc(entity, owner);
>> +	job->s_fence = drm_sched_fence_alloc(entity, owner,
>> drm_client_id);
>>   	if (!job->s_fence)
>>   		return -ENOMEM;
>>   
>> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>> b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>> index f999c8859cf7..09ffbdb32d76 100644
>> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>> @@ -35,7 +35,7 @@ drm_mock_sched_entity_new(struct kunit *test,
>>   	ret = drm_sched_entity_init(&entity->base,
>>   				    priority,
>>   				    &drm_sched, 1,
>> -				    NULL);
>> +				    NULL, 1);
>>   	KUNIT_ASSERT_EQ(test, ret, 0);
>>   
>>   	entity->test = test;
>> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c
>> b/drivers/gpu/drm/v3d/v3d_submit.c
>> index 4ff5de46fb22..5171ffe9012d 100644
>> --- a/drivers/gpu/drm/v3d/v3d_submit.c
>> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
>> @@ -169,7 +169,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file
>> *file_priv,
>>   	job->file = file_priv;
>>   
>>   	ret = drm_sched_job_init(&job->base, &v3d_priv-
>>> sched_entity[queue],
>> -				 1, v3d_priv);
>> +				 1, v3d_priv, file_priv->client_id);
>>   	if (ret)
>>   		return ret;
>>   
>> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c
>> b/drivers/gpu/drm/xe/xe_sched_job.c
>> index 1905ca590965..f4679cb9a56b 100644
>> --- a/drivers/gpu/drm/xe/xe_sched_job.c
>> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
>> @@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct
>> xe_exec_queue *q,
>>   	kref_init(&job->refcount);
>>   	xe_exec_queue_get(job->q);
>>   
>> -	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
>> +	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
>> +				 q->xef->drm->client_id);
>>   	if (err)
>>   		goto err_free;
>>   
>> diff --git a/include/drm/gpu_scheduler.h
>> b/include/drm/gpu_scheduler.h
>> index 1a7e377d4cbb..6fe3b4c0cffb 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -305,6 +305,13 @@ struct drm_sched_fence {
>>            * @owner: job owner for debugging
>>            */
>>   	void				*owner;
>> +
>> +	/**
>> +	 * @drm_client_id:
>> +	 *
>> +	 * The client_id of the drm_file which owns the job.
>> +	 */
>> +	uint64_t			drm_client_id;
>>   };
>>   
>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>> @@ -629,7 +636,8 @@ drm_sched_pick_best(struct drm_gpu_scheduler
>> **sched_list,
>>   
>>   int drm_sched_job_init(struct drm_sched_job *job,
>>   		       struct drm_sched_entity *entity,
>> -		       u32 credits, void *owner);
>> +		       u32 credits, void *owner,
>> +		       u64 drm_client_id);
>>   void drm_sched_job_arm(struct drm_sched_job *job);
>>   void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>>   int drm_sched_job_add_dependency(struct drm_sched_job *job,


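For illustration, the calling pattern the hunks above repeat in each
driver — struct my_job and queue_job() are made up. Kernel-internal
submissions that have no drm_file (e.g. the amdgpu_amdkfd path above)
pass 0, which the trace events then simply print as client_id:0:

struct my_job {
	struct drm_sched_job base;
	/* driver-specific fields */
};

static int queue_job(struct my_job *job, struct drm_sched_entity *entity,
		     struct drm_file *file)
{
	/* file may be NULL for kernel-internal jobs */
	return drm_sched_job_init(&job->base, entity, 1, job,
				  file ? file->client_id : 0);
}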

* Re: [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI
  2025-05-14 12:53   ` Philipp Stanner
@ 2025-05-16  7:56     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-05-16  7:56 UTC (permalink / raw)
  To: phasta, Pierre-Eric Pelloux-Prayer, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, Matthew Brost, Danilo Krummrich,
	Christian König, Sumit Semwal
  Cc: Lucas Stach, Maíra Canal, Christian König, dri-devel,
	linux-doc, linux-kernel, linux-media, linaro-mm-sig

Hi,

On 14/05/2025 at 14:53, Philipp Stanner wrote:
> On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
>> This commit adds a document section in drm-uapi.rst about
>> tracepoints,
>> and mark the events gpu_scheduler_trace.h as stable uAPI.
>>
>> The goal is to explicitly state that tools can rely on the fields,
>> formats and semantics of these events.
>>
>> Acked-by: Lucas Stach <l.stach@pengutronix.de>
>> Acked-by: Maíra Canal <mcanal@igalia.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>> <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   Documentation/gpu/drm-uapi.rst                | 19
>> +++++++++++++++++++
>>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 19
>> +++++++++++++++++++
>>   2 files changed, 38 insertions(+)
>>
>> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-
>> uapi.rst
>> index 69f72e71a96e..4863a4deb0ee 100644
>> --- a/Documentation/gpu/drm-uapi.rst
>> +++ b/Documentation/gpu/drm-uapi.rst
>> @@ -693,3 +693,22 @@ dma-buf interoperability
>>   
>>   Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst
>> for
>>   information on how dma-buf is integrated and exposed within DRM.
>> +
>> +
>> +Trace events
>> +============
>> +
>> +See Documentation/trace/tracepoints.rst for information about using
>> +Linux Kernel Tracepoints.
>> +In the DRM subsystem, some events are considered stable uAPI to
>> avoid
>> +breaking tools (e.g.: GPUVis, umr) relying on them. Stable means
>> that fields
>> +cannot be removed, nor their formatting updated. Adding new fields
>> is
>> +possible, under the normal uAPI requirements.
>> +
>> +Stable uAPI events
>> +------------------
>> +
>> +From ``drivers/gpu/drm/scheduler/gpu_scheduler_trace.h``
>> +
>> +.. kernel-doc::  drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> +   :doc: uAPI trace events
>> \ No newline at end of file
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> index 781b20349389..7e840d08ef39 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> @@ -32,6 +32,25 @@
>>   #define TRACE_SYSTEM gpu_scheduler
>>   #define TRACE_INCLUDE_FILE gpu_scheduler_trace
>>   
>> +/**
>> + * DOC: uAPI trace events
>> + *
>> + * ``drm_sched_job_queue``, ``drm_sched_job_run``,
>> ``drm_sched_job_add_dep``,
>> + * ``drm_sched_job_done`` and ``drm_sched_job_unschedulable`` are
>> considered
>> + * stable uAPI.
>> + *
>> + * Common trace events attributes:
>> + *
>> + * * ``dev``   - the dev_name() of the device running the job.
>> + *
>> + * * ``ring``  - the hardware ring running the job. Together with
>> ``dev`` it
>> + *   uniquely identifies where the job is going to be executed.
>> + *
>> + * * ``fence`` - the &dma_fence.context and the &dma_fence.seqno of
>> + *   &drm_sched_fence.finished
>> + *
>> + */
> 
> For my understanding, why do you use the double apostrophes here?

To get similar formatting to function arguments and make the output a bit nicer to read.

> 
> Also, the linking for the docu afair here two requires you to write
> 
> &struct dma_fence.seqno
> 
> If I am not mistaken
> 
> https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#highlights-and-cross-references

Indeed, thanks. I fixed this.

Pierre-Eric

> 
> 
> P.
> 
>> +
>>   DECLARE_EVENT_CLASS(drm_sched_job,
>>   	    TP_PROTO(struct drm_sched_job *sched_job, struct
>> drm_sched_entity *entity),
>>   	    TP_ARGS(sched_job, entity),


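For reference, the two markup styles discussed above in a minimal
kernel-doc sketch — the DOC: name is made up; see the doc-guide link
quoted earlier for the authoritative syntax:

/**
 * DOC: markup example
 *
 * ``dev`` renders as fixed-width text with no cross-reference, while
 * &struct dma_fence links to the structure and &struct dma_fence.seqno
 * links to one of its members.
 */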

* Re: [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence
  2025-05-15  6:53     ` Pierre-Eric Pelloux-Prayer
@ 2025-05-19 11:02       ` Pierre-Eric Pelloux-Prayer
  2025-05-19 11:59         ` Philipp Stanner
  0 siblings, 1 reply; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-05-19 11:02 UTC (permalink / raw)
  To: phasta, Matthew Brost, Danilo Krummrich, Christian König
  Cc: dri-devel, linux-kernel, amd-gfx, etnaviv, lima, linux-arm-msm,
	freedreno, nouveau, intel-xe, Pierre-Eric Pelloux-Prayer, Min Ma,
	Lizhi Hou, Oded Gabbay, Felix Kuehling, Alex Deucher,
	Christian König, David Airlie, Simona Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Frank Binns, Matt Coster,
	Thomas Zimmermann, Qiang Yu, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, Marijn Suijten, Lyude Paul,
	Boris Brezillon, Rob Herring, Steven Price, Liviu Dudau,
	Melissa Wen, Maíra Canal, Lucas De Marchi,
	Thomas Hellström, Rodrigo Vivi, Maxime Ripard,
	Maarten Lankhorst



On 15/05/2025 at 08:53, Pierre-Eric Pelloux-Prayer wrote:
> Hi,
> 
> On 14/05/2025 at 14:44, Philipp Stanner wrote:
>> On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer wrote:
>>> This will be used in a later commit to trace the drm client_id in
>>> some of the gpu_scheduler trace events.
>>>
>>> This requires changing all the users of drm_sched_job_init to
>>> add an extra parameter.
>>>
>>> The newly added drm_client_id field in the drm_sched_fence is a bit
>>> of a duplicate of the owner one. One suggestion I received was to
>>> merge those 2 fields - this can't be done right now as amdgpu uses
>>> some special values (AMDGPU_FENCE_OWNER_*) that can't really be
>>> translated into a client id. Christian is working on getting rid of
>>> those; when it's done we should be able to squash owner/drm_client_id
>>> together.
>>>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>>> <pierre-eric.pelloux-prayer@amd.com>
>>> ---
>>>   drivers/accel/amdxdna/aie2_ctx.c                 |  3 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c       |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  3 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c          |  8 +++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h          |  3 ++-
>>>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c     |  2 +-
>>>   drivers/gpu/drm/imagination/pvr_job.c            |  2 +-
>>>   drivers/gpu/drm/imagination/pvr_queue.c          |  5 +++--
>>>   drivers/gpu/drm/imagination/pvr_queue.h          |  2 +-
>>>   drivers/gpu/drm/lima/lima_gem.c                  |  2 +-
>>>   drivers/gpu/drm/lima/lima_sched.c                |  6 ++++--
>>>   drivers/gpu/drm/lima/lima_sched.h                |  3 ++-
>>>   drivers/gpu/drm/msm/msm_gem_submit.c             |  8 +++++---
>>>   drivers/gpu/drm/nouveau/nouveau_sched.c          |  3 ++-
>>>   drivers/gpu/drm/panfrost/panfrost_drv.c          |  2 +-
>>>   drivers/gpu/drm/panthor/panthor_drv.c            |  3 ++-
>>>   drivers/gpu/drm/panthor/panthor_mmu.c            |  2 +-
>>>   drivers/gpu/drm/panthor/panthor_sched.c          |  5 +++--
>>>   drivers/gpu/drm/panthor/panthor_sched.h          |  3 ++-
>>>   drivers/gpu/drm/scheduler/sched_fence.c          |  4 +++-
>>>   drivers/gpu/drm/scheduler/sched_internal.h       |  2 +-
>>>   drivers/gpu/drm/scheduler/sched_main.c           |  6 ++++--
>>>   drivers/gpu/drm/scheduler/tests/mock_scheduler.c |  2 +-
>>>   drivers/gpu/drm/v3d/v3d_submit.c                 |  2 +-
>>>   drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
>>>   include/drm/gpu_scheduler.h                      | 10 +++++++++-
>>>   26 files changed, 62 insertions(+), 34 deletions(-)
>>
>> I think I asked last time what your merge plan for this is, since
>> it touches so many drivers. Should I take that?
> 
> Based on:
> 
> https://drm.pages.freedesktop.org/maintainer-tools/committer/committer-drm-misc.html
> 
> "drm-misc is for drm core (non-driver) patches, subsystem-wide refactorings,
> and small trivial patches all over (including drivers)."
> 
> I assume it should go through drm-misc.

I've addressed your comments and pushed an updated branch to 
https://gitlab.freedesktop.org/pepp/linux/-/commits/improve_gpu_scheduler_trace_v10

Any chance to get this merged soon?

Thanks,
Pierre-Eric



> 
> 
>>
>> Besides one comment below, scheduler bits look fine.
>>
>>>
>>> diff --git a/drivers/accel/amdxdna/aie2_ctx.c
>>> b/drivers/accel/amdxdna/aie2_ctx.c
>>> index e04549f64d69..3e38a5f637ea 100644
>>> --- a/drivers/accel/amdxdna/aie2_ctx.c
>>> +++ b/drivers/accel/amdxdna/aie2_ctx.c
>>> @@ -848,7 +848,8 @@ int aie2_cmd_submit(struct amdxdna_hwctx *hwctx,
>>> struct amdxdna_sched_job *job,
>>>           goto up_sem;
>>>       }
>>> -    ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
>>> 1, hwctx);
>>> +    ret = drm_sched_job_init(&job->base, &hwctx->priv->entity,
>>> 1, hwctx,
>>> +                 hwctx->client->filp->client_id);
>>>       if (ret) {
>>>           XDNA_ERR(xdna, "DRM job init failed, ret %d", ret);
>>>           goto free_chain;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 4cec3a873995..1a77ba7036c9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -639,7 +639,7 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device
>>> *adev,
>>>           goto err;
>>>       }
>>> -    ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job);
>>> +    ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job, 0);
>>>       if (ret)
>>>           goto err;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 82df06a72ee0..5a231b997d65 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -293,7 +293,8 @@ static int amdgpu_cs_pass1(struct
>>> amdgpu_cs_parser *p,
>>>       for (i = 0; i < p->gang_size; ++i) {
>>>           ret = amdgpu_job_alloc(p->adev, vm, p->entities[i],
>>> vm,
>>> -                       num_ibs[i], &p->jobs[i]);
>>> +                       num_ibs[i], &p->jobs[i],
>>> +                       p->filp->client_id);
>>>           if (ret)
>>>               goto free_all_kdata;
>>>           p->jobs[i]->enforce_isolation = p->adev-
>>>> enforce_isolation[fpriv->xcp_id];
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index acb21fc8b3ce..75262ce8db27 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -204,7 +204,8 @@ static enum drm_gpu_sched_stat
>>> amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>   int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
>>> *vm,
>>>                struct drm_sched_entity *entity, void *owner,
>>> -             unsigned int num_ibs, struct amdgpu_job **job)
>>> +             unsigned int num_ibs, struct amdgpu_job **job,
>>> +             u64 drm_client_id)
>>>   {
>>>       if (num_ibs == 0)
>>>           return -EINVAL;
>>> @@ -222,7 +223,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev,
>>> struct amdgpu_vm *vm,
>>>       if (!entity)
>>>           return 0;
>>> -    return drm_sched_job_init(&(*job)->base, entity, 1, owner);
>>> +    return drm_sched_job_init(&(*job)->base, entity, 1, owner,
>>> +                  drm_client_id);
>>>   }
>>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>>> @@ -232,7 +234,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device
>>> *adev,
>>>   {
>>>       int r;
>>> -    r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job);
>>> +    r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job, 0);
>>>       if (r)
>>>           return r;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>> index ce6b9ba967ff..5a8bc6342222 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>> @@ -90,7 +90,8 @@ static inline struct amdgpu_ring
>>> *amdgpu_job_ring(struct amdgpu_job *job)
>>>   int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm
>>> *vm,
>>>                struct drm_sched_entity *entity, void *owner,
>>> -             unsigned int num_ibs, struct amdgpu_job **job);
>>> +             unsigned int num_ibs, struct amdgpu_job **job,
>>> +             u64 drm_client_id);
>>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>>>                    struct drm_sched_entity *entity, void
>>> *owner,
>>>                    size_t size, enum amdgpu_ib_pool_type
>>> pool_type,
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> index 3c0a5c3e0e3d..76c742328edb 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> @@ -534,7 +534,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device
>>> *dev, void *data,
>>>       ret = drm_sched_job_init(&submit->sched_job,
>>>                    &ctx->sched_entity[args->pipe],
>>> -                 1, submit->ctx);
>>> +                 1, submit->ctx, file->client_id);
>>>       if (ret)
>>>           goto err_submit_put;
>>> diff --git a/drivers/gpu/drm/imagination/pvr_job.c
>>> b/drivers/gpu/drm/imagination/pvr_job.c
>>> index 59b334d094fa..7564b0f21b42 100644
>>> --- a/drivers/gpu/drm/imagination/pvr_job.c
>>> +++ b/drivers/gpu/drm/imagination/pvr_job.c
>>> @@ -446,7 +446,7 @@ create_job(struct pvr_device *pvr_dev,
>>>       if (err)
>>>           goto err_put_job;
>>> -    err = pvr_queue_job_init(job);
>>> +    err = pvr_queue_job_init(job, pvr_file->file->client_id);
>>>       if (err)
>>>           goto err_put_job;
>>> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c
>>> b/drivers/gpu/drm/imagination/pvr_queue.c
>>> index 5e9bc0992824..5a41ee79fed6 100644
>>> --- a/drivers/gpu/drm/imagination/pvr_queue.c
>>> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
>>> @@ -1073,6 +1073,7 @@ static int pvr_queue_cleanup_fw_context(struct
>>> pvr_queue *queue)
>>>   /**
>>>    * pvr_queue_job_init() - Initialize queue related fields in a
>>> pvr_job object.
>>>    * @job: The job to initialize.
>>> + * @drm_client_id: drm_file.client_id submitting the job
>>>    *
>>>    * Bind the job to a queue and allocate memory to guarantee
>>> pvr_queue_job_arm()
>>>    * and pvr_queue_job_push() can't fail. We also make sure the
>>> context type is
>>> @@ -1082,7 +1083,7 @@ static int pvr_queue_cleanup_fw_context(struct
>>> pvr_queue *queue)
>>>    *  * 0 on success, or
>>>    *  * An error code if something failed.
>>>    */
>>> -int pvr_queue_job_init(struct pvr_job *job)
>>> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id)
>>>   {
>>>       /* Fragment jobs need at least one native fence wait on the
>>> geometry job fence. */
>>>       u32 min_native_dep_count = job->type ==
>>> DRM_PVR_JOB_TYPE_FRAGMENT ? 1 : 0;
>>> @@ -1099,7 +1100,7 @@ int pvr_queue_job_init(struct pvr_job *job)
>>>       if (!pvr_cccb_cmdseq_can_fit(&queue->cccb,
>>> job_cmds_size(job, min_native_dep_count)))
>>>           return -E2BIG;
>>> -    err = drm_sched_job_init(&job->base, &queue->entity, 1,
>>> THIS_MODULE);
>>> +    err = drm_sched_job_init(&job->base, &queue->entity, 1,
>>> THIS_MODULE, drm_client_id);
>>>       if (err)
>>>           return err;
>>> diff --git a/drivers/gpu/drm/imagination/pvr_queue.h
>>> b/drivers/gpu/drm/imagination/pvr_queue.h
>>> index 93fe9ac9f58c..fc1986d73fc8 100644
>>> --- a/drivers/gpu/drm/imagination/pvr_queue.h
>>> +++ b/drivers/gpu/drm/imagination/pvr_queue.h
>>> @@ -143,7 +143,7 @@ struct pvr_queue {
>>>   bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
>>> -int pvr_queue_job_init(struct pvr_job *job);
>>> +int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
>>>   void pvr_queue_job_cleanup(struct pvr_job *job);
>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c
>>> b/drivers/gpu/drm/lima/lima_gem.c
>>> index 5deec673c11e..9722b847a539 100644
>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>> @@ -341,7 +341,7 @@ int lima_gem_submit(struct drm_file *file, struct
>>> lima_submit *submit)
>>>       err = lima_sched_task_init(
>>>           submit->task, submit->ctx->context + submit->pipe,
>>> -        bos, submit->nr_bos, vm);
>>> +        bos, submit->nr_bos, vm, file->client_id);
>>>       if (err)
>>>           goto err_out1;
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c
>>> b/drivers/gpu/drm/lima/lima_sched.c
>>> index 7934098e651b..954f4325b859 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -113,7 +113,8 @@ static inline struct lima_sched_pipe
>>> *to_lima_pipe(struct drm_gpu_scheduler *sch
>>>   int lima_sched_task_init(struct lima_sched_task *task,
>>>                struct lima_sched_context *context,
>>>                struct lima_bo **bos, int num_bos,
>>> -             struct lima_vm *vm)
>>> +             struct lima_vm *vm,
>>> +             u64 drm_client_id)
>>>   {
>>>       int err, i;
>>> @@ -124,7 +125,8 @@ int lima_sched_task_init(struct lima_sched_task
>>> *task,
>>>       for (i = 0; i < num_bos; i++)
>>>           drm_gem_object_get(&bos[i]->base.base);
>>> -    err = drm_sched_job_init(&task->base, &context->base, 1,
>>> vm);
>>> +    err = drm_sched_job_init(&task->base, &context->base, 1, vm,
>>> +                 drm_client_id);
>>>       if (err) {
>>>           kfree(task->bos);
>>>           return err;
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.h
>>> b/drivers/gpu/drm/lima/lima_sched.h
>>> index 85b23ba901d5..1a08faf8a529 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.h
>>> +++ b/drivers/gpu/drm/lima/lima_sched.h
>>> @@ -88,7 +88,8 @@ struct lima_sched_pipe {
>>>   int lima_sched_task_init(struct lima_sched_task *task,
>>>                struct lima_sched_context *context,
>>>                struct lima_bo **bos, int num_bos,
>>> -             struct lima_vm *vm);
>>> +             struct lima_vm *vm,
>>> +             u64 drm_client_id);
>>>   void lima_sched_task_fini(struct lima_sched_task *task);
>>>   int lima_sched_context_init(struct lima_sched_pipe *pipe,
>>> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c
>>> b/drivers/gpu/drm/msm/msm_gem_submit.c
>>> index 3e9aa2cc38ef..d9be0fe3d674 100644
>>> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
>>> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
>>> @@ -30,7 +30,7 @@
>>>   static struct msm_gem_submit *submit_create(struct drm_device *dev,
>>>           struct msm_gpu *gpu,
>>>           struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
>>> -        uint32_t nr_cmds)
>>> +        uint32_t nr_cmds, u64 drm_client_id)
>>>   {
>>>       static atomic_t ident = ATOMIC_INIT(0);
>>>       struct msm_gem_submit *submit;
>>> @@ -54,7 +54,8 @@ static struct msm_gem_submit *submit_create(struct
>>> drm_device *dev,
>>>           return ERR_PTR(ret);
>>>       }
>>> -    ret = drm_sched_job_init(&submit->base, queue->entity, 1,
>>> queue);
>>> +    ret = drm_sched_job_init(&submit->base, queue->entity, 1,
>>> queue,
>>> +                 drm_client_id);
>>>       if (ret) {
>>>           kfree(submit->hw_fence);
>>>           kfree(submit);
>>> @@ -693,7 +694,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev,
>>> void *data,
>>>           }
>>>       }
>>> -    submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
>>> +    submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds,
>>> +                   file->client_id);
>>>       if (IS_ERR(submit)) {
>>>           ret = PTR_ERR(submit);
>>>           goto out_post_unlock;
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> index d326e55d2d24..460a5fb02412 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> @@ -87,7 +87,8 @@ nouveau_job_init(struct nouveau_job *job,
>>>       }
>>>       ret = drm_sched_job_init(&job->base, &sched->entity,
>>> -                 args->credits, NULL);
>>> +                 args->credits, NULL,
>>> +                 job->file_priv->client_id);
>>>       if (ret)
>>>           goto err_free_chains;
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
>>> b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>> index b87f83e94eda..d5c2c6530ed8 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>> @@ -312,7 +312,7 @@ static int panfrost_ioctl_submit(struct
>>> drm_device *dev, void *data,
>>>       ret = drm_sched_job_init(&job->base,
>>>                    &file_priv->sched_entity[slot],
>>> -                 1, NULL);
>>> +                 1, NULL, file->client_id);
>>>       if (ret)
>>>           goto out_put_job;
>>> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c
>>> b/drivers/gpu/drm/panthor/panthor_drv.c
>>> index 06fe46e32073..bd8e1900c919 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_drv.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
>>> @@ -989,7 +989,8 @@ static int panthor_ioctl_group_submit(struct
>>> drm_device *ddev, void *data,
>>>           const struct drm_panthor_queue_submit *qsubmit =
>>> &jobs_args[i];
>>>           struct drm_sched_job *job;
>>> -        job = panthor_job_create(pfile, args->group_handle,
>>> qsubmit);
>>> +        job = panthor_job_create(pfile, args->group_handle,
>>> qsubmit,
>>> +                     file->client_id);
>>>           if (IS_ERR(job)) {
>>>               ret = PTR_ERR(job);
>>>               goto out_cleanup_submit_ctx;
>>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c
>>> b/drivers/gpu/drm/panthor/panthor_mmu.c
>>> index 12a02e28f50f..e0c79bd2d173 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>>> @@ -2516,7 +2516,7 @@ panthor_vm_bind_job_create(struct drm_file
>>> *file,
>>>       kref_init(&job->refcount);
>>>       job->vm = panthor_vm_get(vm);
>>> -    ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm);
>>> +    ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm,
>>> file->client_id);
>>>       if (ret)
>>>           goto err_put_job;
>>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c
>>> b/drivers/gpu/drm/panthor/panthor_sched.c
>>> index 446ec780eb4a..2af860c9068a 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_sched.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
>>> @@ -3729,7 +3729,8 @@ struct panthor_vm *panthor_job_vm(struct
>>> drm_sched_job *sched_job)
>>>   struct drm_sched_job *
>>>   panthor_job_create(struct panthor_file *pfile,
>>>              u16 group_handle,
>>> -           const struct drm_panthor_queue_submit *qsubmit)
>>> +           const struct drm_panthor_queue_submit *qsubmit,
>>> +           u64 drm_client_id)
>>>   {
>>>       struct panthor_group_pool *gpool = pfile->groups;
>>>       struct panthor_job *job;
>>> @@ -3801,7 +3802,7 @@ panthor_job_create(struct panthor_file *pfile,
>>>       ret = drm_sched_job_init(&job->base,
>>>                    &job->group->queues[job->queue_idx]->entity,
>>> -                 credits, job->group);
>>> +                 credits, job->group, drm_client_id);
>>>       if (ret)
>>>           goto err_put_job;
>>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.h
>>> b/drivers/gpu/drm/panthor/panthor_sched.h
>>> index e650a445cf50..742b0b4ff3a3 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_sched.h
>>> +++ b/drivers/gpu/drm/panthor/panthor_sched.h
>>> @@ -29,7 +29,8 @@ int panthor_group_get_state(struct panthor_file
>>> *pfile,
>>>   struct drm_sched_job *
>>>   panthor_job_create(struct panthor_file *pfile,
>>>              u16 group_handle,
>>> -           const struct drm_panthor_queue_submit *qsubmit);
>>> +           const struct drm_panthor_queue_submit *qsubmit,
>>> +           u64 drm_client_id);
>>>   struct drm_sched_job *panthor_job_get(struct drm_sched_job *job);
>>>   struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job);
>>>   void panthor_job_put(struct drm_sched_job *job);
>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
>>> b/drivers/gpu/drm/scheduler/sched_fence.c
>>> index e971528504a5..d208d384d38d 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>> @@ -206,7 +206,8 @@ struct drm_sched_fence *to_drm_sched_fence(struct
>>> dma_fence *f)
>>>   EXPORT_SYMBOL(to_drm_sched_fence);
>>>   struct drm_sched_fence *drm_sched_fence_alloc(struct
>>> drm_sched_entity *entity,
>>> -                          void *owner)
>>> +                          void *owner,
>>> +                          u64 drm_client_id)
>>>   {
>>>       struct drm_sched_fence *fence = NULL;
>>> @@ -215,6 +216,7 @@ struct drm_sched_fence
>>> *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>           return NULL;
>>>       fence->owner = owner;
>>> +    fence->drm_client_id = drm_client_id;
>>>       spin_lock_init(&fence->lock);
>>>       return fence;
>>> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h
>>> b/drivers/gpu/drm/scheduler/sched_internal.h
>>> index 599cf6e1bb74..7ea5a6736f98 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_internal.h
>>> +++ b/drivers/gpu/drm/scheduler/sched_internal.h
>>> @@ -24,7 +24,7 @@ void drm_sched_entity_select_rq(struct
>>> drm_sched_entity *entity);
>>>   struct drm_sched_job *drm_sched_entity_pop_job(struct
>>> drm_sched_entity *entity);
>>>   struct drm_sched_fence *drm_sched_fence_alloc(struct
>>> drm_sched_entity *s_entity,
>>> -                          void *owner);
>>> +                          void *owner, u64
>>> drm_client_id);
>>>   void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>                 struct drm_sched_entity *entity);
>>>   void drm_sched_fence_free(struct drm_sched_fence *fence);
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 829579c41c6b..60611618f3ab 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -764,6 +764,7 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>>    * @credits: the number of credits this job contributes to the
>>> schedulers
>>>    * credit limit
>>>    * @owner: job owner for debugging
>>> + * @drm_client_id: drm_file.client_id of the owner
>>
>> For the docu generation to link that properly it must be written as
>>
>> &struct drm_file.client_id
> 
> Noted.
> 
>>
>> Besides, if this were an optional parameter, one should document it.
>> I'm not sure if it is, I haven't used these client_id's before.
> 
> Passing an invalid client_id would only cause the trace events to print the invalid client_id.
> 
> Thanks,
> Pierre-Eric
> 
> 
>>
>> P.
>>
>>>    *
>>>    * Refer to drm_sched_entity_push_job() documentation
>>>    * for locking considerations.
>>> @@ -784,7 +785,8 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>>    */
>>>   int drm_sched_job_init(struct drm_sched_job *job,
>>>                  struct drm_sched_entity *entity,
>>> -               u32 credits, void *owner)
>>> +               u32 credits, void *owner,
>>> +               uint64_t drm_client_id)
>>>   {
>>>       if (!entity->rq) {
>>>           /* This will most likely be followed by missing
>>> frames
>>> @@ -810,7 +812,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>       job->entity = entity;
>>>       job->credits = credits;
>>> -    job->s_fence = drm_sched_fence_alloc(entity, owner);
>>> +    job->s_fence = drm_sched_fence_alloc(entity, owner,
>>> drm_client_id);
>>>       if (!job->s_fence)
>>>           return -ENOMEM;
>>> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> index f999c8859cf7..09ffbdb32d76 100644
>>> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> @@ -35,7 +35,7 @@ drm_mock_sched_entity_new(struct kunit *test,
>>>       ret = drm_sched_entity_init(&entity->base,
>>>                       priority,
>>>                       &drm_sched, 1,
>>> -                    NULL);
>>> +                    NULL, 1);
>>>       KUNIT_ASSERT_EQ(test, ret, 0);
>>>       entity->test = test;
>>> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c
>>> b/drivers/gpu/drm/v3d/v3d_submit.c
>>> index 4ff5de46fb22..5171ffe9012d 100644
>>> --- a/drivers/gpu/drm/v3d/v3d_submit.c
>>> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
>>> @@ -169,7 +169,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file
>>> *file_priv,
>>>       job->file = file_priv;
>>>       ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
>>> -                 1, v3d_priv);
>>> +                 1, v3d_priv, file_priv->client_id);
>>>       if (ret)
>>>           return ret;
>>> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c
>>> b/drivers/gpu/drm/xe/xe_sched_job.c
>>> index 1905ca590965..f4679cb9a56b 100644
>>> --- a/drivers/gpu/drm/xe/xe_sched_job.c
>>> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
>>> @@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct
>>> xe_exec_queue *q,
>>>       kref_init(&job->refcount);
>>>       xe_exec_queue_get(job->q);
>>> -    err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
>>> +    err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
>>> +                 q->xef->drm->client_id);
>>>       if (err)
>>>           goto err_free;
>>> diff --git a/include/drm/gpu_scheduler.h
>>> b/include/drm/gpu_scheduler.h
>>> index 1a7e377d4cbb..6fe3b4c0cffb 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -305,6 +305,13 @@ struct drm_sched_fence {
>>>            * @owner: job owner for debugging
>>>            */
>>>       void                *owner;
>>> +
>>> +    /**
>>> +     * @drm_client_id:
>>> +     *
>>> +     * The client_id of the drm_file which owns the job.
>>> +     */
>>> +    uint64_t            drm_client_id;
>>>   };
>>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>> @@ -629,7 +636,8 @@ drm_sched_pick_best(struct drm_gpu_scheduler
>>> **sched_list,
>>>   int drm_sched_job_init(struct drm_sched_job *job,
>>>                  struct drm_sched_entity *entity,
>>> -               u32 credits, void *owner);
>>> +               u32 credits, void *owner,
>>> +               u64 drm_client_id);
>>>   void drm_sched_job_arm(struct drm_sched_job *job);
>>>   void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>>>   int drm_sched_job_add_dependency(struct drm_sched_job *job,


^ permalink raw reply	[flat|nested] 24+ messages in thread
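
Taken together, the two review points above suggest kernel-doc along these
lines for the next revision -- a sketch only, combining Philipp's &struct
cross-reference syntax with Pierre-Eric's note that a bogus id is merely
printed verbatim by the trace events (the exact wording here is
illustrative, not taken from v10):

/**
 * drm_sched_job_init - init a scheduler job
 * @job: scheduler job to init
 * @entity: scheduler entity to use
 * @credits: the number of credits this job contributes to the schedulers
 * credit limit
 * @owner: job owner for debugging
 * @drm_client_id: &struct drm_file.client_id of the owner, or 0 for
 * kernel-internal submissions that have no drm_file; an invalid value
 * only ends up printed as-is by the trace events.
 */
int drm_sched_job_init(struct drm_sched_job *job,
		       struct drm_sched_entity *entity,
		       u32 credits, void *owner,
		       u64 drm_client_id);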

* Re: [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence
  2025-05-19 11:02       ` Pierre-Eric Pelloux-Prayer
@ 2025-05-19 11:59         ` Philipp Stanner
  0 siblings, 0 replies; 24+ messages in thread
From: Philipp Stanner @ 2025-05-19 11:59 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, phasta, Matthew Brost,
	Danilo Krummrich, Christian König
  Cc: dri-devel, linux-kernel, amd-gfx, etnaviv, lima, linux-arm-msm,
	freedreno, nouveau, intel-xe, Pierre-Eric Pelloux-Prayer, Min Ma,
	Lizhi Hou, Oded Gabbay, Felix Kuehling, Alex Deucher,
	Christian König, David Airlie, Simona Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Frank Binns, Matt Coster,
	Thomas Zimmermann, Qiang Yu, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, Marijn Suijten, Lyude Paul,
	Boris Brezillon, Rob Herring, Steven Price, Liviu Dudau,
	Melissa Wen, Maíra Canal, Lucas De Marchi,
	Thomas Hellström, Rodrigo Vivi, Maxime Ripard,
	Maarten Lankhorst

On Mon, 2025-05-19 at 13:02 +0200, Pierre-Eric Pelloux-Prayer wrote:
> 
> 
> On 15/05/2025 at 08:53, Pierre-Eric Pelloux-Prayer wrote:
> > Hi,
> > 
> > On 14/05/2025 at 14:44, Philipp Stanner wrote:
> > > On Thu, 2025-04-24 at 10:38 +0200, Pierre-Eric Pelloux-Prayer
> > > wrote:
> > > > This will be used in a later commit to trace the drm client_id in
> > > > some of the gpu_scheduler trace events.
> > > > 
> > > > This requires changing all the users of drm_sched_job_init to
> > > > add an extra parameter.
> > > > 
> > > > The newly added drm_client_id field in the drm_sched_fence is a bit
> > > > of a duplicate of the owner one. One suggestion I received was to
> > > > merge those 2 fields - this can't be done right now as amdgpu uses
> > > > some special values (AMDGPU_FENCE_OWNER_*) that can't really be
> > > > translated into a client id. Christian is working on getting rid of
> > > > those; when it's done we should be able to squash owner/drm_client_id
> > > > together.
> > > > 
> > > > Reviewed-by: Christian König <christian.koenig@amd.com>
> > > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> > > > ---
> > > >   drivers/accel/amdxdna/aie2_ctx.c                 |  3 ++-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c       |  2 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  3 ++-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c          |  8 +++++---
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h          |  3 ++-
> > > >   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c     |  2 +-
> > > >   drivers/gpu/drm/imagination/pvr_job.c            |  2 +-
> > > >   drivers/gpu/drm/imagination/pvr_queue.c          |  5 +++--
> > > >   drivers/gpu/drm/imagination/pvr_queue.h          |  2 +-
> > > >   drivers/gpu/drm/lima/lima_gem.c                  |  2 +-
> > > >   drivers/gpu/drm/lima/lima_sched.c                |  6 ++++--
> > > >   drivers/gpu/drm/lima/lima_sched.h                |  3 ++-
> > > >   drivers/gpu/drm/msm/msm_gem_submit.c             |  8 +++++---
> > > >   drivers/gpu/drm/nouveau/nouveau_sched.c          |  3 ++-
> > > >   drivers/gpu/drm/panfrost/panfrost_drv.c          |  2 +-
> > > >   drivers/gpu/drm/panthor/panthor_drv.c            |  3 ++-
> > > >   drivers/gpu/drm/panthor/panthor_mmu.c            |  2 +-
> > > >   drivers/gpu/drm/panthor/panthor_sched.c          |  5 +++--
> > > >   drivers/gpu/drm/panthor/panthor_sched.h          |  3 ++-
> > > >   drivers/gpu/drm/scheduler/sched_fence.c          |  4 +++-
> > > >   drivers/gpu/drm/scheduler/sched_internal.h       |  2 +-
> > > >   drivers/gpu/drm/scheduler/sched_main.c           |  6 ++++--
> > > >   drivers/gpu/drm/scheduler/tests/mock_scheduler.c |  2 +-
> > > >   drivers/gpu/drm/v3d/v3d_submit.c                 |  2 +-
> > > >   drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
> > > >   include/drm/gpu_scheduler.h                      | 10 +++++++++-
> > > >   26 files changed, 62 insertions(+), 34 deletions(-)
> > > 
> > > I think last time I asked about what your merge plan for this is,
> > > since
> > > it touches so many drivers. Should I take that?
> > 
> > Based on:
> > 
> > https://drm.pages.freedesktop.org/maintainer-tools/committer/committer-drm-misc.html
> > 
> > "drm-misc is for drm core (non-driver) patches, subsystem-wide
> > refactorings,
> > and small trivial patches all over (including drivers)."
> > 
> > I assume it should go through drm-misc.
> 
> I've addressed your comments and pushed an updated branch to 
> https://gitlab.freedesktop.org/pepp/linux/-/commits/improve_gpu_scheduler_trace_v10
> 
> Any chance to get this merged soon?

I took a look. Looks good!

Be so kind and provide that branch as a v10 so we can conform to the
process. Then I can take the patches, do some basic smoke tests tomorrow,
and then we're good to go. That should all be fine now; you've got the
reviews and the series has settled.

Thx
P.

> 
> Thanks,
> Pierre-Eric
> 
> 
> 
> > 
> > 
> > > 
> > > Besides one comment below, scheduler bits look fine.
> > > 
> > > > 
> > > > [snip: full patch and inline review comments, quoted in full
> > > > earlier in the thread]


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event
  2025-04-24  8:38 ` [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
@ 2025-05-19 15:34   ` Danilo Krummrich
  2025-05-20 16:56     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 1 reply; 24+ messages in thread
From: Danilo Krummrich @ 2025-05-19 15:34 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer
  Cc: Matthew Brost, Philipp Stanner, Christian König,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Christian König, dri-devel, linux-kernel

On Thu, Apr 24, 2025 at 10:38:15AM +0200, Pierre-Eric Pelloux-Prayer wrote:
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index f56e77e7f6d0..713df3516a17 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -42,6 +42,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>  			     __field(uint64_t, id)
>  			     __field(u32, job_count)
>  			     __field(int, hw_job_count)
> +			     __string(dev, dev_name(sched_job->sched->dev))

Using the sched_job->sched pointer here and in other trace events implies that
the trace event must not be called before the sched_job->sched has been set,
i.e. in drm_sched_job_arm().

Please document this for the corresponding trace events.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event
  2025-05-19 15:34   ` Danilo Krummrich
@ 2025-05-20 16:56     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 0 replies; 24+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-05-20 16:56 UTC (permalink / raw)
  To: Danilo Krummrich, Pierre-Eric Pelloux-Prayer
  Cc: Matthew Brost, Philipp Stanner, Christian König,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Christian König, dri-devel, linux-kernel



On 19/05/2025 at 17:34, Danilo Krummrich wrote:
> On Thu, Apr 24, 2025 at 10:38:15AM +0200, Pierre-Eric Pelloux-Prayer wrote:
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> index f56e77e7f6d0..713df3516a17 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> @@ -42,6 +42,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>>   			     __field(uint64_t, id)
>>   			     __field(u32, job_count)
>>   			     __field(int, hw_job_count)
>> +			     __string(dev, dev_name(sched_job->sched->dev))
> 
> Using the sched_job->sched pointer here and in other trace events implies that
> the trace event must not be called before the sched_job->sched has been set,
> i.e. in drm_sched_job_arm().
> 
> Please document this for the corresponding trace events.

This is not a new requirement, as sched and s_fence were already used by the trace events.

Still, it's a good idea to document this, so I'll update the comment added in the documentation patch.

Thanks,
Pierre-Eric

^ permalink raw reply	[flat|nested] 24+ messages in thread
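
To make the ordering constraint discussed above concrete, here is a minimal
sketch of a driver submission path, assuming the post-series
drm_sched_job_init() signature; the my_job structure and my_submit() helper
are hypothetical names, not code from the series:

#include <drm/gpu_scheduler.h>

struct my_job {
	struct drm_sched_job base;
};

static int my_submit(struct my_job *job, struct drm_sched_entity *entity,
		     struct drm_file *file)
{
	int err;

	/* Binds the job to the entity and allocates the scheduler fence;
	 * job->base.sched is not set yet at this point. */
	err = drm_sched_job_init(&job->base, entity, 1, job,
				 file->client_id);
	if (err)
		return err;

	/* drm_sched_job_arm() selects the run queue and sets
	 * job->base.sched, so only from here on may a trace event
	 * dereference sched_job->sched (e.g. for dev_name()). See the
	 * drm_sched_entity_push_job() documentation for the locking
	 * required between arm and push. */
	drm_sched_job_arm(&job->base);
	drm_sched_entity_push_job(&job->base);

	return 0;
}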

end of thread, other threads:[~2025-05-20 17:00 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-24  8:38 [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 01/10] drm/debugfs: output client_id in in drm_clients_info Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 02/10] drm/sched: store the drm client_id in drm_sched_fence Pierre-Eric Pelloux-Prayer
2025-05-14 12:44   ` Philipp Stanner
2025-05-15  6:53     ` Pierre-Eric Pelloux-Prayer
2025-05-19 11:02       ` Pierre-Eric Pelloux-Prayer
2025-05-19 11:59         ` Philipp Stanner
2025-04-24  8:38 ` [PATCH v9 03/10] drm/sched: add device name to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
2025-05-19 15:34   ` Danilo Krummrich
2025-05-20 16:56     ` Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 04/10] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 05/10] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
2025-05-14 12:46   ` Philipp Stanner
2025-04-24  8:38 ` [PATCH v9 06/10] drm/sched: add the drm_client_id to the drm_sched_run/exec_job events Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 07/10] drm/sched: cleanup event names Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 08/10] drm: get rid of drm_sched_job::id Pierre-Eric Pelloux-Prayer
2025-04-25  5:26   ` Yadav, Arvind
2025-05-14 12:50   ` Philipp Stanner
2025-04-24  8:38 ` [PATCH v9 09/10] drm/doc: document some tracepoints as uAPI Pierre-Eric Pelloux-Prayer
2025-05-14 12:53   ` Philipp Stanner
2025-05-16  7:56     ` Pierre-Eric Pelloux-Prayer
2025-04-24  8:38 ` [PATCH v9 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace Pierre-Eric Pelloux-Prayer
2025-04-25  5:31   ` Yadav, Arvind
2025-05-14 12:25 ` [PATCH v9 00/10] Improve gpu_scheduler trace events + UAPI Pierre-Eric Pelloux-Prayer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).