Intel-XE Archive on lore.kernel.org
* [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver
@ 2025-09-19 18:38 Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 1/8] Add a new xe_user structure Aakash Deep Sarkar
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

This patch series implements the Android VSR "GPU work period"
event requirement for the Intel Xe driver.

|GpuWorkPeriodEvent| defines a non-overlapping, non-zero period
of time from |start_time_ns| (inclusive) until |end_time_ns|
(exclusive) for a given |uid|, and includes details of how much
work the GPU was performing for |uid| during the period. When
GPU work for a given |uid| runs on the GPU, the driver must track
one or more periods that cover the time where the work was running,
and emit events soon after.

The full requirement is defined in the following file:
https://cs.android.com/android/platform/superproject/main/+\
main:frameworks/native/services/gpuservice/gpuwork/bpfprogs/gpuWork.c;l=35

The requirement is implemented using one delayed worker per user
id, which accumulates that uid's runtime on the GPU and emits the
event. Each user id is tracked with an xe_user structure, and the
runtime is updated every time the kworker runs for that uid. The
delay period is hardcoded to 500 msecs.
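
In outline, each xe_user owns a delayed work item that re-arms
itself every interval; a simplified sketch of what patch 6
implements (the full version takes an xe_user reference and
checks runtime PM before touching the hardware):

	static void xe_work_period_worker(struct work_struct *work)
	{
		struct xe_user *user =
			container_of(work, struct xe_user, delay_work.work);

		/* accumulate this uid's runtime and emit the tracepoint */

		schedule_delayed_work(&user->delay_work,
				      msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL));
	}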

The runtime on the GPU is collected for each xe file individually
inside xe_exec_queue_update_run_ticks and later summed into the
corresponding xe_user's active_duration_ns field by the worker.
The HW context timestamp field in the GTT is used to derive the
runtime in clock ticks, which is then converted into nanosecs
before updating the active duration.

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>


Aakash Deep Sarkar (8):
  Add a new xe_user structure
  Add xe_gt_clock_interval_to_ns function
  drm/xe: Add a trace point for GPU work period
  drm/xe: Modify xe_exec_queue_update_run_ticks
  Handle xe_user creation and removal
  drm/xe: Implement xe_work_period_worker
  drm/xe: Add a Kconfig option for GPU work period
  Handle xe_work_period destruction

 drivers/gpu/drm/xe/Makefile          |   2 +
 drivers/gpu/drm/xe/xe_device.c       |  32 ++++
 drivers/gpu/drm/xe/xe_device_types.h |  19 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   |   8 +
 drivers/gpu/drm/xe/xe_gt_clock.c     |  14 ++
 drivers/gpu/drm/xe/xe_gt_clock.h     |   1 +
 drivers/gpu/drm/xe/xe_pm.c           |   5 +
 drivers/gpu/drm/xe/xe_user.c         | 246 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_user.h         | 129 ++++++++++++++
 drivers/gpu/trace/Kconfig            |  12 ++
 include/trace/gpu_work_period.h      |  59 +++++++
 11 files changed, 527 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_user.c
 create mode 100644 drivers/gpu/drm/xe/xe_user.h
 create mode 100644 include/trace/gpu_work_period.h

-- 
2.49.0



* [PATCH v3 1/8] Add a new xe_user structure
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 2/8] Add xe_gt_clock_interval_to_ns function Aakash Deep Sarkar
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

For the Android GPU work period event we need to track the GPU
runtime for each user id. Multiple xe files may be opened by
different processes/threads that belong to the same user id, and
all of these xe files need to be grouped together so they can be
easily identified when calculating the runtime for a given
user id.

Currently, the xe driver doesn't record the user id of the
calling process. In addition, all the xe files created via the
open call are lumped together inside the xe device structure,
with no way to distinguish between them based on the user id
of the calling process.

To remedy these limitations we add another layer of indirection
between the xe device and the xe file: the xe device now has a
list of xe users, each with a given user id, and each xe user has
a list of xe files, each created by a process associated with
that user id.

The xe_user structure for a uid should live from the moment a
process with a new user id first opens the xe device until the
last xe file belonging to this user id is closed.

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/Makefile  |  2 +
 drivers/gpu/drm/xe/xe_user.c | 59 ++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_user.h | 81 ++++++++++++++++++++++++++++++++++++
 3 files changed, 142 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_user.c
 create mode 100644 drivers/gpu/drm/xe/xe_user.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index d9c6cf0f189e..ff6b584f3293 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -333,6 +333,8 @@ ifeq ($(CONFIG_DEBUG_FS),y)
 
 	xe-$(CONFIG_PCI_IOV) += xe_gt_sriov_pf_debugfs.o
 
+	xe-y += xe_user.o
+
 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
 		i915-display/intel_display_debugfs.o \
 		i915-display/intel_display_debugfs_params.o \
diff --git a/drivers/gpu/drm/xe/xe_user.c b/drivers/gpu/drm/xe/xe_user.c
new file mode 100644
index 000000000000..8c285a68115a
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_user.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <linux/slab.h>
+
+#include "xe_user.h"
+
+/**
+ * worker thread to emit gpu work period event for this xe user
+ * @work: work instance for this xe user
+ *
+ * Return: void
+ */
+static inline void work_period_worker(struct work_struct *work)
+{
+	//TODO: Implement this worker
+}
+
+/**
+ * xe_user_alloc() - Allocate xe user
+ * @void: No arg
+ *
+ * Allocate xe user struct to track activity on the gpu
+ * by the application. Call this API whenever a new app
+ * has opened xe device.
+ *
+ * Return: pointer to user struct or NULL if can't allocate
+ */
+struct xe_user *xe_user_alloc(void)
+{
+	struct xe_user *user;
+
+	user = kzalloc(sizeof(*user), GFP_KERNEL);
+	if (!user)
+		return NULL;
+
+	kref_init(&user->refcount);
+	mutex_init(&user->filelist_lock);
+	INIT_LIST_HEAD(&user->filelist);
+	//TODO: Add a hook into xe device
+	INIT_WORK(&user->work, work_period_worker);
+	return user;
+}
+
+/**
+ * __xe_user_free() - Free user struct
+ * @kref: The reference
+ *
+ * Return: void
+ */
+void __xe_user_free(struct kref *kref)
+{
+	struct xe_user *user =
+		container_of(kref, struct xe_user, refcount);
+
+	kfree(user);
+}
diff --git a/drivers/gpu/drm/xe/xe_user.h b/drivers/gpu/drm/xe/xe_user.h
new file mode 100644
index 000000000000..e52f66d3f3b0
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_user.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_USER_H_
+#define _XE_USER_H_
+
+#include <linux/kref.h>
+#include <linux/list.h>
+#include <linux/workqueue.h>
+
+/**
+ * This is a per process/user id structure for a xe device
+ * client. It is allocated when a new process/app opens the
+ * xe device and destroyed when the last xe file belonging
+ * to this user id is destroyed.
+ */
+struct xe_user {
+	/**
+	 * @refcount: reference count
+	 */
+	struct kref refcount;
+
+	/**
+	 * @xe: pointer to the xe_device
+	 */
+	struct xe_device *xe;
+
+	/**
+	 * @filelist_lock: lock protecting the filelist
+	 */
+	struct mutex filelist_lock;
+
+	/**
+	 * @filelist: list of xe files belonging to this xe user
+	 */
+	struct list_head filelist;
+
+	/**
+	 * @work: work to emit the gpu work period event for this
+	 * xe user
+	 */
+	struct work_struct work;
+
+	/**
+	 * @uid: user id for this xe_user
+	 */
+	u32 uid;
+
+	/**
+	 * @active_duration_ns: sum total of xe_file.active_duration_ns
+	 * for all xe files belonging to this xe user
+	 */
+	u64 active_duration_ns;
+
+	/**
+	 * @last_timestamp_ns: timestamp in ns when we last emitted event
+	 * for this xe user
+	 */
+	u64 last_timestamp_ns;
+};
+
+struct xe_user *xe_user_alloc(void);
+
+static inline struct xe_user *
+xe_user_get(struct xe_user *user)
+{
+	kref_get(&user->refcount);
+	return user;
+}
+
+void __xe_user_free(struct kref *kref);
+
+static inline void xe_user_put(struct xe_user *user)
+{
+	kref_put(&user->refcount, __xe_user_free);
+}
+
+#endif // _XE_USER_H_
+
-- 
2.49.0



* [PATCH v3 2/8] Add xe_gt_clock_interval_to_ns function
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 1/8] Add a new xe_user structure Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 3/8] drm/xe: Add a trace point for GPU work period Aakash Deep Sarkar
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

The runtime of a user id in the GPU work period event is required
to be reported in nanoseconds. Since we want to use the HW context
timestamp register to derive the runtime for a context, we need
a way to convert GT clock ticks to nanoseconds.
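
For example, assuming a 19.2 MHz reference clock (an illustrative
value; the actual rate comes from gt->info.reference_clock):

	u64 ns = xe_gt_clock_interval_to_ns(gt, 38400);
	/* 38400 * NSEC_PER_SEC / 19200000 = 2000000 ns, i.e. 2 ms */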

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_clock.c | 14 ++++++++++++++
 drivers/gpu/drm/xe/xe_gt_clock.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
index 4f011d1573c6..17c1cc6bff5a 100644
--- a/drivers/gpu/drm/xe/xe_gt_clock.c
+++ b/drivers/gpu/drm/xe/xe_gt_clock.c
@@ -110,3 +110,17 @@ u64 xe_gt_clock_interval_to_ms(struct xe_gt *gt, u64 count)
 {
 	return div_u64_roundup(count * MSEC_PER_SEC, gt->info.reference_clock);
 }
+
+/**
+ * xe_gt_clock_interval_to_ns - Convert sampled GT clock ticks to nanosec
+ *
+ * @gt: the &xe_gt
+ * @count: count of GT clock ticks
+ *
+ * Returns: time in nanosec
+ */
+u64 xe_gt_clock_interval_to_ns(struct xe_gt *gt, u64 count)
+{
+	return div_u64_roundup(count * NSEC_PER_SEC, gt->info.reference_clock);
+}
+
diff --git a/drivers/gpu/drm/xe/xe_gt_clock.h b/drivers/gpu/drm/xe/xe_gt_clock.h
index 3adeb7baaca4..bd87971bce97 100644
--- a/drivers/gpu/drm/xe/xe_gt_clock.h
+++ b/drivers/gpu/drm/xe/xe_gt_clock.h
@@ -12,5 +12,6 @@ struct xe_gt;
 
 int xe_gt_clock_init(struct xe_gt *gt);
 u64 xe_gt_clock_interval_to_ms(struct xe_gt *gt, u64 count);
+u64 xe_gt_clock_interval_to_ns(struct xe_gt *gt, u64 count);
 
 #endif
-- 
2.49.0



* [PATCH v3 3/8] drm/xe: Add a trace point for GPU work period
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 1/8] Add a new xe_user structure Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 2/8] Add xe_gt_clock_interval_to_ns function Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 4/8] drm/xe: Modify xe_exec_queue_update_run_ticks Aakash Deep Sarkar
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

The GPU work period event is required to have the following
format, which defines the structure of the kernel tracepoint
exposed at /sys/kernel/tracing/events/power/gpu_work_period:

A value that uniquely identifies the GPU within the system.
  uint32_t gpu_id;

The UID of the application (i.e. persistent, unique ID of the Android
app) that submitted work to the GPU.
  uint32_t uid;

The start time of the period in nanoseconds. The clock must be
CLOCK_MONOTONIC_RAW, as returned by the ktime_get_raw_ns(void) function.
  uint64_t start_time_ns;

The end time of the period in nanoseconds. The clock must be
CLOCK_MONOTONIC_RAW, as returned by the ktime_get_raw_ns(void) function.
  uint64_t end_time_ns;

The amount of time the GPU was running GPU work for |uid| during the
period, in nanoseconds, without double-counting parallel GPU work for the
same |uid|. For example, this might include the amount of time the GPU
spent performing shader work (vertex work, fragment work, etc.) for
|uid|.
  uint64_t total_active_duration_ns;
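
On the driver side the event is emitted with a single tracepoint
call (this is how the worker added later in this series uses it),
and userspace enables it through the usual tracefs 'enable' file
under the directory above:

	trace_gpu_work_period(gpu_id, uid, start_time_ns, end_time_ns,
			      total_active_duration_ns);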

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 include/trace/gpu_work_period.h | 59 +++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100644 include/trace/gpu_work_period.h

diff --git a/include/trace/gpu_work_period.h b/include/trace/gpu_work_period.h
new file mode 100644
index 000000000000..e06467625705
--- /dev/null
+++ b/include/trace/gpu_work_period.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM power
+
+#if !defined(_TRACE_GPU_WORK_PERIOD_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_GPU_WORK_PERIOD_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(gpu_work_period,
+
+	TP_PROTO(
+		u32 gpu_id,
+		u32 uid,
+		u64 start_time_ns,
+		u64 end_time_ns,
+		u64 total_active_duration_ns
+	),
+
+	TP_ARGS(gpu_id, uid, start_time_ns, end_time_ns, total_active_duration_ns),
+
+	TP_STRUCT__entry(
+		__field(u32, gpu_id)
+		__field(u32, uid)
+		__field(u64, start_time_ns)
+		__field(u64, end_time_ns)
+		__field(u64, total_active_duration_ns)
+	),
+
+	TP_fast_assign(
+		__entry->gpu_id = gpu_id;
+		__entry->uid = uid;
+		__entry->start_time_ns = start_time_ns;
+		__entry->end_time_ns = end_time_ns;
+		__entry->total_active_duration_ns = total_active_duration_ns;
+	),
+
+	TP_printk("gpu_id=%u uid=%u start_time_ns=%llu end_time_ns=%llu total_active_duration_ns=%llu",
+		__entry->gpu_id,
+		__entry->uid,
+		__entry->start_time_ns,
+		__entry->end_time_ns,
+		__entry->total_active_duration_ns)
+);
+
+#endif /* _TRACE_GPU_WORK_PERIOD_H */
+
+/* This part must be outside protection */
+
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE gpu_work_period
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+
+#include <trace/define_trace.h>
-- 
2.49.0



* [PATCH v3 4/8] drm/xe: Modify xe_exec_queue_update_run_ticks
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
                   ` (2 preceding siblings ...)
  2025-09-19 18:38 ` [PATCH v3 3/8] drm/xe: Add a trace point for GPU work period Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 5/8] Handle xe_user creation and removal Aakash Deep Sarkar
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

For the GPU work period event we need to record the runtime of a
context on the GPU in nanosecs. In the present xe driver code, we
only record the runtime in clock ticks, and separately for each
engine class.

So we add a u64 field |active_duration_ns| to the xe file
structure, where we record the cumulative runtime in ns across
all engines for this context. The intent is to add up
|active_duration_ns| across all the xe files belonging to a given
user id to derive the runtime for that user id.
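
A sketch of that per-uid roll-up, using the xe_user filelist
introduced earlier in the series (the actual accumulation is done
by the worker in a later patch):

	list_for_each_entry(xef, &user->filelist, user_link)
		user->active_duration_ns += xef->active_duration_ns;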

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h | 3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index a6c361db11d9..e6ecfb3f7f38 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -678,6 +678,9 @@ struct xe_file {
 	/** @run_ticks: hw engine class run time in ticks for this drm client */
 	u64 run_ticks[XE_ENGINE_CLASS_MAX];
 
+	/** @active_duration_ns: total run time in ns for this xe file */
+	u64 active_duration_ns;
+
 	/** @client: drm client */
 	struct xe_drm_client *client;
 
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 37b2b93b73d6..6eb34c62c779 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -15,6 +15,7 @@
 #include "xe_dep_scheduler.h"
 #include "xe_device.h"
 #include "xe_gt.h"
+#include "xe_gt_clock.h"
 #include "xe_hw_engine_class_sysfs.h"
 #include "xe_hw_engine_group.h"
 #include "xe_hw_fence.h"
@@ -887,6 +888,8 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
 {
 	struct xe_device *xe = gt_to_xe(q->gt);
 	struct xe_lrc *lrc;
+	struct xe_gt *gt = q->gt;
+
 	u64 old_ts, new_ts;
 	int idx;
 
@@ -912,6 +915,10 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
 	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
 	q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
 
+	// Accumulate the runtime in nanosec for this queue into the xe file.
+	q->xef->active_duration_ns +=
+		xe_gt_clock_interval_to_ns(gt, (new_ts - old_ts));
+
 	drm_dev_exit(idx);
 }
 
-- 
2.49.0



* [PATCH v3 5/8] Handle xe_user creation and removal
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
                   ` (3 preceding siblings ...)
  2025-09-19 18:38 ` [PATCH v3 4/8] drm/xe: Modify xe_exec_queue_update_run_ticks Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 6/8] drm/xe: Implement xe_work_period_worker Aakash Deep Sarkar
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

We want our xe user structure to be created when a new
user id opens the xe device node and to be destroyed
when the final xe file with this uid is closed. In other
words, the xe_user structure for a uid should remain in
scope as long as any process with this uid has an open
xe file descriptor.

To implement this we maintain an xarray of xe user
structures inside our xe device instance. Whenever a new
xe file is created via an open call, we check if the
calling process' uid is already present in our xarray.
If so, we increment the refcount for the associated
xe user and add this xe file to the list of xe files
belonging to this xe user. Otherwise, we allocate a
new xe user structure for this uid and initialize its
file list with this xe file.

Whenever an xe file is destroyed, we decrement the refcount of
the associated xe user. When the last xe file in the xe user's
file list is destroyed, the xe user refcount drops to zero and
the xe user is cleaned up. In the cleanup path we remove the
xarray entry for this xe user from the xe device and free its
memory.

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       | 23 +++++++++
 drivers/gpu/drm/xe/xe_device_types.h | 16 ++++++
 drivers/gpu/drm/xe/xe_user.c         | 76 +++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_user.h         | 12 ++++-
 4 files changed, 124 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index fdb7b7498920..258b87403596 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -64,6 +64,7 @@
 #include "xe_tile.h"
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_ttm_sys_mgr.h"
+#include "xe_user.h"
 #include "xe_vm.h"
 #include "xe_vm_madvise.h"
 #include "xe_vram.h"
@@ -79,9 +80,13 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct xe_device *xe = to_xe_device(dev);
 	struct xe_drm_client *client;
+	struct xe_user *user;
 	struct xe_file *xef;
 	int ret = -ENOMEM;
+	int uid = -EINVAL;
+	u32 idx;
 	struct task_struct *task = NULL;
+	const struct cred *cred = NULL;
 
 	xef = kzalloc(sizeof(*xef), GFP_KERNEL);
 	if (!xef)
@@ -106,8 +111,16 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 	file->driver_priv = xef;
 	kref_init(&xef->refcount);
 
+	INIT_LIST_HEAD(&xef->user_link);
+
 	task = get_pid_task(rcu_access_pointer(file->pid), PIDTYPE_PID);
 	if (task) {
+		cred = get_task_cred(task);
+		if (cred) {
+			uid = (unsigned int) cred->euid.val;
+			xe_user_init(xe, xef, uid);
+			put_cred(cred);
+		}
 		xef->process_name = kstrdup(task->comm, GFP_KERNEL);
 		xef->pid = task->pid;
 		put_task_struct(task);
@@ -127,6 +140,12 @@ static void xe_file_destroy(struct kref *ref)
 
 	xe_drm_client_put(xef->client);
 	kfree(xef->process_name);
+
+	mutex_lock(&xef->user->filelist_lock);
+	list_del(&xef->user_link);
+	mutex_unlock(&xef->user->filelist_lock);
+
+	xe_user_put(xef->user);
 	kfree(xef);
 }
 
@@ -466,6 +485,10 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 
 	xa_init_flags(&xe->usm.asid_to_vm, XA_FLAGS_ALLOC);
 
+	xa_init_flags(&xe->work_period.users, XA_FLAGS_ALLOC1);
+
+	mutex_init(&xe->work_period.lock);
+
 	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
 		/* Trigger a large asid and an early asid wrap. */
 		u32 asid;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index e6ecfb3f7f38..e42b15aa4449 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -608,6 +608,16 @@ struct xe_device {
 	atomic_t g2g_test_count;
 #endif
 
+	/**
+	 * @xe_work_period: Support for GPU work period tracepoint
+	 */
+	struct xe_work_period {
+		/** @users: list of users that have opened this xe device */
+		struct xarray users;
+		/** @lock: lock protecting this structure */
+		struct mutex lock;
+	} work_period;
+
 	/* private: */
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
@@ -681,6 +691,12 @@ struct xe_file {
 	/** @active_duration_ns: total run time in ns for this xe file */
 	u64 active_duration_ns;
 
+	/** @user: pointer to struct xe_user associated with this xe file */
+	struct xe_user *user;
+
+	/** @user_link: link into xe_user::filelist */
+	struct list_head user_link;
+
 	/** @client: drm client */
 	struct xe_drm_client *client;
 
diff --git a/drivers/gpu/drm/xe/xe_user.c b/drivers/gpu/drm/xe/xe_user.c
index 8c285a68115a..fcdddefc7b4f 100644
--- a/drivers/gpu/drm/xe/xe_user.c
+++ b/drivers/gpu/drm/xe/xe_user.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/slab.h>
+#include <drm/drm_drv.h>
 
 #include "xe_user.h"
 
@@ -28,7 +29,7 @@ static inline void work_period_worker(struct work_struct *work)
  *
  * Return: pointer to user struct or NULL if can't allocate
  */
-struct xe_user *xe_user_alloc(void)
+static struct xe_user *xe_user_alloc(void)
 {
 	struct xe_user *user;
 
@@ -39,7 +40,6 @@ struct xe_user *xe_user_alloc(void)
 	kref_init(&user->refcount);
 	mutex_init(&user->filelist_lock);
 	INIT_LIST_HEAD(&user->filelist);
-	//TODO: Add a hook into xe device
 	INIT_WORK(&user->work, work_period_worker);
 	return user;
 }
@@ -54,6 +54,78 @@ void __xe_user_free(struct kref *kref)
 {
 	struct xe_user *user =
 		container_of(kref, struct xe_user, refcount);
+	struct xe_device *xe = user->xe;
+	void *lookup;
 
+	mutex_lock(&xe->work_period.lock);
+	lookup = xa_erase(&xe->work_period.users, user->id);
+	xe_assert(xe, lookup == user);
+	mutex_unlock(&xe->work_period.lock);
+
+	drm_dev_put(&user->xe->drm);
 	kfree(user);
 }
+
+static struct xe_user *xe_user_lookup(struct xe_device *xe, u32 uid)
+{
+	struct xe_user *user = NULL;
+	unsigned long i;
+
+	mutex_lock(&xe->work_period.lock);
+	xa_for_each(&xe->work_period.users, i, user) {
+		if (user->uid == uid) {
+			xe_user_get(user);
+			mutex_unlock(&xe->work_period.lock);
+			return user;
+		}
+	}
+	mutex_unlock(&xe->work_period.lock);
+
+	return NULL;
+}
+
+int xe_user_init(struct xe_device *xe, struct xe_file *xef, unsigned int uid)
+{
+	struct xe_user *user = NULL;
+	int ret;
+	u32 idx;
+	/*
+	 * Check if the calling process/uid has already been registered
+	 * with the xe device during a previous open call. If so then
+	 * take a reference to this xe user and add this xe file to the
+	 * filelist belonging to this xe user
+	 */
+	user = xe_user_lookup(xe, uid);
+	if (!user) {
+		/*
+		 * We couldn't find an existing xe user for the calling process.
+		 * Allocate a new struct xe_user and register it with this xe
+		 * device
+		 */
+		user = xe_user_alloc();
+		if (!user)
+			return -ENOMEM;
+
+
+		user->uid = uid;
+		user->last_timestamp_ns = ktime_get_raw_ns();
+		user->xe = xe;
+
+		mutex_lock(&xe->work_period.lock);
+		ret = xa_alloc(&xe->work_period.users, &idx, user, xa_limit_32b, GFP_KERNEL);
+		mutex_unlock(&xe->work_period.lock);
+
+		if (ret < 0)
+			return ret;
+
+		user->id = idx;
+		drm_dev_get(&xe->drm);
+	}
+
+	mutex_lock(&user->filelist_lock);
+	list_add(&xef->user_link, &user->filelist);
+	mutex_unlock(&user->filelist_lock);
+	xef->user = user;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_user.h b/drivers/gpu/drm/xe/xe_user.h
index e52f66d3f3b0..ec4c0f2b862c 100644
--- a/drivers/gpu/drm/xe/xe_user.h
+++ b/drivers/gpu/drm/xe/xe_user.h
@@ -8,8 +8,12 @@
 
 #include <linux/kref.h>
 #include <linux/list.h>
+#include <linux/mutex.h>
 #include <linux/workqueue.h>
 
+#include "xe_device.h"
+
+
 /**
  * This is a per process/user id structure for a xe device
  * client. It is allocated when a new process/app opens the
@@ -43,6 +47,11 @@ struct xe_user {
 	 */
 	struct work_struct work;
 
+	/**
+	 * @id: index of this user into the xe device users array
+	 */
+	u32 id;
+
 	/**
 	 * @uid: user id for this xe_user
 	 */
@@ -61,7 +70,8 @@ struct xe_user {
 	u64 last_timestamp_ns;
 };
 
-struct xe_user *xe_user_alloc(void);
+int xe_user_init(struct xe_device *xe, struct xe_file *xef, unsigned int uid);
+
 
 static inline struct xe_user *
 xe_user_get(struct xe_user *user)
-- 
2.49.0



* [PATCH v3 6/8] drm/xe: Implement xe_work_period_worker
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
                   ` (4 preceding siblings ...)
  2025-09-19 18:38 ` [PATCH v3 5/8] Handle xe_user creation and removal Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 7/8] drm/xe: Add a Kconfig option for GPU work period Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 8/8] Handle xe_work_period destruction Aakash Deep Sarkar
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

The work of collecting the GPU runtime for a given xe_user and
emitting its event is done by the xe_work_period_worker kworker.
When a new xe_user is created, we also start a delayed kworker
for it, with the execution delay set to 500 ms. After completing
its work, the kworker schedules itself for the next execution,
and keeps doing so for as long as the reference to the xe_user
remains valid.

During each execution cycle xe_work_period_worker iterates over
all the xe files in the xe_user::filelist and accumulates their
GPU runtime into xe_user::active_duration_ns, updating each
xe_file::active_duration_ns along the way. The total runtime for
this uid in the current sampling period is the delta between the
previous and the current xe_user::active_duration_ns.

We also record the current timestamp in xe_user::last_timestamp_ns
at the end of each xe_work_period_worker invocation. The sampling
period for this uid is the delta between the previous and the
current timestamp.
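
Per sampling period, the event fields are therefore derived
roughly as follows (a sketch of what the worker below computes):

	start_time_ns      = user->last_timestamp_ns + 1;
	end_time_ns        = ktime_get_raw_ns();
	active_duration_ns = user->active_duration_ns - last_active_duration;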

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c |  13 ++--
 drivers/gpu/drm/xe/xe_pm.c     |   5 ++
 drivers/gpu/drm/xe/xe_user.c   | 127 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/xe/xe_user.h   |  21 ++++--
 4 files changed, 149 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 258b87403596..8e368346b6d4 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -80,11 +80,9 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct xe_device *xe = to_xe_device(dev);
 	struct xe_drm_client *client;
-	struct xe_user *user;
 	struct xe_file *xef;
 	int ret = -ENOMEM;
 	int uid = -EINVAL;
-	u32 idx;
 	struct task_struct *task = NULL;
 	const struct cred *cred = NULL;
 
@@ -141,11 +139,12 @@ static void xe_file_destroy(struct kref *ref)
 	xe_drm_client_put(xef->client);
 	kfree(xef->process_name);
 
-	mutex_lock(&xef->user->filelist_lock);
-	list_del(&xef->user_link);
-	mutex_unlock(&xef->user->filelist_lock);
-
-	xe_user_put(xef->user);
+	if (xef->user) {
+		mutex_lock(&xef->user->lock);
+		list_del(&xef->user_link);
+		xe_user_put(xef->user);
+		mutex_unlock(&xef->user->lock);
+	}
 	kfree(xef);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 2b61a3b8257c..35d5433a9e0e 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -26,6 +26,7 @@
 #include "xe_pxp.h"
 #include "xe_sriov_vf_ccs.h"
 #include "xe_trace.h"
+#include "xe_user.h"
 #include "xe_vm.h"
 #include "xe_wa.h"
 
@@ -532,6 +533,8 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 
 	xe_i2c_pm_suspend(xe);
 
+	xe_user_cancel_workers(xe);
+
 	xe_rpm_lockmap_release(xe);
 	xe_pm_write_callback_task(xe, NULL);
 	return 0;
@@ -584,6 +587,8 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 
 	xe_i2c_pm_resume(xe, xe->d3cold.allowed);
 
+	xe_user_resume_workers(xe);
+
 	xe_irq_resume(xe);
 
 	for_each_gt(gt, xe, id)
diff --git a/drivers/gpu/drm/xe/xe_user.c b/drivers/gpu/drm/xe/xe_user.c
index fcdddefc7b4f..8e23a6c74717 100644
--- a/drivers/gpu/drm/xe/xe_user.c
+++ b/drivers/gpu/drm/xe/xe_user.c
@@ -6,17 +6,95 @@
 #include <linux/slab.h>
 #include <drm/drm_drv.h>
 
+#include "xe_assert.h"
+#include "xe_device_types.h"
+#include "xe_exec_queue.h"
+#include "xe_pm.h"
 #include "xe_user.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/gpu_work_period.h>
+
+static inline void schedule_next_work(struct xe_device *xe, unsigned int id)
+{
+	struct xe_user *user;
+
+	mutex_lock(&xe->work_period.lock);
+	user = xa_load(&xe->work_period.users, id);
+	if (user && xe_user_get_unless_zero(user))
+		schedule_delayed_work(&user->delay_work,
+				msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL));
+	mutex_unlock(&xe->work_period.lock);
+}
 /**
  * worker thread to emit gpu work period event for this xe user
  * @work: work instance for this xe user
  *
  * Return: void
  */
-static inline void work_period_worker(struct work_struct *work)
+static void xe_work_period_worker(struct work_struct *work)
 {
-	//TODO: Implement this worker
+	struct xe_user *user = container_of(work, struct xe_user, delay_work.work);
+	struct xe_device *xe = user->xe;
+	struct xe_file *xef;
+	struct xe_exec_queue *q;
+
+	/*
+	 * The GPU work period event requires the following parameters
+	 *
+	 * gpuid:           GPU index in case the platform has more than one GPU
+	 * uid:             user id of the app
+	 * start_time:      start time for the sampling period in nanosecs
+	 * end_time:        end time for the sampling period in nanosecs
+	 * active_duration: Total runtime in nanosecs for this uid in
+	 *                  the current sampling period.
+	 */
+	u32 gpuid = 0, uid = user->uid, id = user->id;
+	u64 start_time, end_time, active_duration;
+	u64 last_active_duration, last_timestamp;
+	unsigned long i;
+
+	mutex_lock(&user->lock);
+
+	// Save the last recorded active duration and timestamp
+	last_active_duration = user->active_duration_ns;
+	last_timestamp = user->last_timestamp_ns;
+
+	if (xe_pm_runtime_get_if_active(xe)) {
+
+		list_for_each_entry(xef, &user->filelist, user_link) {
+
+			wait_var_event(&xef->exec_queue.pending_removal,
+			!atomic_read(&xef->exec_queue.pending_removal));
+
+			/* Accumulate all the exec queues from this file */
+			mutex_lock(&xef->exec_queue.lock);
+			xa_for_each(&xef->exec_queue.xa, i, q) {
+				xe_exec_queue_get(q);
+				mutex_unlock(&xef->exec_queue.lock);
+
+				xe_exec_queue_update_run_ticks(q);
+
+				mutex_lock(&xef->exec_queue.lock);
+				xe_exec_queue_put(q);
+			}
+			mutex_unlock(&xef->exec_queue.lock);
+			user->active_duration_ns += xef->active_duration_ns;
+		}
+
+		xe_pm_runtime_put(xe);
+
+		start_time = last_timestamp + 1;
+		end_time = ktime_get_raw_ns();
+		active_duration = user->active_duration_ns - last_active_duration;
+		trace_gpu_work_period(gpuid, uid, start_time, end_time, active_duration);
+		user->last_timestamp_ns = end_time;
+		xe_user_put(user);
+	}
+
+	mutex_unlock(&user->lock);
+
+	schedule_next_work(xe, id);
 }
 
 /**
@@ -38,9 +116,9 @@ static struct xe_user *xe_user_alloc(void)
 		return NULL;
 
 	kref_init(&user->refcount);
-	mutex_init(&user->filelist_lock);
+	mutex_init(&user->lock);
 	INIT_LIST_HEAD(&user->filelist);
-	INIT_WORK(&user->work, work_period_worker);
+	INIT_DELAYED_WORK(&user->delay_work, xe_work_period_worker);
 	return user;
 }
 
@@ -120,12 +198,49 @@ int xe_user_init(struct xe_device* xe, struct xe_file* xef, unsigned int uid)
 
 		user->id = idx;
 		drm_dev_get(&xe->drm);
+
+		xe_user_get(user);
+		if (!schedule_delayed_work(&user->delay_work,
+					msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL)))
+			xe_user_put(user);
 	}
 
-	mutex_lock(&user->filelist_lock);
+	mutex_lock(&user->lock);
 	list_add(&xef->user_link, &user->filelist);
-	mutex_unlock(&user->filelist_lock);
+	mutex_unlock(&user->lock);
 	xef->user = user;
 
 	return 0;
 }
+
+void xe_user_cancel_workers(struct xe_device *xe)
+{
+	struct xe_user *user = NULL;
+	unsigned long i = 0;
+
+	mutex_lock(&xe->work_period.lock);
+	xa_for_each(&xe->work_period.users, i, user) {
+		if (user && xe_user_get_unless_zero(user)) {
+			cancel_delayed_work_sync(&user->delay_work);
+			xe_user_put(user);
+		}
+	}
+	mutex_unlock(&xe->work_period.lock);
+}
+
+void xe_user_resume_workers(struct xe_device *xe)
+{
+	struct xe_user *user = NULL;
+	unsigned long i = 0;
+
+	mutex_lock(&xe->work_period.lock);
+	xa_for_each(&xe->work_period.users, i, user) {
+		if (user && xe_user_get_unless_zero(user)) {
+			if (!schedule_delayed_work(&user->delay_work,
+					msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL)))
+				xe_user_put(user);
+		}
+	}
+	mutex_unlock(&xe->work_period.lock);
+}
+
diff --git a/drivers/gpu/drm/xe/xe_user.h b/drivers/gpu/drm/xe/xe_user.h
index ec4c0f2b862c..fc976beed2ad 100644
--- a/drivers/gpu/drm/xe/xe_user.h
+++ b/drivers/gpu/drm/xe/xe_user.h
@@ -11,9 +11,11 @@
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
 
-#include "xe_device.h"
+#include "xe_device_types.h"
 
 
+#define XE_WORK_PERIOD_INTERVAL 500
+
 /**
  * This is a per process/user id structure for a xe device
  * client. It is allocated when a new process/app opens the
@@ -32,9 +34,9 @@ struct xe_user {
 	struct xe_device *xe;
 
 	/**
-	 * @filelist_lock: lock protecting the filelist
+	 * @lock: lock protecting this structure
 	 */
-	struct mutex filelist_lock;
+	struct mutex lock;
 
 	/**
 	 * @filelist: list of xe files belonging to this xe user
@@ -45,7 +47,7 @@ struct xe_user {
 	 * @work: work to emit the gpu work period event for this
 	 * xe user
 	 */
-	struct work_struct work;
+	struct delayed_work delay_work;
 
 	/**
 	 * @id: index of this user into the xe device users array
@@ -72,6 +74,17 @@ struct xe_user {
 
 int xe_user_init(struct xe_device* xe, struct xe_file* xef, unsigned int uid);
 
+void xe_user_cancel_workers(struct xe_device *xe);
+
+void xe_user_resume_workers(struct xe_device *xe);
+
+static inline struct xe_user *
+xe_user_get_unless_zero(struct xe_user *user)
+{
+	if (kref_get_unless_zero(&user->refcount))
+		return user;
+	return NULL;
+}
 
 static inline struct xe_user *
 xe_user_get(struct xe_user *user)
-- 
2.49.0



* [PATCH v3 7/8] drm/xe: Add a Kconfig option for GPU work period
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
                   ` (5 preceding siblings ...)
  2025-09-19 18:38 ` [PATCH v3 6/8] drm/xe: Implement xe_work_period_worker Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  2025-09-19 18:38 ` [PATCH v3 8/8] Handle xe_work_period destruction Aakash Deep Sarkar
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

Since this requirement is intended only for Android, there is no
reason to enable it by default in other distributions, so guard
it behind a Kconfig option.
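
Android builds that want the event then simply enable
CONFIG_TRACE_GPU_WORK_PERIOD=y in their kernel config, while
other distributions keep it off (the default).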

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/Makefile        |  2 +-
 drivers/gpu/drm/xe/xe_device.c     |  1 -
 drivers/gpu/drm/xe/xe_exec_queue.c |  5 +++--
 drivers/gpu/drm/xe/xe_user.h       | 27 ++++++++++++++++++++++++++-
 drivers/gpu/trace/Kconfig          | 12 ++++++++++++
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index ff6b584f3293..6fc23367bdfe 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -333,7 +333,7 @@ ifeq ($(CONFIG_DEBUG_FS),y)
 
 	xe-$(CONFIG_PCI_IOV) += xe_gt_sriov_pf_debugfs.o
 
-	xe-y += xe_user.o
+	xe-$(CONFIG_TRACE_GPU_WORK_PERIOD) += xe_user.o
 
 	xe-$(CONFIG_DRM_XE_DISPLAY) += \
 		i915-display/intel_display_debugfs.o \
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 8e368346b6d4..30d9f2747eab 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -485,7 +485,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 	xa_init_flags(&xe->usm.asid_to_vm, XA_FLAGS_ALLOC);
 
 	xa_init_flags(&xe->work_period.users, XA_FLAGS_ALLOC1);
-
 	mutex_init(&xe->work_period.lock);
 
 	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 6eb34c62c779..d5013d546348 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -915,9 +915,10 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
 	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
 	q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
 
-	// Accumulate the runtime in nanosec for this queue into the xe file.
+
+	// Accumulate the runtime in ns for this queue
 	q->xef->active_duration_ns +=
-		xe_gt_clock_interval_to_ns(gt, (new_ts - old_ts));
+			xe_gt_clock_interval_to_ns(gt, (new_ts - old_ts));
 
 	drm_dev_exit(idx);
 }
diff --git a/drivers/gpu/drm/xe/xe_user.h b/drivers/gpu/drm/xe/xe_user.h
index fc976beed2ad..c88d4be2c730 100644
--- a/drivers/gpu/drm/xe/xe_user.h
+++ b/drivers/gpu/drm/xe/xe_user.h
@@ -72,12 +72,38 @@ struct xe_user {
 	u64 last_timestamp_ns;
 };
 
+#if IS_ENABLED(CONFIG_TRACE_GPU_WORK_PERIOD)
+
 int xe_user_init(struct xe_device* xe, struct xe_file* xef, unsigned int uid);
 
 void xe_user_cancel_workers(struct xe_device* xe);
 
 void xe_user_resume_workers(struct xe_device* xe);
 
+void __xe_user_free(struct kref *kref);
+
+#else
+
+static inline
+int xe_user_init(struct xe_device *xe, struct xe_file *xef, unsigned int uid)
+{
+	return 0;
+}
+
+static inline void __xe_user_free(struct kref *kref)
+{
+}
+
+static inline void xe_user_cancel_workers(struct xe_device *xe)
+{
+}
+
+static inline void xe_user_resume_workers(struct xe_device *xe)
+{
+}
+
+#endif // CONFIG_TRACE_GPU_WORK_PERIOD
+
 static inline struct xe_user *
 xe_user_get_unless_zero(struct xe_user *user)
 {
@@ -93,7 +119,6 @@ xe_user_get(struct xe_user *user)
 	return user;
 }
 
-void __xe_user_free(struct kref *kref);
 
 static inline void xe_user_put(struct xe_user *user)
 {
diff --git a/drivers/gpu/trace/Kconfig b/drivers/gpu/trace/Kconfig
index cd3d19c4a201..34f2e08cf1be 100644
--- a/drivers/gpu/trace/Kconfig
+++ b/drivers/gpu/trace/Kconfig
@@ -11,3 +11,15 @@ config TRACE_GPU_MEM
 	  Tracepoint availability varies by GPU driver.
 
 	  If in doubt, say "N".
+
+config TRACE_GPU_WORK_PERIOD
+	bool "Enable GPU work period tracepoint"
+	default n
+	help
+	  Choose this option to enable tracepoint for tracking
+	  GPU usage based on the UID. Intended for performance
+	  profiling and required for Android.
+
+	  Tracepoint availability varies by GPU driver.
+
+	  If in doubt, say "N".
-- 
2.49.0



* [PATCH v3 8/8] Handle xe_work_period destruction
  2025-09-19 18:38 [PATCH v3 0/8] [ANDROID]: Add GPU work period support for Xe driver Aakash Deep Sarkar
                   ` (6 preceding siblings ...)
  2025-09-19 18:38 ` [PATCH v3 7/8] drm/xe: Add a Kconfig option for GPU work period Aakash Deep Sarkar
@ 2025-09-19 18:38 ` Aakash Deep Sarkar
  7 siblings, 0 replies; 9+ messages in thread
From: Aakash Deep Sarkar @ 2025-09-19 18:38 UTC (permalink / raw)
  To: intel-xe
  Cc: jeevaka.badrappan, rodrigo.vivi, matthew.brost, carlos.santa,
	matthew.auld, jani.nikula, Aakash Deep Sarkar

This adds the xe_work_period destruction path: iterate over all
entries in the xe->work_period.users xarray, cancel any pending
delayed work, and then destroy the xarray itself.

Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 30d9f2747eab..3b546d92fa03 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -418,6 +418,8 @@ static struct drm_driver driver = {
 static void xe_device_destroy(struct drm_device *dev, void *dummy)
 {
 	struct xe_device *xe = to_xe_device(dev);
+	struct xe_user *user = NULL;
+	unsigned long i;
 
 	xe_bo_dev_fini(&xe->bo_device);
 
@@ -433,6 +435,15 @@ static void xe_device_destroy(struct drm_device *dev, void *dummy)
 	if (xe->destroy_wq)
 		destroy_workqueue(xe->destroy_wq);
 
+
+	mutex_lock(&xe->work_period.lock);
+	xa_for_each(&xe->work_period.users, i, user) {
+		if (cancel_delayed_work_sync(&user->delay_work))
+			xe_user_put(user);
+	}
+	xa_destroy(&xe->work_period.users);
+	mutex_unlock(&xe->work_period.lock);
+
 	ttm_device_fini(&xe->ttm);
 }
 
-- 
2.49.0


