From: Jani Nikula <jani.nikula@linux.intel.com>
To: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>,
intel-xe@lists.freedesktop.org
Cc: jeevaka.badrappan@intel.com, rodrigo.vivi@intel.com,
matthew.brost@intel.com, carlos.santa@intel.com,
matthew.auld@intel.com,
Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
Subject: Re: [PATCH v4 6/9] drm/xe: Implement xe_work_period_worker
Date: Fri, 26 Sep 2025 14:31:08 +0300
Message-ID: <3cf9a71439b5d79751f6b7879bdc6b917d023d2d@intel.com>
In-Reply-To: <20250926104521.1815428-7-aakash.deep.sarkar@intel.com>
On Fri, 26 Sep 2025, Aakash Deep Sarkar <aakash.deep.sarkar@intel.com> wrote:
> The work of collecting the GPU run time for a given
> xe_user and emitting its event is done by the
> xe_work_period_worker kworker. When a new xe_user is
> created, we also start a delayed kworker with an
> execution delay of 500 ms. After completing its work,
> the kworker reschedules itself for the next execution,
> for as long as the reference to the xe_user is valid.
>
> During each execution cycle the xe_work_period_worker
> iterates over all the xe files in xe_user::filelist
> and accumulates their corresponding GPU runtime into
> xe_user::active_duration_ns, while also updating each
> xe_file::active_duration_ns. The total runtime for
> this uid in the current sampling period is the delta
> between the previous and the current
> xe_user::active_duration_ns.
>
> We also record the current timestamp at the end of
> each invocation of the xe_work_period_worker function
> in xe_user::last_timestamp_ns. The sampling period for
> this uid is the delta between the previous and the
> current timestamp.
>
> Signed-off-by: Aakash Deep Sarkar <aakash.deep.sarkar@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.c | 13 ++--
> drivers/gpu/drm/xe/xe_pm.c | 5 ++
> drivers/gpu/drm/xe/xe_user.c | 127 +++++++++++++++++++++++++++++++--
> drivers/gpu/drm/xe/xe_user.h | 21 ++++--
> 4 files changed, 149 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 837c23784388..5569a27abb09 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -81,11 +81,9 @@ static int xe_file_open(struct drm_device *dev, struct drm_file *file)
> {
> struct xe_device *xe = to_xe_device(dev);
> struct xe_drm_client *client;
> - struct xe_user *user;
> struct xe_file *xef;
> int ret = -ENOMEM;
> int uid = -EINVAL;
> - u32 idx;
> struct task_struct *task = NULL;
> const struct cred *cred = NULL;
>
> @@ -142,11 +140,12 @@ static void xe_file_destroy(struct kref *ref)
> xe_drm_client_put(xef->client);
> kfree(xef->process_name);
>
> - mutex_lock(&xef->user->filelist_lock);
> - list_del(&xef->user_link);
> - mutex_unlock(&xef->user->filelist_lock);
> -
> - xe_user_put(xef->user);
> + if (xef->user) {
> + mutex_lock(&xef->user->lock);
> + list_del(&xef->user_link);
> + xe_user_put(xef->user);
> + mutex_unlock(&xef->user->lock);
> + }
> kfree(xef);
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index b7e3094f8acf..c7add2616189 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -26,6 +26,7 @@
> #include "xe_pxp.h"
> #include "xe_sriov_vf_ccs.h"
> #include "xe_trace.h"
> +#include "xe_user.h"
> #include "xe_vm.h"
> #include "xe_wa.h"
>
> @@ -598,6 +599,8 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>
> xe_i2c_pm_suspend(xe);
>
> + xe_user_cancel_workers(xe);
> +
> xe_rpm_lockmap_release(xe);
> xe_pm_write_callback_task(xe, NULL);
> return 0;
> @@ -650,6 +653,8 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>
> xe_i2c_pm_resume(xe, xe->d3cold.allowed);
>
> + xe_user_resume_workers(xe);
> +
> xe_irq_resume(xe);
>
> for_each_gt(gt, xe, id)
> diff --git a/drivers/gpu/drm/xe/xe_user.c b/drivers/gpu/drm/xe/xe_user.c
> index 846c6451140b..19b28bcada0f 100644
> --- a/drivers/gpu/drm/xe/xe_user.c
> +++ b/drivers/gpu/drm/xe/xe_user.c
> @@ -6,17 +6,95 @@
> #include <linux/slab.h>
> #include <drm/drm_drv.h>
>
> +#include "xe_assert.h"
> +#include "xe_device_types.h"
> +#include "xe_exec_queue.h"
> +#include "xe_pm.h"
> #include "xe_user.h"
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/gpu_work_period.h>
> +
> +static inline void schedule_next_work(struct xe_device *xe, unsigned int id)
> +{
> + struct xe_user *user;
> +
> + mutex_lock(&xe->work_period.lock);
> + user = xa_load(&xe->work_period.users, id);
> + if (user && xe_user_get_unless_zero(user))
> + schedule_delayed_work(&user->delay_work,
> + msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL));
> + mutex_unlock(&xe->work_period.lock);
> +}
> /**
> * worker thread to emit gpu work period event for this xe user
> * @work: work instance for this xe user
> *
> * Return: void
> */
> -static inline void work_period_worker(struct work_struct *work)
> +static void xe_work_period_worker(struct work_struct *work)
> {
> - //TODO: Implement this worker
> + struct xe_user *user = container_of(work, struct xe_user, delay_work.work);
> + struct xe_device *xe = user->xe;
> + struct xe_file *xef;
> + struct xe_exec_queue *q;
> +
> + /*
> + * The GPU work period event requires the following parameters
> + *
> + * gpuid: GPU index in case the platform has more than one GPU
> + * uid: user id of the app
> + * start_time: start time for the sampling period in nanosecs
> + * end_time: end time for the sampling period in nanosecs
> + * active_duration: Total runtime in nanosecs for this uid in
> + * the current sampling period.
> + */
> + u32 gpuid = 0, uid = user->uid, id = user->id;
> + u64 start_time, end_time, active_duration;
> + u64 last_active_duration, last_timestamp;
> + unsigned long i;
> +
> + mutex_lock(&user->lock);
> +
> + // Save the last recorded active duration and timestamp
> + last_active_duration = user->active_duration_ns;
> + last_timestamp = user->last_timestamp_ns;
> +
> + if (xe_pm_runtime_get_if_active(xe)) {
> +
> + list_for_each_entry(xef, &user->filelist, user_link) {
> +
> + wait_var_event(&xef->exec_queue.pending_removal,
> + !atomic_read(&xef->exec_queue.pending_removal));
> +
> + /* Accumulate all the exec queues from this file */
> + mutex_lock(&xef->exec_queue.lock);
> + xa_for_each(&xef->exec_queue.xa, i, q) {
> + xe_exec_queue_get(q);
> + mutex_unlock(&xef->exec_queue.lock);
> +
> + xe_exec_queue_update_run_ticks(q);
> +
> + mutex_lock(&xef->exec_queue.lock);
> + xe_exec_queue_put(q);
> + }
> + mutex_unlock(&xef->exec_queue.lock);
> + user->active_duration_ns += xef->active_duration_ns;
> + }
> +
> + xe_pm_runtime_put(xe);
> +
> + start_time = last_timestamp + 1;
> + end_time = ktime_get_raw_ns();
> + active_duration = user->active_duration_ns - last_active_duration;
> + trace_gpu_work_period(gpuid, uid, start_time, end_time, active_duration);
> + user->last_timestamp_ns = end_time;
> + xe_user_put(user);
> + }
> +
> + mutex_unlock(&user->lock);
> +
> + schedule_next_work(xe, id);
> }
>
> /**
> @@ -38,9 +116,9 @@ static struct xe_user *xe_user_alloc(void)
> return NULL;
>
> kref_init(&user->refcount);
> - mutex_init(&user->filelist_lock);
> + mutex_init(&user->lock);
> INIT_LIST_HEAD(&user->filelist);
> - INIT_WORK(&user->work, work_period_worker);
> + INIT_DELAYED_WORK(&user->delay_work, xe_work_period_worker);
> return user;
> }
>
> @@ -120,12 +198,49 @@ int xe_user_init(struct xe_device *xe, struct xe_file *xef, unsigned int uid)
>
> user->id = idx;
> drm_dev_get(&xe->drm);
> +
> + xe_user_get(user);
> + if (!schedule_delayed_work(&user->delay_work,
> + msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL)))
> + xe_user_put(user);
> }
>
> - mutex_lock(&user->filelist_lock);
> + mutex_lock(&user->lock);
> list_add(&xef->user_link, &user->filelist);
> - mutex_unlock(&user->filelist_lock);
> + mutex_unlock(&user->lock);
> xef->user = user;
>
> return 0;
> }
> +
> +void xe_user_cancel_workers(struct xe_device *xe)
> +{
> + struct xe_user *user = NULL;
> + unsigned long i = 0;
> +
> + mutex_lock(&xe->work_period.lock);
> + xa_for_each(&xe->work_period.users, i, user) {
> + if (user && xe_user_get_unless_zero(user)) {
> + cancel_delayed_work_sync(&user->delay_work);
> + xe_user_put(user);
> + }
> + }
> + mutex_unlock(&xe->work_period.lock);
> +}
> +
> +void xe_user_resume_workers(struct xe_device *xe)
> +{
> + struct xe_user *user = NULL;
> + unsigned long i = 0;
> +
> + mutex_lock(&xe->work_period.lock);
> + xa_for_each(&xe->work_period.users, i, user) {
> + if (user && xe_user_get_unless_zero(user)) {
> + if (!schedule_delayed_work(&user->delay_work,
> + msecs_to_jiffies(XE_WORK_PERIOD_INTERVAL)))
> + xe_user_put(user);
> + }
> + }
> + mutex_unlock(&xe->work_period.lock);
> +}
> +
> diff --git a/drivers/gpu/drm/xe/xe_user.h b/drivers/gpu/drm/xe/xe_user.h
> index b13130cc9492..ded816be7334 100644
> --- a/drivers/gpu/drm/xe/xe_user.h
> +++ b/drivers/gpu/drm/xe/xe_user.h
> @@ -11,9 +11,11 @@
> #include <linux/mutex.h>
> #include <linux/workqueue.h>
>
> -#include "xe_device.h"
> +#include "xe_device_types.h"
This shouldn't be needed, and neither should the xe_device.h include it
replaces; a forward declaration of struct xe_device should be enough for
the pointer member.
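
To illustrate (standalone sketch with hypothetical names, not code from the
tree), a forward declaration is all a header needs when it only stores a
pointer:

```c
/* Forward declaration: the type is declared but never defined here. */
struct xe_device;

/* Hypothetical struct mirroring the header's situation: a pointer
 * member compiles fine without the full struct definition in scope.
 */
struct xe_user_example {
	struct xe_device *xe;
};
```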
>
>
> +#define XE_WORK_PERIOD_INTERVAL 500
> +
> /**
> * This is a per process/user id structure for a xe device
> * client. It is allocated when a new process/app opens the
> @@ -32,9 +34,9 @@ struct xe_user {
> struct xe_device *xe;
>
> /**
> - * @filelist_lock: lock protecting the filelist
> + * @filelist_lock: lock protecting this structure
> */
> - struct mutex filelist_lock;
> + struct mutex lock;
>
> /**
> * @filelist: list of xe files belonging to this xe user
> @@ -45,7 +47,7 @@ struct xe_user {
> * @work: work to emit the gpu work period event for this
> * xe user
> */
> - struct work_struct work;
> + struct delayed_work delay_work;
>
> /**
> * @id: index of this user into the xe device users array
> @@ -72,6 +74,17 @@ struct xe_user {
>
> int xe_user_init(struct xe_device *xe, struct xe_file *xef, unsigned int uid);
>
> +void xe_user_cancel_workers(struct xe_device *xe);
> +
> +void xe_user_resume_workers(struct xe_device *xe);
> +
> +static inline struct xe_user *
> +xe_user_get_unless_zero(struct xe_user *user)
> +{
> + if (kref_get_unless_zero(&user->refcount))
> + return user;
> + return NULL;
> +}
This is only ever used in xe_user.c. There's no need to add it to a
header.
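
Moved into xe_user.c it would look roughly like the sketch below (userspace
stand-in for kref, hypothetical names; only the shape of the helper is the
point):

```c
#include <stddef.h>

/* Userspace stand-in for struct kref / kref_get_unless_zero(). */
struct ref {
	int count;
};

static int ref_get_unless_zero(struct ref *r)
{
	if (r->count == 0)
		return 0;	/* object already on its way out */
	r->count++;
	return 1;
}

struct user_obj {
	struct ref refcount;
};

/* Static helper, private to the .c file that is its only caller. */
static struct user_obj *user_get_unless_zero(struct user_obj *u)
{
	return ref_get_unless_zero(&u->refcount) ? u : NULL;
}
```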
>
> static inline struct xe_user *
> xe_user_get(struct xe_user *user)
--
Jani Nikula, Intel