From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Tvrtko Ursulin <tursulin@ursulin.net>,
<dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH v4 8/8] drm/xe/client: Print runtime to fdinfo
Date: Thu, 16 May 2024 12:21:24 -0700
Message-ID: <ZkZctPJtDpuNLZXr@orsosgc001>
In-Reply-To: <20240515214258.59209-9-lucas.demarchi@intel.com>
On Wed, May 15, 2024 at 02:42:58PM -0700, Lucas De Marchi wrote:
>Print the accumulated runtime for client when printing fdinfo.
>Each time a query is done it first does 2 things:
>
>1) loop through all the exec queues for the current client and
> accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
> that, being read from the context image.
>
>2) Read a "GPU timestamp" that can be used for considering "how much GPU
> time has passed" and that has the same unit/refclock as the one
> recording the runtime. RING_TIMESTAMP is used for that via MMIO.
>
>Since for all current platforms RING_TIMESTAMP follows the same
>refclock, just read it once, using any first engine available.
>
>This is exported to userspace as 2 numbers in fdinfo:
>
> drm-cycles-<class>: <RUNTIME>
> drm-total-cycles-<class>: <TIMESTAMP>
>
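For illustration, a minimal userspace sketch of picking these two keys out of an fdinfo line. The helper name parse_cycles_key() and the 16-byte class-name buffer are assumptions made for this example and are not part of the patch; the "key:<tab>value" layout follows the drm_printf() format strings further down.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: match a line of the form "<prefix><class>:\t<value>".
 * @prefix is "drm-cycles-" or "drm-total-cycles-".
 * Returns 1 and fills @class_name and @value on a match, 0 otherwise.
 */
int parse_cycles_key(const char *line, const char *prefix,
		     char class_name[16], uint64_t *value)
{
	size_t n = strlen(prefix);

	/* The key must start with the requested prefix */
	if (strncmp(line, prefix, n) != 0)
		return 0;

	/* Class name runs up to ':', then whitespace, then the counter */
	return sscanf(line + n, "%15[^:]: %" SCNu64, class_name, value) == 2;
}
```

A reader would call this once per line of /proc/<pid>/fdinfo/<fd>, with each of the two prefixes, and keep the values per engine class.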
>Userspace is expected to collect at least 2 samples, from which it can
>calculate the client engine busyness as:
>
>             RUNTIME1 - RUNTIME0
>  busyness = ---------------------
>                   T1 - T0
>
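The formula above can be sketched in userspace as follows. The struct and function names are hypothetical, and returning -1.0 for unusable sample pairs is a choice made for this example only.

```c
#include <stdint.h>

/* One fdinfo sample for a single engine class (hypothetical type) */
struct cycles_sample {
	uint64_t cycles;	/* drm-cycles-<class> */
	uint64_t total_cycles;	/* drm-total-cycles-<class> */
};

/*
 * Busyness between two samples, s0 taken before s1.
 * Returns -1.0 if the GPU timestamp did not advance or the runtime
 * went backwards, since no ratio can be computed in those cases.
 */
double engine_busyness(const struct cycles_sample *s0,
		       const struct cycles_sample *s1)
{
	uint64_t d_runtime, d_total;

	if (s1->total_cycles <= s0->total_cycles)
		return -1.0;
	if (s1->cycles < s0->cycles)
		return -1.0;

	d_runtime = s1->cycles - s0->cycles;		/* RUNTIME1 - RUNTIME0 */
	d_total = s1->total_cycles - s0->total_cycles;	/* T1 - T0 */

	return (double)d_runtime / (double)d_total;
}
```

For a class that reports drm-engine-capacity-<class> greater than 1, the result may presumably exceed 1.0, so a tool would likely divide by the capacity to get a per-class percentage.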
>Since drm-cycles-<class> always starts at 0, it's also possible to know
>if an engine was ever used by a client.
>
>It's expected that userspace will read any 2 samples every few seconds.
>Given the update frequency of the counters involved and that
>CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
>around (assuming 100% utilization) after ~200s. The wraparound is not
>perceived by userspace (since it's just accumulated for all the
>exec_queues in a 64-bit counter) but the measurement will not be
>accurate if the samples are too far apart.
>
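A small sketch of why a single wrap is harmless while longer gaps are not. This is not the driver code: the names are hypothetical, and the 19.2 MHz tick rate mentioned below is an assumed value used only to show that the ~200s figure is plausible.

```c
#include <stdint.h>

/* Hypothetical per-queue state, for illustration only */
struct queue_runtime {
	uint32_t last_ts;	/* last 32-bit context timestamp seen */
	uint64_t total;		/* accumulated cycles, 64-bit */
};

/* Fold a new 32-bit timestamp reading into the 64-bit total */
void queue_runtime_update(struct queue_runtime *qr, uint32_t now)
{
	/*
	 * Unsigned 32-bit subtraction is modulo 2^32, so the delta is
	 * correct across one wrap. If the counter wrapped more than once
	 * since last_ts, the lost multiples of 2^32 cannot be recovered.
	 */
	uint32_t delta = now - qr->last_ts;

	qr->total += delta;
	qr->last_ts = now;
}

/* Seconds until a 32-bit counter wraps at the given tick rate */
double wrap_period_seconds(double ticks_per_second)
{
	return 4294967296.0 / ticks_per_second;
}
```

With an assumed 19.2 MHz tick rate, wrap_period_seconds(19.2e6) is roughly 224 seconds, in line with the ~200s above. Sampling more often than that keeps every delta within one wrap.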
>This could be mitigated by adding a workqueue to accumulate the counters
>every so often, but it's additional complexity for something that is
>done already by userspace every few seconds in tools like gputop (from
>igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
>per minute or more.
>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
LGTM,
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Thanks,
Umesh
>---
> Documentation/gpu/drm-usage-stats.rst | 21 +++-
> Documentation/gpu/xe/index.rst | 1 +
> Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++
> drivers/gpu/drm/xe/xe_drm_client.c | 121 +++++++++++++++++++-
> 4 files changed, 150 insertions(+), 3 deletions(-)
> create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst
>
>diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
>index 6dc299343b48..a80f95ca1b2f 100644
>--- a/Documentation/gpu/drm-usage-stats.rst
>+++ b/Documentation/gpu/drm-usage-stats.rst
>@@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what
> was previously read, userspace is expected to stay with that larger previous
> value until a monotonic update is seen.
>
>+- drm-total-cycles-<keystr>: <uint>
>+
>+Engine identifier string must be the same as the one specified in the
>+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
>+given engine.
>+
>+This is a timestamp in an unspecified GPU unit that matches the update rate
>+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
>+utilization can be calculated entirely on the GPU clock domain, without
>+considering the CPU sleep time between 2 samples.
>+
>+A driver may implement either this key or drm-maxfreq-<keystr>, but not both.
>+
> - drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
>
> Engine identifier string must be the same as the one specified in the
>@@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
> time active without considering what frequency the engine is operating as a
> percentage of its maximum frequency.
>
>+A driver may implement either this key or drm-total-cycles-<keystr>, but not
>+both.
>+
> Memory
> ^^^^^^
>
>@@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers.
> Driver specific implementations
> -------------------------------
>
>-:ref:`i915-usage-stats`
>-:ref:`panfrost-usage-stats`
>+* :ref:`i915-usage-stats`
>+* :ref:`panfrost-usage-stats`
>+* :ref:`xe-usage-stats`
>diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
>index c224ecaee81e..3f07aa3b5432 100644
>--- a/Documentation/gpu/xe/index.rst
>+++ b/Documentation/gpu/xe/index.rst
>@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
> xe_firmware
> xe_tile
> xe_debugging
>+ xe-drm-usage-stats.rst
>diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst
>new file mode 100644
>index 000000000000..482d503ae68a
>--- /dev/null
>+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
>@@ -0,0 +1,10 @@
>+.. SPDX-License-Identifier: GPL-2.0+
>+
>+.. _xe-usage-stats:
>+
>+========================================
>+Xe DRM client usage stats implementation
>+========================================
>+
>+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
>+ :doc: DRM Client usage stats
>diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
>index 08f0b7c95901..952b0cc87708 100644
>--- a/drivers/gpu/drm/xe/xe_drm_client.c
>+++ b/drivers/gpu/drm/xe/xe_drm_client.c
>@@ -2,6 +2,7 @@
> /*
> * Copyright © 2023 Intel Corporation
> */
>+#include "xe_drm_client.h"
>
> #include <drm/drm_print.h>
> #include <drm/xe_drm.h>
>@@ -12,9 +13,66 @@
> #include "xe_bo.h"
> #include "xe_bo_types.h"
> #include "xe_device_types.h"
>-#include "xe_drm_client.h"
>+#include "xe_exec_queue.h"
>+#include "xe_force_wake.h"
>+#include "xe_gt.h"
>+#include "xe_hw_engine.h"
>+#include "xe_pm.h"
> #include "xe_trace.h"
>
>+/**
>+ * DOC: DRM Client usage stats
>+ *
>+ * The drm/xe driver implements the DRM client usage stats specification as
>+ * documented in :ref:`drm-client-usage-stats`.
>+ *
>+ * Example of the output showing the implemented key value pairs and entirety of
>+ * the currently possible format options:
>+ *
>+ * ::
>+ *
>+ * pos: 0
>+ * flags: 0100002
>+ * mnt_id: 26
>+ * ino: 685
>+ * drm-driver: xe
>+ * drm-client-id: 3
>+ * drm-pdev: 0000:03:00.0
>+ * drm-total-system: 0
>+ * drm-shared-system: 0
>+ * drm-active-system: 0
>+ * drm-resident-system: 0
>+ * drm-purgeable-system: 0
>+ * drm-total-gtt: 192 KiB
>+ * drm-shared-gtt: 0
>+ * drm-active-gtt: 0
>+ * drm-resident-gtt: 192 KiB
>+ * drm-total-vram0: 23992 KiB
>+ * drm-shared-vram0: 16 MiB
>+ * drm-active-vram0: 0
>+ * drm-resident-vram0: 23992 KiB
>+ * drm-total-stolen: 0
>+ * drm-shared-stolen: 0
>+ * drm-active-stolen: 0
>+ * drm-resident-stolen: 0
>+ * drm-cycles-rcs: 28257900
>+ * drm-total-cycles-rcs: 7655183225
>+ * drm-cycles-bcs: 0
>+ * drm-total-cycles-bcs: 7655183225
>+ * drm-cycles-vcs: 0
>+ * drm-total-cycles-vcs: 7655183225
>+ * drm-engine-capacity-vcs: 2
>+ * drm-cycles-vecs: 0
>+ * drm-total-cycles-vecs: 7655183225
>+ * drm-engine-capacity-vecs: 2
>+ * drm-cycles-ccs: 0
>+ * drm-total-cycles-ccs: 7655183225
>+ * drm-engine-capacity-ccs: 4
>+ *
>+ * Possible `drm-cycles-` key names are: `rcs`, `ccs`, `bcs`, `vcs`, `vecs` and
>+ * `other`.
>+ */
>+
> /**
> * xe_drm_client_alloc() - Allocate drm client
> * @void: No arg
>@@ -179,6 +237,66 @@ static void show_meminfo(struct drm_printer *p, struct drm_file *file)
> }
> }
>
>+static void show_runtime(struct drm_printer *p, struct drm_file *file)
>+{
>+	unsigned long class, i, gt_id, capacity[XE_ENGINE_CLASS_MAX] = { };
>+	struct xe_file *xef = file->driver_priv;
>+	struct xe_device *xe = xef->xe;
>+	struct xe_gt *gt;
>+	struct xe_hw_engine *hwe;
>+	struct xe_exec_queue *q;
>+	u64 gpu_timestamp;
>+
>+	xe_pm_runtime_get(xe);
>+
>+	/* Accumulate all the exec queues from this client */
>+	mutex_lock(&xef->exec_queue.lock);
>+	xa_for_each(&xef->exec_queue.xa, i, q)
>+		xe_exec_queue_update_runtime(q);
>+	mutex_unlock(&xef->exec_queue.lock);
>+
>+	/* Get the total GPU cycles */
>+	for_each_gt(gt, xe, gt_id) {
>+		hwe = xe_gt_any_hw_engine(gt);
>+		if (!hwe)
>+			continue;
>+
>+		xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>+		gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
>+		xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>+		break;
>+	}
>+
>+	if (unlikely(!hwe))
>+		return;
>+
>+	for (class = 0; class < XE_ENGINE_CLASS_MAX; class++) {
>+		const char *class_name;
>+
>+		for_each_gt(gt, xe, gt_id)
>+			capacity[class] += gt->user_engines.instances_per_class[class];
>+
>+		/*
>+		 * Engines may be fused off or not exposed to userspace. Don't
>+		 * return anything if this entire class is not available
>+		 */
>+		if (!capacity[class])
>+			continue;
>+
>+		class_name = xe_hw_engine_class_to_str(class);
>+		drm_printf(p, "drm-cycles-%s:\t%llu\n",
>+			   class_name, xef->runtime[class]);
>+		drm_printf(p, "drm-total-cycles-%s:\t%llu\n",
>+			   class_name, gpu_timestamp);
>+
>+		if (capacity[class] > 1)
>+			drm_printf(p, "drm-engine-capacity-%s:\t%lu\n",
>+				   class_name, capacity[class]);
>+	}
>+
>+	xe_pm_runtime_put(xe);
>+}
>+
> /**
> * xe_drm_client_fdinfo() - Callback for fdinfo interface
> * @p: The drm_printer ptr
>@@ -192,5 +310,6 @@ static void show_meminfo(struct drm_printer *p, struct drm_file *file)
> void xe_drm_client_fdinfo(struct drm_printer *p, struct drm_file *file)
> {
> 	show_meminfo(p, file);
>+	show_runtime(p, file);
> }
> #endif
>--
>2.43.0
>