public inbox for igt-dev@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP
@ 2026-01-30  9:53 Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 1/5] tools: Rename tools/gputop to tools/gputop.src Soham Purkait
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

       During engine initialization for a specific device, the helper
functions require an open card_fd to query engines and configurations
needed to compute engine busyness via PMU. However, KMS tests require
that no other DRM client be active, and keeping card_fd open causes
gputop to appear as a DRM client. Therefore, card_fd is closed immediately
after discovery.
     Additionally, this patch removes the lib/xe/* dependency and perform
engine discovery using direct ioctl calls, simplifying the implementation
and reducing external dependencies, while also providing a fallback to
report only client busyness when engine busyness counters are not
accessible to non-root users.
  
  GPUTOP engine busyness is also capable of running without superuser access
through setting CAP_PERFMON or an appropriate perf_event_paranoid setting,
Otherwise user could continue with only GPU client busyness if required.

Steps to enable engine busyness to run without root (using CAP_PERFMON):
$ cd /path/to/igt-gpu-tools/
$ sudo setcap cap_perfmon=+ep $(pwd)/build/tools/gputop
$ sudo sh -c "echo $(pwd)/build/lib > /etc/ld.so.conf.d/lib-igt.conf"
$ sudo ldconfig
Steps to revert once done:
$ sudo setcap cap_perfmon=-ep $(pwd)/build/tools/gputop
$ sudo rm /etc/ld.so.conf.d/lib-igt.conf
$ sudo ldconfig

Steps to enable engine busyness to run without root (using perf_event_paranoid):
# Save current value
$ orig_val=$(sysctl -n kernel.perf_event_paranoid)
# Set the value to allow running GPUTOP without root privileges
$ sudo sysctl -w kernel.perf_event_paranoid=-1
Steps to revert once done:
$ sudo sysctl -w kernel.perf_event_paranoid=$orig_val

Soham Purkait (5):
  tools: Rename tools/gputop to tools/gputop.src
  tools/gputop.src/utils: Add clamp macro to remove dependency on
    lib/xe/*
  tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close
    card_fd and use direct ioctl calls
  tools/gputop.src/gputop: Enable support for multiple GPUs and
    instances
  tools/gputop.src/gputop: Add command line option for device filter

 tools/{ => gputop.src}/gputop.c          | 286 +++++++++++++++++++----
 tools/gputop.src/meson.build             |   1 +
 tools/{gputop => gputop.src}/utils.c     |   0
 tools/{gputop => gputop.src}/utils.h     |  11 +-
 tools/{gputop => gputop.src}/xe_gputop.c | 100 ++++++--
 tools/{gputop => gputop.src}/xe_gputop.h |   3 +-
 tools/meson.build                        |  12 +-
 7 files changed, 344 insertions(+), 69 deletions(-)
 rename tools/{ => gputop.src}/gputop.c (58%)
 create mode 100644 tools/gputop.src/meson.build
 rename tools/{gputop => gputop.src}/utils.c (100%)
 rename tools/{gputop => gputop.src}/utils.h (89%)
 rename tools/{gputop => gputop.src}/xe_gputop.c (83%)
 rename tools/{gputop => gputop.src}/xe_gputop.h (95%)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH i-g-t v7 1/5] tools: Rename tools/gputop to tools/gputop.src
  2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
@ 2026-01-30  9:53 ` Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 2/5] tools/gputop.src/utils: Add clamp macro to remove dependency on lib/xe/* Soham Purkait
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

The GPUTOP source directory is renamed to gputop.src
under tools/ path.

v1:
 - Add meson build files accordingly to generate
   the binary under tools. (Kamil)

Signed-off-by: Soham Purkait <soham.purkait@intel.com>
Reviewed-by: Kamil Konieczny <kamil.konieczny@linux.intel.com>
---
 tools/{ => gputop.src}/gputop.c          |  0
 tools/gputop.src/meson.build             |  1 +
 tools/{gputop => gputop.src}/utils.c     |  0
 tools/{gputop => gputop.src}/utils.h     |  0
 tools/{gputop => gputop.src}/xe_gputop.c |  0
 tools/{gputop => gputop.src}/xe_gputop.h |  0
 tools/meson.build                        | 11 ++++++-----
 7 files changed, 7 insertions(+), 5 deletions(-)
 rename tools/{ => gputop.src}/gputop.c (100%)
 create mode 100644 tools/gputop.src/meson.build
 rename tools/{gputop => gputop.src}/utils.c (100%)
 rename tools/{gputop => gputop.src}/utils.h (100%)
 rename tools/{gputop => gputop.src}/xe_gputop.c (100%)
 rename tools/{gputop => gputop.src}/xe_gputop.h (100%)

diff --git a/tools/gputop.c b/tools/gputop.src/gputop.c
similarity index 100%
rename from tools/gputop.c
rename to tools/gputop.src/gputop.c
diff --git a/tools/gputop.src/meson.build b/tools/gputop.src/meson.build
new file mode 100644
index 000000000..ec39f4c7a
--- /dev/null
+++ b/tools/gputop.src/meson.build
@@ -0,0 +1 @@
+gputop_src = files('gputop.c')
diff --git a/tools/gputop/utils.c b/tools/gputop.src/utils.c
similarity index 100%
rename from tools/gputop/utils.c
rename to tools/gputop.src/utils.c
diff --git a/tools/gputop/utils.h b/tools/gputop.src/utils.h
similarity index 100%
rename from tools/gputop/utils.h
rename to tools/gputop.src/utils.h
diff --git a/tools/gputop/xe_gputop.c b/tools/gputop.src/xe_gputop.c
similarity index 100%
rename from tools/gputop/xe_gputop.c
rename to tools/gputop.src/xe_gputop.c
diff --git a/tools/gputop/xe_gputop.h b/tools/gputop.src/xe_gputop.h
similarity index 100%
rename from tools/gputop/xe_gputop.h
rename to tools/gputop.src/xe_gputop.h
diff --git a/tools/meson.build b/tools/meson.build
index 8185ba160..521607a4c 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -70,11 +70,6 @@ if libudev.found()
 		   install : true)
 endif
 
-executable('gputop', 'gputop.c',
-           install : true,
-           install_rpath : bindir_rpathdir,
-           dependencies : [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math])
-
 intel_l3_parity_src = [ 'intel_l3_parity.c', 'intel_l3_udev_listener.c' ]
 executable('intel_l3_parity', sources : intel_l3_parity_src,
 	   dependencies : tool_deps,
@@ -123,3 +118,9 @@ endif
 subdir('i915-perf')
 subdir('xe-perf')
 subdir('null_state_gen')
+
+subdir('gputop.src')
+executable('gputop', sources : gputop_src,
+	   install : true,
+	   install_rpath : bindir_rpathdir,
+	   dependencies : [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math],)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH i-g-t v7 2/5] tools/gputop.src/utils: Add clamp macro to remove dependency on lib/xe/*
  2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 1/5] tools: Rename tools/gputop to tools/gputop.src Soham Purkait
@ 2026-01-30  9:53 ` Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls Soham Purkait
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

Add clamp helper macro to utils.h to remove dependency on lib/xe/*

Signed-off-by: Soham Purkait <soham.purkait@intel.com>
Reviewed-by: Kamil Konieczny <kamil.konieczny@linux.intel.com>
---
 tools/gputop.src/utils.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/gputop.src/utils.h b/tools/gputop.src/utils.h
index 00befed56..7f3e2ac76 100644
--- a/tools/gputop.src/utils.h
+++ b/tools/gputop.src/utils.h
@@ -6,7 +6,6 @@
 #ifndef COMMON_GPUTOP_H
 #define COMMON_GPUTOP_H
 
-#include <glib.h>
 #include <math.h>
 #include <stdbool.h>
 #include <stdio.h>
@@ -19,6 +18,16 @@
 
 #define PERCLIENT_ENGINE_WIDTH 8
 
+#ifndef clamp
+#define clamp(val, lo, hi)			\
+({						\
+	__typeof__(val) _v = (val);		\
+	__typeof__(lo)  _lo = (lo);		\
+	__typeof__(hi)  _hi = (hi);		\
+	_v < _lo ? _lo : (_v > _hi ? _hi : _v);	\
+})
+#endif
+
 /**
  * struct gputop_driver
  *
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls
  2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 1/5] tools: Rename tools/gputop to tools/gputop.src Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 2/5] tools/gputop.src/utils: Add clamp macro to remove dependency on lib/xe/* Soham Purkait
@ 2026-01-30  9:53 ` Soham Purkait
  2026-02-12  1:41   ` Belgaumkar, Vinay
  2026-01-30  9:53 ` [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances Soham Purkait
  2026-01-30  9:53 ` [PATCH i-g-t v7 5/5] tools/gputop.src/gputop: Add command line option for device filter Soham Purkait
  4 siblings, 1 reply; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

During GPUTOP engine initialization, the opened card fd is required to
query engines and obtain configurations for PMU busyness calculation, but
leaving it open causes the process to appear as a DRM client, which
conflicts with KMS test requirements. To avoid this, card_fd is closed
immediately after engine discovery, and engine queries are refactored to
remove the lib/xe/* dependency in favor of direct ioctl calls, simplifying
the implementation and reducing igt test library dependencies.

v1:
 - Initialize pmu_device_obj to null.
v2:
 - Remove dependency on lib/xe/* libraries. (Kamil)

Fixes: c8106465683f ("tools/gputop/xe_gputop: Add gputop support for xe specific devices")
Signed-off-by: Soham Purkait <soham.purkait@intel.com>
---
 tools/gputop.src/xe_gputop.c | 100 ++++++++++++++++++++++++++++-------
 tools/gputop.src/xe_gputop.h |   3 +-
 2 files changed, 81 insertions(+), 22 deletions(-)

diff --git a/tools/gputop.src/xe_gputop.c b/tools/gputop.src/xe_gputop.c
index bb2caa6ea..7793cd62f 100644
--- a/tools/gputop.src/xe_gputop.c
+++ b/tools/gputop.src/xe_gputop.c
@@ -1,8 +1,10 @@
 // SPDX-License-Identifier: MIT
 /*
- * Copyright © 2025 Intel Corporation
+ * Copyright © 2025-2026 Intel Corporation
  */
 
+#include <sys/ioctl.h>
+
 #include "xe_gputop.h"
 
 #define engine_ptr(pmu_device, n) (&(pmu_device)->engine + (n))
@@ -92,6 +94,7 @@ void xe_gputop_init(void *ptr, int index,
 
 	obj = ((struct xe_gputop *)ptr) + index;
 	obj->card = card;
+	obj->pmu_device_obj = NULL;
 }
 
 static int pmu_format_shift(int xe, const char *name)
@@ -121,20 +124,54 @@ static int engine_cmp(const void *__a, const void *__b)
 		return a->drm_xe_engine.engine_instance - b->drm_xe_engine.engine_instance;
 }
 
+static int engine_query(int fd, struct drm_xe_query_engines **engine_q)
+{
+	struct drm_xe_device_query q;
+	int ret = 0, num_eng;
+
+	memset(&q, 0, sizeof(q));
+	q.query = DRM_XE_DEVICE_QUERY_ENGINES;
+
+	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
+	if (ret != 0) {
+		if ((errno != ENOSPC && errno != E2BIG) || q.size == 0)
+			return -1;
+	}
+	if (q.size == 0)
+		return 0;
+
+	num_eng = (int)((q.size - sizeof(struct drm_xe_query_engines)) /
+			sizeof(struct drm_xe_engine_class_instance));
+
+	*engine_q = (struct drm_xe_query_engines *)calloc(1, q.size);
+	if (!*engine_q)
+		return -1;
+
+	q.data = (uintptr_t)(*engine_q);
+
+	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
+	if (ret != 0) {
+		free(*engine_q);
+		return -1;
+	}
+
+	return num_eng;
+}
+
 void *xe_populate_engines(const void *obj, int index)
 {
 	struct xe_gputop *ptr = ((struct xe_gputop *)obj) + index;
 	struct igt_device_card *card = ptr->card;
 	uint64_t engine_active_config, engine_total_config;
 	uint64_t engine_class, engine_instance, gt_shift;
-	struct drm_xe_engine_class_instance *hwe;
+	struct drm_xe_query_engines *engine_q;
 	struct xe_pmu_device *engines;
 	char device[30];
-	int ret = 0;
-	int card_fd;
+	int ret = 0, num_eng;
+	int card_fd = -1;
 
 	if (!card || !strlen(card->card) || !strlen(card->render))
-		return NULL;
+		goto err;
 
 	if (strlen(card->card)) {
 		card_fd = igt_open_card(card);
@@ -142,39 +179,51 @@ void *xe_populate_engines(const void *obj, int index)
 		card_fd = igt_open_render(card);
 	} else {
 		fprintf(stderr, "Failed to detect device!\n");
-		return NULL;
+		goto err;
 	}
-	xe_device_get(card_fd);
+
+	num_eng = engine_query(card_fd, &engine_q);
+	if (num_eng <= 0) {
+		fprintf(stderr, "Engine query failed!\n");
+		goto err;
+	}
+
 	engines = malloc(sizeof(struct xe_pmu_device) +
-			 xe_number_engines(card_fd) * sizeof(struct xe_engine));
+			 num_eng * sizeof(struct xe_engine));
 	if (!engines)
-		return NULL;
+		goto err;
 
 	memset(engines, 0, sizeof(struct xe_pmu_device) +
-	       xe_number_engines(card_fd) * sizeof(struct xe_engine));
+	       num_eng * sizeof(struct xe_engine));
 
 	engines->num_engines = 0;
 	gt_shift = pmu_format_shift(card_fd, "gt");
 	engine_class = pmu_format_shift(card_fd, "engine_class");
 	engine_instance = pmu_format_shift(card_fd, "engine_instance");
 	xe_perf_device(card_fd, device, sizeof(device));
+	close(card_fd);
+	card_fd = -1;
+
 	engines->device = strdup(device);
 	ret = perf_event_config(device, "engine-active-ticks", &engine_active_config);
 	if (ret < 0)
-		return NULL;
+		goto err;
 
 	ret = perf_event_config(device, "engine-total-ticks", &engine_total_config);
 	if (ret < 0)
-		return NULL;
+		goto err;
 
-	xe_for_each_engine(card_fd, hwe) {
+	while (engines->num_engines < engine_q->num_engines) {
 		uint64_t  param_config;
 		struct xe_engine *engine;
+		struct drm_xe_engine_class_instance hwe;
+
+		hwe = engine_q->engines[engines->num_engines].instance;
 
 		engine = engine_ptr(engines, engines->num_engines);
-		param_config = (uint64_t)hwe->gt_id << gt_shift | hwe->engine_class << engine_class
-			| hwe->engine_instance << engine_instance;
-		engine->drm_xe_engine = *hwe;
+		param_config = (uint64_t)hwe.gt_id << gt_shift | hwe.engine_class << engine_class
+			| hwe.engine_instance << engine_instance;
+		engine->drm_xe_engine = hwe;
 		engine->engine_active_ticks.config = engine_active_config | param_config;
 		engine->engine_total_ticks.config = engine_total_config | param_config;
 
@@ -185,7 +234,7 @@ void *xe_populate_engines(const void *obj, int index)
 		}
 
 		ret = asprintf(&engine->display_name, "GT:%u %s/%u",
-			       hwe->gt_id,
+			       hwe.gt_id,
 			       class_display_name(engine->drm_xe_engine.engine_class),
 			       engine->drm_xe_engine.engine_instance);
 
@@ -197,9 +246,11 @@ void *xe_populate_engines(const void *obj, int index)
 		engines->num_engines++;
 	}
 
+	free(engine_q);
+
 	if (!ret) {
 		errno = ret;
-		return NULL;
+		goto err;
 	}
 
 	qsort(engine_ptr(engines, 0), engines->num_engines,
@@ -208,6 +259,15 @@ void *xe_populate_engines(const void *obj, int index)
 	ptr->pmu_device_obj = engines;
 
 	return engines;
+
+err:
+	if (card_fd >= 0)
+		close(card_fd);
+	if (engines)
+		free(engines);
+	if (engine_q)
+		free(engine_q);
+	return NULL;
 }
 
 static uint64_t pmu_read_multi(int fd, unsigned int num, uint64_t *val)
@@ -263,11 +323,11 @@ int xe_pmu_init(const void *obj, int index)
 		fd = _open_pmu(type, &engines->num_counters, &engine->engine_active_ticks,
 			       &engines->fd);
 		if (fd < 0)
-			return -1;
+			return fd;
 		fd = _open_pmu(type, &engines->num_counters, &engine->engine_total_ticks,
 			       &engines->fd);
 		if (fd < 0)
-			return -1;
+			return fd;
 	}
 	return 0;
 }
diff --git a/tools/gputop.src/xe_gputop.h b/tools/gputop.src/xe_gputop.h
index 1e3856071..dcd82de7d 100644
--- a/tools/gputop.src/xe_gputop.h
+++ b/tools/gputop.src/xe_gputop.h
@@ -7,11 +7,10 @@
 #define __XE_GPUTOP_H__
 
 #include <dirent.h>
+#include <xe_drm.h>
 
-#include "igt_device_scan.h"
 #include "igt_perf.h"
 #include "utils.h"
-#include "xe/xe_query.h"
 
 struct xe_pmu_pair {
 	uint64_t cur;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances
  2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
                   ` (2 preceding siblings ...)
  2026-01-30  9:53 ` [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls Soham Purkait
@ 2026-01-30  9:53 ` Soham Purkait
  2026-02-12 23:28   ` Belgaumkar, Vinay
  2026-01-30  9:53 ` [PATCH i-g-t v7 5/5] tools/gputop.src/gputop: Add command line option for device filter Soham Purkait
  4 siblings, 1 reply; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

Introduce vendor-agnostic support for handling multiple GPUs and
instances in gputop. Improve the tool's adaptability to various GPU
configurations.

v1:
 - Refactor GPUTOP into a vendor-agnostic tool. (Lucas)
v2:
 - Cosmetic changes. (Riana)
 - Avoid three level indentation. (Riana)
v3:
 - Add device filter to populate the array of cards for
   all supported drivers. (Zbigniew)
v4:
 - Add user message for running without root privileges. (Kamil)
v5:
 - Add support for GPU client-only busyness on unsupported
   drivers as a fallback mechanism. (Kamil)
v6:
 - Remove unused dependencies and headers. (Kamil)

Signed-off-by: Soham Purkait <soham.purkait@intel.com>
---
 tools/gputop.src/gputop.c    | 278 +++++++++++++++++++++++++++++------
 tools/gputop.src/meson.build |   2 +-
 tools/meson.build            |   3 +-
 3 files changed, 240 insertions(+), 43 deletions(-)

diff --git a/tools/gputop.src/gputop.c b/tools/gputop.src/gputop.c
index f577a1750..7d4515f8f 100644
--- a/tools/gputop.src/gputop.c
+++ b/tools/gputop.src/gputop.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: MIT
 /*
- * Copyright © 2023 Intel Corporation
+ * Copyright © 2023-2025 Intel Corporation
  */
 
 #include <assert.h>
@@ -14,66 +14,145 @@
 #include <math.h>
 #include <poll.h>
 #include <signal.h>
+#include <stdbool.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/ioctl.h>
 #include <sys/stat.h>
+#include <sys/sysmacros.h>
 #include <sys/types.h>
-#include <unistd.h>
 #include <termios.h>
-#include <sys/sysmacros.h>
-#include <stdbool.h>
+#include <unistd.h>
 
-#include "igt_core.h"
 #include "igt_drm_clients.h"
 #include "igt_drm_fdinfo.h"
 #include "igt_profiling.h"
-#include "drmtest.h"
+#include "xe_gputop.h"
+
+/**
+ * Supported Drivers
+ *
+ * Adhere to the following requirements when implementing support for the
+ * new driver:
+ * @drivers: Update drivers[] with driver string.
+ * @sizeof_gputop_obj: Update this function as per new driver support included.
+ * @operations: Update the respective operations of the new driver:
+ * gputop_init,
+ * discover_engines,
+ * pmu_init,
+ * pmu_sample,
+ * print_engines,
+ * clean_up
+ * @per_driver_contexts: Update per_driver_contexts[] array of type "struct gputop_driver" with the
+ * initial values.
+ */
+static const char * const drivers[] = {
+	"xe",
+    /* Keep the last one as NULL */
+	NULL
+};
+
+static size_t sizeof_gputop_obj(int driver_num)
+{
+	switch (driver_num) {
+	case 0:
+		return sizeof(struct xe_gputop);
+	default:
+		fprintf(stderr,
+			"Driver number does not exist.\n");
+		exit(EXIT_FAILURE);
+	}
+}
+
+/**
+ * Supported operations on driver instances. Update the ops[] array for
+ * each individual driver specific function. Maintain the sequence as per
+ * drivers[] array.
+ */
+struct device_operations ops[] = {
+	{
+		xe_gputop_init,
+		xe_populate_engines,
+		xe_pmu_init,
+		xe_pmu_sample,
+		xe_print_engines,
+		xe_clean_up
+	}
+};
+
+/*
+ * per_driver_contexts[] array of type struct gputop_driver which keeps track of the devices
+ * and related info discovered per driver.
+ */
+struct gputop_driver per_driver_contexts[] = {
+	{false, 0, NULL}
+};
 
 enum utilization_type {
 	UTILIZATION_TYPE_ENGINE_TIME,
 	UTILIZATION_TYPE_TOTAL_CYCLES,
 };
 
-static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" };
-
-#define ANSI_HEADER "\033[7m"
-#define ANSI_RESET "\033[0m"
-
-static void n_spaces(const unsigned int n)
+static void gputop_clean_up(void)
 {
-	unsigned int i;
-
-	for (i = 0; i < n; i++)
-		putchar(' ');
+	for (int i = 0; drivers[i]; i++) {
+		ops[i].clean_up(per_driver_contexts[i].instances, per_driver_contexts[i].len);
+		free(per_driver_contexts[i].instances);
+		per_driver_contexts[i].device_present = false;
+		per_driver_contexts[i].len = 0;
+	}
 }
 
-static void print_percentage_bar(double percent, int max_len)
+static int find_driver(struct igt_device_card *card)
 {
-	int bar_len, i, len = max_len - 1;
-	const int w = 8;
-
-	len -= printf("|%5.1f%% ", percent);
-
-	/* no space left for bars, do what we can */
-	if (len < 0)
-		len = 0;
-
-	bar_len = ceil(w * percent * len / 100.0);
-	if (bar_len > w * len)
-		bar_len = w * len;
+	for (int i = 0; drivers[i]; i++) {
+		if (strcmp(drivers[i], card->driver) == 0)
+			return i;
+	}
+	return -1;
+}
 
-	for (i = bar_len; i >= w; i -= w)
-		printf("%s", bars[w]);
-	if (i)
-		printf("%s", bars[i]);
+static int populate_device_instances(const char *filter)
+{
+	struct igt_device_card *cards = NULL;
+	struct igt_device_card *card_inplace = NULL;
+	struct gputop_driver *driver_entry =  NULL;
+	int driver_no;
+	int count, final_count = 0;
+
+	count = igt_device_card_match_all(filter, &cards);
+	for (int j = 0; j < count; j++) {
+		if (strcmp(cards[j].subsystem, "pci") != 0)
+			continue;
 
-	len -= (bar_len + (w - 1)) / w;
-	n_spaces(len);
+		driver_no = find_driver(&cards[j]);
+		if (driver_no < 0)
+			continue;
 
-	putchar('|');
+		driver_entry = &per_driver_contexts[driver_no];
+		if (!driver_entry->device_present)
+			driver_entry->device_present = true;
+		driver_entry->len++;
+		driver_entry->instances = realloc(driver_entry->instances,
+						  driver_entry->len * sizeof_gputop_obj(driver_no));
+		if (!driver_entry->instances) {
+			fprintf(stderr,
+				"Device instance realloc failed (%s)\n",
+				strerror(errno));
+			exit(EXIT_FAILURE);
+		}
+		card_inplace = (struct igt_device_card *)
+				calloc(1, sizeof(struct igt_device_card));
+		memcpy(card_inplace, &cards[j], sizeof(struct igt_device_card));
+		ops[driver_no].gputop_init(driver_entry->instances, (driver_entry->len - 1),
+			card_inplace);
+		final_count++;
+	}
+	if (count)
+		free(cards);
+	return final_count;
 }
 
 static int
@@ -335,6 +414,28 @@ struct gputop_args {
 	unsigned long delay_usec;
 };
 
+static void countdown(const char *msg, const int start_sec)
+{
+	struct pollfd pfd;
+	int i, ret;
+	char ch;
+
+	for (i = start_sec; i > 0; i--) {
+		printf("\r%s%d... second(s)", msg, i);
+		fflush(stdout);
+
+		pfd.fd = STDIN_FILENO;
+		pfd.events = POLLIN;
+
+		ret = poll(&pfd, 1, 1000);
+		if (ret > 0 && (pfd.revents & POLLIN)) {
+			while ((ch = getchar()) != '\n' && ch != EOF)
+				continue;
+			return;
+		}
+	}
+}
+
 static void help(char *full_path)
 {
 	const char *short_program_name = strrchr(full_path, '/');
@@ -349,7 +450,32 @@ static void help(char *full_path)
 	       "Options:\n"
 	       "\t-h, --help                show this help\n"
 	       "\t-d, --delay =SEC[.TENTHS] iterative delay as SECS [.TENTHS]\n"
-	       "\t-n, --iterations =NUMBER  number of executions\n"
+	       "\t-n, --iterations =NUMBER  number of executions\n\n"
+	       "Running without root:\n"
+	       "\tAs a non-root user, CAP_PERFMON or perf_event_paranoid is required to\n"
+	       "\taccess engine busyness\n"
+	       "\t" ANSI_HEADER "Steps to run without root (using CAP_PERFMON):"
+		ANSI_RESET "\n"
+	       "\tcd /path/to/igt-gpu-tools/\n"
+	       "\tsudo setcap cap_perfmon=+ep $(pwd)/build/tools/gputop\n"
+	       "\tsudo sh -c \"echo $(pwd)/build/lib > /etc/ld.so.conf.d/lib-igt.conf\"\n"
+	       "\tsudo ldconfig\n"
+	       "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET "\n"
+	       "\tsudo setcap cap_perfmon=-ep $(pwd)/build/tools/gputop\n"
+	       "\tsudo rm /etc/ld.so.conf.d/lib-igt.conf\n"
+	       "\tsudo ldconfig\n"
+	       "\n"
+	       "\t" ANSI_HEADER "Steps to run without root (using perf_event_paranoid):"
+	       ANSI_RESET "\n"
+	       "\t\033[32m# Save current perf_event_paranoid value\033[0m\n"
+	       "\torig_val=$(sysctl -n kernel.perf_event_paranoid)\n"
+	       "\tsudo sysctl -w kernel.perf_event_paranoid=-1\n"
+	       "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET "\n"
+	       "\t\033[32m# Restore original value\033[0m\n"
+	       "\tsudo sysctl -w kernel.perf_event_paranoid=$orig_val\n\n"
+	       "\tFor details, see 'Perf events and tool security':\n"
+	       "\thttps://www.kernel.org/doc/html/"
+	       "latest/admin-guide/perf-security.html\n\n"
 	       , short_program_name);
 }
 
@@ -417,9 +543,12 @@ int main(int argc, char **argv)
 	struct igt_profiled_device *profiled_devices = NULL;
 	struct igt_drm_clients *clients = NULL;
 	int con_w = -1, con_h = -1;
+	bool is_root;
 	int ret;
 	long n;
 
+	is_root = (geteuid() == 0);
+
 	ret = parse_args(argc, argv, &args);
 	if (ret < 0)
 		return EXIT_FAILURE;
@@ -428,6 +557,53 @@ int main(int argc, char **argv)
 
 	n = args.n_iter;
 	period_us = args.delay_usec;
+	populate_device_instances("device:subsystem=pci,card=all");
+
+	for (int i = 0; drivers[i]; i++) {
+		if (!per_driver_contexts[i].device_present)
+			continue;
+
+		for (int j = 0; j < per_driver_contexts[i].len; j++) {
+			if (!ops[i].init_engines(per_driver_contexts[i].instances, j)) {
+				fprintf(stderr,
+					"Failed to initialize engines! (%s)\n",
+					strerror(errno));
+					gputop_clean_up();
+				return EXIT_FAILURE;
+			}
+			ret = ops[i].pmu_init(per_driver_contexts[i].instances, j);
+
+			if (ret) {
+				if (errno == EACCES && !is_root) {
+					fprintf(stderr,
+						"\n"
+						"Running without root privileges.\n"
+						"Engine busyness may not be available "
+						"without root privileges.\n"
+						"See \"--help\" to enable engine "
+						"busyness without root.\n\n");
+					igt_devices_free();
+					gputop_clean_up();
+					countdown("Resuming with only gpu client "
+						  "busyness in ", 5);
+				} else {
+					fprintf(stderr,
+						"Failed to initialize PMU! (%s)\n",
+						strerror(errno));
+					igt_devices_free();
+					gputop_clean_up();
+					return EXIT_FAILURE;
+				}
+			}
+		}
+	}
+
+	for (int i = 0; drivers[i]; i++) {
+		for (int j = 0;
+		     per_driver_contexts[i].device_present && j < per_driver_contexts[i].len;
+		     j++)
+			ops[i].pmu_sample(per_driver_contexts[i].instances, j);
+	}
 
 	clients = igt_drm_clients_init(NULL);
 	if (!clients)
@@ -449,22 +625,42 @@ int main(int argc, char **argv)
 
 	while ((n != 0) && !stop_top) {
 		struct igt_drm_client *c, *prevc = NULL;
-		int i, engine_w = 0, lines = 0;
+		int k, engine_w = 0, lines = 0;
 
 		igt_drm_clients_scan(clients, NULL, NULL, 0, NULL, 0);
+
+		for (int i = 0; drivers[i]; i++) {
+			for (int j = 0;
+			     per_driver_contexts[i].device_present &&
+			     j < per_driver_contexts[i].len;
+			     j++)
+				ops[i].pmu_sample(per_driver_contexts[i].instances, j);
+		}
+
 		igt_drm_clients_sort(clients, client_cmp);
 
 		update_console_size(&con_w, &con_h);
 		clrscr();
 
+		for (int i = 0; drivers[i]; i++) {
+			for (int j = 0;
+			     per_driver_contexts[i].device_present &&
+			     j < per_driver_contexts[i].len;
+			     j++) {
+				lines = ops[i].print_engines(per_driver_contexts[i].instances, j,
+							 lines, con_w, con_h);
+			}
+		}
+
 		if (!clients->num_clients) {
-			const char *msg = " (No GPU clients yet. Start workload to see stats)";
+			const char *msg;
 
+			msg = " (No GPU clients yet. Start workload to see stats)";
 			printf(ANSI_HEADER "%-*s" ANSI_RESET "\n",
 			       (int)(con_w - strlen(msg) - 1), msg);
 		}
 
-		igt_for_each_drm_client(clients, c, i) {
+		igt_for_each_drm_client(clients, c, k) {
 			assert(c->status != IGT_DRM_CLIENT_PROBE);
 			if (c->status != IGT_DRM_CLIENT_ALIVE)
 				break; /* Active clients are first in the array. */
@@ -488,11 +684,11 @@ int main(int argc, char **argv)
 	}
 
 	igt_drm_clients_free(clients);
+	gputop_clean_up();
 
 	if (profiled_devices != NULL) {
 		igt_devices_configure_profiling(profiled_devices, false);
 		igt_devices_free_profiling(profiled_devices);
 	}
-
 	return 0;
 }
diff --git a/tools/gputop.src/meson.build b/tools/gputop.src/meson.build
index ec39f4c7a..e95657fca 100644
--- a/tools/gputop.src/meson.build
+++ b/tools/gputop.src/meson.build
@@ -1 +1 @@
-gputop_src = files('gputop.c')
+gputop_src = files('gputop.c', 'utils.c', 'xe_gputop.c')
diff --git a/tools/meson.build b/tools/meson.build
index 521607a4c..caca57d0e 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -123,4 +123,5 @@ subdir('gputop.src')
 executable('gputop', sources : gputop_src,
 	   install : true,
 	   install_rpath : bindir_rpathdir,
-	   dependencies : [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math],)
+	   dependencies : [lib_igt_perf,lib_igt_device_scan,lib_igt_drm_clients,
+	   		   lib_igt_drm_fdinfo,lib_igt_profiling,math],)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH i-g-t v7 5/5] tools/gputop.src/gputop: Add command line option for device filter
  2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
                   ` (3 preceding siblings ...)
  2026-01-30  9:53 ` [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances Soham Purkait
@ 2026-01-30  9:53 ` Soham Purkait
  4 siblings, 0 replies; 11+ messages in thread
From: Soham Purkait @ 2026-01-30  9:53 UTC (permalink / raw)
  To: igt-dev, riana.tauro, badal.nilawar, kamil.konieczny,
	ashutosh.dixit
  Cc: anshuman.gupta, soham.purkait, umesh.nerlige.ramappa

With a new command-line option for device filtering, the engine
busyness for the specific device could be selected.

Signed-off-by: Soham Purkait <soham.purkait@intel.com>
---
 tools/gputop.src/gputop.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/gputop.src/gputop.c b/tools/gputop.src/gputop.c
index 7d4515f8f..e803db99e 100644
--- a/tools/gputop.src/gputop.c
+++ b/tools/gputop.src/gputop.c
@@ -412,6 +412,7 @@ static void clrscr(void)
 struct gputop_args {
 	long n_iter;
 	unsigned long delay_usec;
+	char *device;
 };
 
 static void countdown(const char *msg, const int start_sec)
@@ -450,7 +451,8 @@ static void help(char *full_path)
 	       "Options:\n"
 	       "\t-h, --help                show this help\n"
 	       "\t-d, --delay =SEC[.TENTHS] iterative delay as SECS [.TENTHS]\n"
-	       "\t-n, --iterations =NUMBER  number of executions\n\n"
+	       "\t-n, --iterations =NUMBER  number of executions\n"
+	       "\t-D, --device              Device filter\n\n"
 	       "Running without root:\n"
 	       "\tAs a non-root user, CAP_PERFMON or perf_event_paranoid is required to\n"
 	       "\taccess engine busyness\n"
@@ -481,11 +483,12 @@ static void help(char *full_path)
 
 static int parse_args(int argc, char * const argv[], struct gputop_args *args)
 {
-	static const char cmdopts_s[] = "hn:d:";
+	static const char cmdopts_s[] = "hn:d:D:";
 	static const struct option cmdopts[] = {
 	       {"help", no_argument, 0, 'h'},
 	       {"delay", required_argument, 0, 'd'},
 	       {"iterations", required_argument, 0, 'n'},
+	       {"device", required_argument, 0, 'D'},
 	       { }
 	};
 
@@ -493,6 +496,7 @@ static int parse_args(int argc, char * const argv[], struct gputop_args *args)
 	memset(args, 0, sizeof(*args));
 	args->n_iter = -1;
 	args->delay_usec = 2 * USEC_PER_SEC;
+	args->device = NULL;
 
 	for (;;) {
 		int c, idx = 0;
@@ -516,6 +520,9 @@ static int parse_args(int argc, char * const argv[], struct gputop_args *args)
 				return -1;
 			}
 			break;
+		case 'D':
+			args->device = optarg;
+			break;
 		case 'h':
 			help(argv[0]);
 			return 0;
@@ -557,7 +564,8 @@ int main(int argc, char **argv)
 
 	n = args.n_iter;
 	period_us = args.delay_usec;
-	populate_device_instances("device:subsystem=pci,card=all");
+	populate_device_instances(args.device ? args.device
+				  : "device:subsystem=pci,card=all");
 
 	for (int i = 0; drivers[i]; i++) {
 		if (!per_driver_contexts[i].device_present)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls
  2026-01-30  9:53 ` [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls Soham Purkait
@ 2026-02-12  1:41   ` Belgaumkar, Vinay
  2026-02-18 15:27     ` Purkait, Soham
  0 siblings, 1 reply; 11+ messages in thread
From: Belgaumkar, Vinay @ 2026-02-12  1:41 UTC (permalink / raw)
  To: Soham Purkait, igt-dev, riana.tauro, badal.nilawar,
	kamil.konieczny, ashutosh.dixit
  Cc: anshuman.gupta, umesh.nerlige.ramappa

[-- Attachment #1: Type: text/plain, Size: 7914 bytes --]


On 1/30/2026 1:53 AM, Soham Purkait wrote:
> During GPUTOP engine initialization, the opened card fd is required to
> query engines and obtain configurations for PMU busyness calculation, but
> leaving it open causes the process to appear as a DRM client, which
> conflicts with KMS test requirements. To avoid this, card_fd is closed
> immediately after engine discovery, and engine queries are refactored to
> remove the lib/xe/* dependency in favor of direct ioctl calls, simplifying
> the implementation and reducing igt test library dependencies.
>
> v1:
>   - Initialize pmu_device_obj to null.
> v2:
>   - Remove dependency on lib/xe/* libraries. (Kamil)
>
> Fixes: c8106465683f ("tools/gputop/xe_gputop: Add gputop support for xe specific devices")
> Signed-off-by: Soham Purkait<soham.purkait@intel.com>
> ---
>   tools/gputop.src/xe_gputop.c | 100 ++++++++++++++++++++++++++++-------
>   tools/gputop.src/xe_gputop.h |   3 +-
>   2 files changed, 81 insertions(+), 22 deletions(-)
>
> diff --git a/tools/gputop.src/xe_gputop.c b/tools/gputop.src/xe_gputop.c
> index bb2caa6ea..7793cd62f 100644
> --- a/tools/gputop.src/xe_gputop.c
> +++ b/tools/gputop.src/xe_gputop.c
> @@ -1,8 +1,10 @@
>   // SPDX-License-Identifier: MIT
>   /*
> - * Copyright © 2025 Intel Corporation
> + * Copyright © 2025-2026 Intel Corporation
>    */
>   
> +#include <sys/ioctl.h>
> +
>   #include "xe_gputop.h"
>   
>   #define engine_ptr(pmu_device, n) (&(pmu_device)->engine + (n))
> @@ -92,6 +94,7 @@ void xe_gputop_init(void *ptr, int index,
>   
>   	obj = ((struct xe_gputop *)ptr) + index;
>   	obj->card = card;
> +	obj->pmu_device_obj = NULL;
>   }
>   
>   static int pmu_format_shift(int xe, const char *name)
> @@ -121,20 +124,54 @@ static int engine_cmp(const void *__a, const void *__b)
>   		return a->drm_xe_engine.engine_instance - b->drm_xe_engine.engine_instance;
>   }
>   
> +static int engine_query(int fd, struct drm_xe_query_engines **engine_q)
> +{
> +	struct drm_xe_device_query q;
> +	int ret = 0, num_eng;
> +
> +	memset(&q, 0, sizeof(q));
> +	q.query = DRM_XE_DEVICE_QUERY_ENGINES;
> +
> +	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
> +	if (ret != 0) {
> +		if ((errno != ENOSPC && errno != E2BIG) || q.size == 0)
why are we ignoring ENOSPC and E2BIG?
> +			return -1;
> +	}
> +	if (q.size == 0)
we can check q.size just once instead of twice?
> +		return 0;
> +
> +	num_eng = (int)((q.size - sizeof(struct drm_xe_query_engines)) /
> +			sizeof(struct drm_xe_engine_class_instance));
> +
> +	*engine_q = (struct drm_xe_query_engines *)calloc(1, q.size);
> +	if (!*engine_q)
> +		return -1;
> +
> +	q.data = (uintptr_t)(*engine_q);
better to use to_user_pointer from the lib?
> +
> +	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
> +	if (ret != 0) {
> +		free(*engine_q);
> +		return -1;
> +	}
> +
> +	return num_eng;
> +}
> +
>   void *xe_populate_engines(const void *obj, int index)
>   {
>   	struct xe_gputop *ptr = ((struct xe_gputop *)obj) + index;
>   	struct igt_device_card *card = ptr->card;
>   	uint64_t engine_active_config, engine_total_config;
>   	uint64_t engine_class, engine_instance, gt_shift;
> -	struct drm_xe_engine_class_instance *hwe;
> +	struct drm_xe_query_engines *engine_q;
>   	struct xe_pmu_device *engines;
>   	char device[30];
> -	int ret = 0;
> -	int card_fd;
> +	int ret = 0, num_eng;
> +	int card_fd = -1;
>   
>   	if (!card || !strlen(card->card) || !strlen(card->render))
> -		return NULL;
> +		goto err;
>   
>   	if (strlen(card->card)) {
>   		card_fd = igt_open_card(card);
> @@ -142,39 +179,51 @@ void *xe_populate_engines(const void *obj, int index)
>   		card_fd = igt_open_render(card);
>   	} else {
>   		fprintf(stderr, "Failed to detect device!\n");
> -		return NULL;
> +		goto err;
>   	}
> -	xe_device_get(card_fd);
> +
> +	num_eng = engine_query(card_fd, &engine_q);
> +	if (num_eng <= 0) {
> +		fprintf(stderr, "Engine query failed!\n");
> +		goto err;
> +	}
> +
>   	engines = malloc(sizeof(struct xe_pmu_device) +
> -			 xe_number_engines(card_fd) * sizeof(struct xe_engine));
> +			 num_eng * sizeof(struct xe_engine));
>   	if (!engines)
> -		return NULL;
> +		goto err;
>   
>   	memset(engines, 0, sizeof(struct xe_pmu_device) +
> -	       xe_number_engines(card_fd) * sizeof(struct xe_engine));
> +	       num_eng * sizeof(struct xe_engine));
>   
>   	engines->num_engines = 0;
>   	gt_shift = pmu_format_shift(card_fd, "gt");
>   	engine_class = pmu_format_shift(card_fd, "engine_class");
>   	engine_instance = pmu_format_shift(card_fd, "engine_instance");
>   	xe_perf_device(card_fd, device, sizeof(device));
> +	close(card_fd);
> +	card_fd = -1;
> +
>   	engines->device = strdup(device);
>   	ret = perf_event_config(device, "engine-active-ticks", &engine_active_config);
>   	if (ret < 0)
> -		return NULL;
> +		goto err;
>   
>   	ret = perf_event_config(device, "engine-total-ticks", &engine_total_config);
>   	if (ret < 0)
> -		return NULL;
> +		goto err;
>   
> -	xe_for_each_engine(card_fd, hwe) {
> +	while (engines->num_engines < engine_q->num_engines) {
>   		uint64_t  param_config;
>   		struct xe_engine *engine;
> +		struct drm_xe_engine_class_instance hwe;
> +
> +		hwe = engine_q->engines[engines->num_engines].instance;
>   
>   		engine = engine_ptr(engines, engines->num_engines);
> -		param_config = (uint64_t)hwe->gt_id << gt_shift | hwe->engine_class << engine_class
> -			| hwe->engine_instance << engine_instance;
> -		engine->drm_xe_engine = *hwe;
> +		param_config = (uint64_t)hwe.gt_id << gt_shift | hwe.engine_class << engine_class
> +			| hwe.engine_instance << engine_instance;
> +		engine->drm_xe_engine = hwe;
>   		engine->engine_active_ticks.config = engine_active_config | param_config;
>   		engine->engine_total_ticks.config = engine_total_config | param_config;
>   
> @@ -185,7 +234,7 @@ void *xe_populate_engines(const void *obj, int index)
>   		}
>   
>   		ret = asprintf(&engine->display_name, "GT:%u %s/%u",
> -			       hwe->gt_id,
> +			       hwe.gt_id,
>   			       class_display_name(engine->drm_xe_engine.engine_class),
>   			       engine->drm_xe_engine.engine_instance);
>   
> @@ -197,9 +246,11 @@ void *xe_populate_engines(const void *obj, int index)
>   		engines->num_engines++;
>   	}
>   
> +	free(engine_q);
> +
>   	if (!ret) {

Not part of this patch, but looks like ret is overloaded here for 
multiple things. Usually ret = 0 is a good thing? Should use num_bytes 
or something if we expect asprintf above to return a positive value. We 
should fix this as a separate patch.

Thanks,

Vinay.

>   		errno = ret;
> -		return NULL;
> +		goto err;
>   	}
>   
>   	qsort(engine_ptr(engines, 0), engines->num_engines,
> @@ -208,6 +259,15 @@ void *xe_populate_engines(const void *obj, int index)
>   	ptr->pmu_device_obj = engines;
>   
>   	return engines;
> +
> +err:
> +	if (card_fd >= 0)
> +		close(card_fd);
> +	if (engines)
> +		free(engines);
> +	if (engine_q)
> +		free(engine_q);
> +	return NULL;
>   }
>   
>   static uint64_t pmu_read_multi(int fd, unsigned int num, uint64_t *val)
> @@ -263,11 +323,11 @@ int xe_pmu_init(const void *obj, int index)
>   		fd = _open_pmu(type, &engines->num_counters, &engine->engine_active_ticks,
>   			       &engines->fd);
>   		if (fd < 0)
> -			return -1;
> +			return fd;
>   		fd = _open_pmu(type, &engines->num_counters, &engine->engine_total_ticks,
>   			       &engines->fd);
>   		if (fd < 0)
> -			return -1;
> +			return fd;
>   	}
>   	return 0;
>   }
> diff --git a/tools/gputop.src/xe_gputop.h b/tools/gputop.src/xe_gputop.h
> index 1e3856071..dcd82de7d 100644
> --- a/tools/gputop.src/xe_gputop.h
> +++ b/tools/gputop.src/xe_gputop.h
> @@ -7,11 +7,10 @@
>   #define __XE_GPUTOP_H__
>   
>   #include <dirent.h>
> +#include <xe_drm.h>
>   
> -#include "igt_device_scan.h"
>   #include "igt_perf.h"
>   #include "utils.h"
> -#include "xe/xe_query.h"
>   
>   struct xe_pmu_pair {
>   	uint64_t cur;

[-- Attachment #2: Type: text/html, Size: 8915 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances
  2026-01-30  9:53 ` [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances Soham Purkait
@ 2026-02-12 23:28   ` Belgaumkar, Vinay
  2026-02-18 16:23     ` Purkait, Soham
  2026-02-18 17:10     ` Purkait, Soham
  0 siblings, 2 replies; 11+ messages in thread
From: Belgaumkar, Vinay @ 2026-02-12 23:28 UTC (permalink / raw)
  To: Soham Purkait, igt-dev, riana.tauro, badal.nilawar,
	kamil.konieczny, ashutosh.dixit
  Cc: anshuman.gupta, umesh.nerlige.ramappa


On 1/30/2026 1:53 AM, Soham Purkait wrote:
> Introduce vendor-agnostic support for handling multiple GPUs and
> instances in gputop. Improve the tool's adaptability to various GPU
> configurations.
>
> v1:
>   - Refactor GPUTOP into a vendor-agnostic tool. (Lucas)
> v2:
>   - Cosmetic changes. (Riana)
>   - Avoid three level indentation. (Riana)
> v3:
>   - Add device filter to populate the array of cards for
>     all supported drivers. (Zbigniew)
> v4:
>   - Add user message for running without root privileges. (Kamil)
> v5:
>   - Add support for GPU client-only busyness on unsupported
>     drivers as a fallback mechanism. (Kamil)
> v6:
>   - Remove unused dependencies and headers. (Kamil)
>
> Signed-off-by: Soham Purkait <soham.purkait@intel.com>
> ---
>   tools/gputop.src/gputop.c    | 278 +++++++++++++++++++++++++++++------
>   tools/gputop.src/meson.build |   2 +-
>   tools/meson.build            |   3 +-
>   3 files changed, 240 insertions(+), 43 deletions(-)
>
> diff --git a/tools/gputop.src/gputop.c b/tools/gputop.src/gputop.c
> index f577a1750..7d4515f8f 100644
> --- a/tools/gputop.src/gputop.c
> +++ b/tools/gputop.src/gputop.c
> @@ -1,6 +1,6 @@
>   // SPDX-License-Identifier: MIT
>   /*
> - * Copyright © 2023 Intel Corporation
> + * Copyright © 2023-2025 Intel Corporation
>    */
>   
>   #include <assert.h>
> @@ -14,66 +14,145 @@
>   #include <math.h>
>   #include <poll.h>
>   #include <signal.h>
> +#include <stdbool.h>
>   #include <stdint.h>
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <string.h>
>   #include <sys/ioctl.h>
>   #include <sys/stat.h>
> +#include <sys/sysmacros.h>
>   #include <sys/types.h>
> -#include <unistd.h>
>   #include <termios.h>
> -#include <sys/sysmacros.h>
> -#include <stdbool.h>
> +#include <unistd.h>
>   
> -#include "igt_core.h"
>   #include "igt_drm_clients.h"
>   #include "igt_drm_fdinfo.h"
>   #include "igt_profiling.h"
> -#include "drmtest.h"
> +#include "xe_gputop.h"
> +
> +/**
> + * Supported Drivers
> + *
> + * Adhere to the following requirements when implementing support for the
> + * new driver:
> + * @drivers: Update drivers[] with driver string.
> + * @sizeof_gputop_obj: Update this function as per new driver support included.
> + * @operations: Update the respective operations of the new driver:
> + * gputop_init,
> + * discover_engines,
> + * pmu_init,
> + * pmu_sample,
> + * print_engines,
> + * clean_up
> + * @per_driver_contexts: Update per_driver_contexts[] array of type "struct gputop_driver" with the
> + * initial values.
> + */
> +static const char * const drivers[] = {
> +	"xe",
> +    /* Keep the last one as NULL */
> +	NULL
> +};
> +
> +static size_t sizeof_gputop_obj(int driver_num)
> +{
> +	switch (driver_num) {
> +	case 0:
Might be worthwhile using an enum like INTEL_XE_DRIVER instead of 
hardcoded values.
> +		return sizeof(struct xe_gputop);
> +	default:
> +		fprintf(stderr,
> +			"Driver number does not exist.\n");
> +		exit(EXIT_FAILURE);
> +	}
> +}
> +
> +/**
> + * Supported operations on driver instances. Update the ops[] array for
> + * each individual driver specific function. Maintain the sequence as per
> + * drivers[] array.
> + */
> +struct device_operations ops[] = {
> +	{
> +		xe_gputop_init,
> +		xe_populate_engines,
> +		xe_pmu_init,
> +		xe_pmu_sample,
> +		xe_print_engines,
> +		xe_clean_up
> +	}
> +};
> +
> +/*
> + * per_driver_contexts[] array of type struct gputop_driver which keeps track of the devices
> + * and related info discovered per driver.
> + */
> +struct gputop_driver per_driver_contexts[] = {
This is already per driver, no need to add that in the name? just 
driver_context or pci_driver_context? Also, where is gputop_driver 
struct defined? (Didn't find it in this patch set)
> +	{false, 0, NULL}
> +};
>   
>   enum utilization_type {
>   	UTILIZATION_TYPE_ENGINE_TIME,
>   	UTILIZATION_TYPE_TOTAL_CYCLES,
>   };
>   
> -static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" };
> -
> -#define ANSI_HEADER "\033[7m"
> -#define ANSI_RESET "\033[0m"
> -
> -static void n_spaces(const unsigned int n)
> +static void gputop_clean_up(void)
>   {
> -	unsigned int i;
> -
> -	for (i = 0; i < n; i++)
> -		putchar(' ');
> +	for (int i = 0; drivers[i]; i++) {
> +		ops[i].clean_up(per_driver_contexts[i].instances, per_driver_contexts[i].len);
> +		free(per_driver_contexts[i].instances);
> +		per_driver_contexts[i].device_present = false;
> +		per_driver_contexts[i].len = 0;
> +	}
>   }
>   
> -static void print_percentage_bar(double percent, int max_len)
> +static int find_driver(struct igt_device_card *card)
should be find_pci_driver?
>   {
> -	int bar_len, i, len = max_len - 1;
> -	const int w = 8;
> -
> -	len -= printf("|%5.1f%% ", percent);
> -
> -	/* no space left for bars, do what we can */
> -	if (len < 0)
> -		len = 0;
> -
> -	bar_len = ceil(w * percent * len / 100.0);
> -	if (bar_len > w * len)
> -		bar_len = w * len;
> +	for (int i = 0; drivers[i]; i++) {
> +		if (strcmp(drivers[i], card->driver) == 0)
> +			return i;
> +	}
> +	return -1;
> +}
>   
> -	for (i = bar_len; i >= w; i -= w)
> -		printf("%s", bars[w]);
> -	if (i)
> -		printf("%s", bars[i]);
> +static int populate_device_instances(const char *filter)
> +{
> +	struct igt_device_card *cards = NULL;
> +	struct igt_device_card *card_inplace = NULL;
> +	struct gputop_driver *driver_entry =  NULL;
> +	int driver_no;
should this be driver_num or driver_id?
> +	int count, final_count = 0;
> +
> +	count = igt_device_card_match_all(filter, &cards);
> +	for (int j = 0; j < count; j++) {
> +		if (strcmp(cards[j].subsystem, "pci") != 0)
> +			continue;
>   
> -	len -= (bar_len + (w - 1)) / w;
> -	n_spaces(len);
> +		driver_no = find_driver(&cards[j]);
> +		if (driver_no < 0)
> +			continue;
>   
> -	putchar('|');
> +		driver_entry = &per_driver_contexts[driver_no];
> +		if (!driver_entry->device_present)
> +			driver_entry->device_present = true;
is this check only for reassignment of device_present or should it 
include the below statements as well?
> +		driver_entry->len++;
> +		driver_entry->instances = realloc(driver_entry->instances,
> +						  driver_entry->len * sizeof_gputop_obj(driver_no));
should this be renamed to driver_entry->engine_instances so we are clear 
about which instances this refers to?
> +		if (!driver_entry->instances) {
> +			fprintf(stderr,
> +				"Device instance realloc failed (%s)\n",
> +				strerror(errno));
> +			exit(EXIT_FAILURE);
> +		}

should these be under the conditional if check above as well? Or will we 
end up incrementing len twice and duplicate realloc? Unless this refers 
to driver instances?

Thanks,

Vinay.

> +		card_inplace = (struct igt_device_card *)
> +				calloc(1, sizeof(struct igt_device_card));
> +		memcpy(card_inplace, &cards[j], sizeof(struct igt_device_card));
> +		ops[driver_no].gputop_init(driver_entry->instances, (driver_entry->len - 1),
> +			card_inplace);
> +		final_count++;
> +	}
> +	if (count)
> +		free(cards);
> +	return final_count;
>   }
>   
>   static int
> @@ -335,6 +414,28 @@ struct gputop_args {
>   	unsigned long delay_usec;
>   };
>   
> +static void countdown(const char *msg, const int start_sec)
> +{
> +	struct pollfd pfd;
> +	int i, ret;
> +	char ch;
> +
> +	for (i = start_sec; i > 0; i--) {
> +		printf("\r%s%d... second(s)", msg, i);
> +		fflush(stdout);
> +
> +		pfd.fd = STDIN_FILENO;
> +		pfd.events = POLLIN;
> +
> +		ret = poll(&pfd, 1, 1000);
> +		if (ret > 0 && (pfd.revents & POLLIN)) {
> +			while ((ch = getchar()) != '\n' && ch != EOF)
> +				continue;
> +			return;
> +		}
> +	}
> +}
> +
>   static void help(char *full_path)
>   {
>   	const char *short_program_name = strrchr(full_path, '/');
> @@ -349,7 +450,32 @@ static void help(char *full_path)
>   	       "Options:\n"
>   	       "\t-h, --help                show this help\n"
>   	       "\t-d, --delay =SEC[.TENTHS] iterative delay as SECS [.TENTHS]\n"
> -	       "\t-n, --iterations =NUMBER  number of executions\n"
> +	       "\t-n, --iterations =NUMBER  number of executions\n\n"
> +	       "Running without root:\n"
> +	       "\tAs a non-root user, CAP_PERFMON or perf_event_paranoid is required to\n"
> +	       "\taccess engine busyness\n"
> +	       "\t" ANSI_HEADER "Steps to run without root (using CAP_PERFMON):"
> +		ANSI_RESET "\n"
> +	       "\tcd /path/to/igt-gpu-tools/\n"
> +	       "\tsudo setcap cap_perfmon=+ep $(pwd)/build/tools/gputop\n"
> +	       "\tsudo sh -c \"echo $(pwd)/build/lib > /etc/ld.so.conf.d/lib-igt.conf\"\n"
> +	       "\tsudo ldconfig\n"
> +	       "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET "\n"
> +	       "\tsudo setcap cap_perfmon=-ep $(pwd)/build/tools/gputop\n"
> +	       "\tsudo rm /etc/ld.so.conf.d/lib-igt.conf\n"
> +	       "\tsudo ldconfig\n"
> +	       "\n"
> +	       "\t" ANSI_HEADER "Steps to run without root (using perf_event_paranoid):"
> +	       ANSI_RESET "\n"
> +	       "\t\033[32m# Save current perf_event_paranoid value\033[0m\n"
> +	       "\torig_val=$(sysctl -n kernel.perf_event_paranoid)\n"
> +	       "\tsudo sysctl -w kernel.perf_event_paranoid=-1\n"
> +	       "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET "\n"
> +	       "\t\033[32m# Restore original value\033[0m\n"
> +	       "\tsudo sysctl -w kernel.perf_event_paranoid=$orig_val\n\n"
> +	       "\tFor details, see 'Perf events and tool security':\n"
> +	       "\thttps://www.kernel.org/doc/html/"
> +	       "latest/admin-guide/perf-security.html\n\n"
>   	       , short_program_name);
>   }
>   
> @@ -417,9 +543,12 @@ int main(int argc, char **argv)
>   	struct igt_profiled_device *profiled_devices = NULL;
>   	struct igt_drm_clients *clients = NULL;
>   	int con_w = -1, con_h = -1;
> +	bool is_root;
>   	int ret;
>   	long n;
>   
> +	is_root = (geteuid() == 0);
> +
>   	ret = parse_args(argc, argv, &args);
>   	if (ret < 0)
>   		return EXIT_FAILURE;
> @@ -428,6 +557,53 @@ int main(int argc, char **argv)
>   
>   	n = args.n_iter;
>   	period_us = args.delay_usec;
> +	populate_device_instances("device:subsystem=pci,card=all");
> +
> +	for (int i = 0; drivers[i]; i++) {
> +		if (!per_driver_contexts[i].device_present)
> +			continue;
> +
> +		for (int j = 0; j < per_driver_contexts[i].len; j++) {
> +			if (!ops[i].init_engines(per_driver_contexts[i].instances, j)) {
> +				fprintf(stderr,
> +					"Failed to initialize engines! (%s)\n",
> +					strerror(errno));
> +					gputop_clean_up();
> +				return EXIT_FAILURE;
> +			}
> +			ret = ops[i].pmu_init(per_driver_contexts[i].instances, j);
> +
> +			if (ret) {
> +				if (errno == EACCES && !is_root) {
> +					fprintf(stderr,
> +						"\n"
> +						"Running without root privileges.\n"
> +						"Engine busyness may not be available "
> +						"without root privileges.\n"
> +						"See \"--help\" to enable engine "
> +						"busyness without root.\n\n");
> +					igt_devices_free();
> +					gputop_clean_up();
> +					countdown("Resuming with only gpu client "
> +						  "busyness in ", 5);
> +				} else {
> +					fprintf(stderr,
> +						"Failed to initialize PMU! (%s)\n",
> +						strerror(errno));
> +					igt_devices_free();
> +					gputop_clean_up();
> +					return EXIT_FAILURE;
> +				}
> +			}
> +		}
> +	}
> +
> +	for (int i = 0; drivers[i]; i++) {
> +		for (int j = 0;
> +		     per_driver_contexts[i].device_present && j < per_driver_contexts[i].len;
> +		     j++)
> +			ops[i].pmu_sample(per_driver_contexts[i].instances, j);
> +	}
>   
>   	clients = igt_drm_clients_init(NULL);
>   	if (!clients)
> @@ -449,22 +625,42 @@ int main(int argc, char **argv)
>   
>   	while ((n != 0) && !stop_top) {
>   		struct igt_drm_client *c, *prevc = NULL;
> -		int i, engine_w = 0, lines = 0;
> +		int k, engine_w = 0, lines = 0;
>   
>   		igt_drm_clients_scan(clients, NULL, NULL, 0, NULL, 0);
> +
> +		for (int i = 0; drivers[i]; i++) {
> +			for (int j = 0;
> +			     per_driver_contexts[i].device_present &&
> +			     j < per_driver_contexts[i].len;
> +			     j++)
> +				ops[i].pmu_sample(per_driver_contexts[i].instances, j);
> +		}
> +
>   		igt_drm_clients_sort(clients, client_cmp);
>   
>   		update_console_size(&con_w, &con_h);
>   		clrscr();
>   
> +		for (int i = 0; drivers[i]; i++) {
> +			for (int j = 0;
> +			     per_driver_contexts[i].device_present &&
> +			     j < per_driver_contexts[i].len;
> +			     j++) {
> +				lines = ops[i].print_engines(per_driver_contexts[i].instances, j,
> +							 lines, con_w, con_h);
> +			}
> +		}
> +
>   		if (!clients->num_clients) {
> -			const char *msg = " (No GPU clients yet. Start workload to see stats)";
> +			const char *msg;
>   
> +			msg = " (No GPU clients yet. Start workload to see stats)";
>   			printf(ANSI_HEADER "%-*s" ANSI_RESET "\n",
>   			       (int)(con_w - strlen(msg) - 1), msg);
>   		}
>   
> -		igt_for_each_drm_client(clients, c, i) {
> +		igt_for_each_drm_client(clients, c, k) {
>   			assert(c->status != IGT_DRM_CLIENT_PROBE);
>   			if (c->status != IGT_DRM_CLIENT_ALIVE)
>   				break; /* Active clients are first in the array. */
> @@ -488,11 +684,11 @@ int main(int argc, char **argv)
>   	}
>   
>   	igt_drm_clients_free(clients);
> +	gputop_clean_up();
>   
>   	if (profiled_devices != NULL) {
>   		igt_devices_configure_profiling(profiled_devices, false);
>   		igt_devices_free_profiling(profiled_devices);
>   	}
> -
>   	return 0;
>   }
> diff --git a/tools/gputop.src/meson.build b/tools/gputop.src/meson.build
> index ec39f4c7a..e95657fca 100644
> --- a/tools/gputop.src/meson.build
> +++ b/tools/gputop.src/meson.build
> @@ -1 +1 @@
> -gputop_src = files('gputop.c')
> +gputop_src = files('gputop.c', 'utils.c', 'xe_gputop.c')
> diff --git a/tools/meson.build b/tools/meson.build
> index 521607a4c..caca57d0e 100644
> --- a/tools/meson.build
> +++ b/tools/meson.build
> @@ -123,4 +123,5 @@ subdir('gputop.src')
>   executable('gputop', sources : gputop_src,
>   	   install : true,
>   	   install_rpath : bindir_rpathdir,
> -	   dependencies : [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math],)
> +	   dependencies : [lib_igt_perf,lib_igt_device_scan,lib_igt_drm_clients,
> +	   		   lib_igt_drm_fdinfo,lib_igt_profiling,math],)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls
  2026-02-12  1:41   ` Belgaumkar, Vinay
@ 2026-02-18 15:27     ` Purkait, Soham
  0 siblings, 0 replies; 11+ messages in thread
From: Purkait, Soham @ 2026-02-18 15:27 UTC (permalink / raw)
  To: Belgaumkar, Vinay, igt-dev, riana.tauro, badal.nilawar,
	kamil.konieczny, ashutosh.dixit
  Cc: anshuman.gupta, umesh.nerlige.ramappa

[-- Attachment #1: Type: text/plain, Size: 8491 bytes --]

Hi Vinay

On 12-02-2026 07:11, Belgaumkar, Vinay wrote:
>
>
> On 1/30/2026 1:53 AM, Soham Purkait wrote:
>> During GPUTOP engine initialization, the opened card fd is required to
>> query engines and obtain configurations for PMU busyness calculation, but
>> leaving it open causes the process to appear as a DRM client, which
>> conflicts with KMS test requirements. To avoid this, card_fd is closed
>> immediately after engine discovery, and engine queries are refactored to
>> remove the lib/xe/* dependency in favor of direct ioctl calls, simplifying
>> the implementation and reducing igt test library dependencies.
>>
>> v1:
>>   - Initialize pmu_device_obj to null.
>> v2:
>>   - Remove dependency on lib/xe/* libraries. (Kamil)
>>
>> Fixes: c8106465683f ("tools/gputop/xe_gputop: Add gputop support for xe specific devices")
>> Signed-off-by: Soham Purkait<soham.purkait@intel.com>
>> ---
>>   tools/gputop.src/xe_gputop.c | 100 ++++++++++++++++++++++++++++-------
>>   tools/gputop.src/xe_gputop.h |   3 +-
>>   2 files changed, 81 insertions(+), 22 deletions(-)
>>
>> diff --git a/tools/gputop.src/xe_gputop.c b/tools/gputop.src/xe_gputop.c
>> index bb2caa6ea..7793cd62f 100644
>> --- a/tools/gputop.src/xe_gputop.c
>> +++ b/tools/gputop.src/xe_gputop.c
>> @@ -1,8 +1,10 @@
>>   // SPDX-License-Identifier: MIT
>>   /*
>> - * Copyright © 2025 Intel Corporation
>> + * Copyright © 2025-2026 Intel Corporation
>>    */
>>   
>> +#include <sys/ioctl.h>
>> +
>>   #include "xe_gputop.h"
>>   
>>   #define engine_ptr(pmu_device, n) (&(pmu_device)->engine + (n))
>> @@ -92,6 +94,7 @@ void xe_gputop_init(void *ptr, int index,
>>   
>>   	obj = ((struct xe_gputop *)ptr) + index;
>>   	obj->card = card;
>> +	obj->pmu_device_obj = NULL;
>>   }
>>   
>>   static int pmu_format_shift(int xe, const char *name)
>> @@ -121,20 +124,54 @@ static int engine_cmp(const void *__a, const void *__b)
>>   		return a->drm_xe_engine.engine_instance - b->drm_xe_engine.engine_instance;
>>   }
>>   
>> +static int engine_query(int fd, struct drm_xe_query_engines **engine_q)
>> +{
>> +	struct drm_xe_device_query q;
>> +	int ret = 0, num_eng;
>> +
>> +	memset(&q, 0, sizeof(q));
>> +	q.query = DRM_XE_DEVICE_QUERY_ENGINES;
>> +
>> +	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
>> +	if (ret != 0) {
>> +		if ((errno != ENOSPC && errno != E2BIG) || q.size == 0)
> why are we ignoring ENOSPC and E2BIG? 
I have rectified this check. Seems like the driver is not returning any 
of these during size check.
>> +			return -1;
>> +	}
>> +	if (q.size == 0)
> we can check q.size just once instead of twice? 
Sure.
>> +		return 0;
>> +
>> +	num_eng = (int)((q.size - sizeof(struct drm_xe_query_engines)) /
>> +			sizeof(struct drm_xe_engine_class_instance));
>> +
>> +	*engine_q = (struct drm_xe_query_engines *)calloc(1, q.size);
>> +	if (!*engine_q)
>> +		return -1;
>> +
>> +	q.data = (uintptr_t)(*engine_q);
> better to use to_user_pointer from the lib? 
For igt "tools" it has been asked to not use any library function from 
lib as these are meant to be used only with IGT tests.
>> +
>> +	ret = ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &q);
>> +	if (ret != 0) {
>> +		free(*engine_q);
>> +		return -1;
>> +	}
>> +
>> +	return num_eng;
>> +}
>> +
>>   void *xe_populate_engines(const void *obj, int index)
>>   {
>>   	struct xe_gputop *ptr = ((struct xe_gputop *)obj) + index;
>>   	struct igt_device_card *card = ptr->card;
>>   	uint64_t engine_active_config, engine_total_config;
>>   	uint64_t engine_class, engine_instance, gt_shift;
>> -	struct drm_xe_engine_class_instance *hwe;
>> +	struct drm_xe_query_engines *engine_q;
>>   	struct xe_pmu_device *engines;
>>   	char device[30];
>> -	int ret = 0;
>> -	int card_fd;
>> +	int ret = 0, num_eng;
>> +	int card_fd = -1;
>>   
>>   	if (!card || !strlen(card->card) || !strlen(card->render))
>> -		return NULL;
>> +		goto err;
>>   
>>   	if (strlen(card->card)) {
>>   		card_fd = igt_open_card(card);
>> @@ -142,39 +179,51 @@ void *xe_populate_engines(const void *obj, int index)
>>   		card_fd = igt_open_render(card);
>>   	} else {
>>   		fprintf(stderr, "Failed to detect device!\n");
>> -		return NULL;
>> +		goto err;
>>   	}
>> -	xe_device_get(card_fd);
>> +
>> +	num_eng = engine_query(card_fd, &engine_q);
>> +	if (num_eng <= 0) {
>> +		fprintf(stderr, "Engine query failed!\n");
>> +		goto err;
>> +	}
>> +
>>   	engines = malloc(sizeof(struct xe_pmu_device) +
>> -			 xe_number_engines(card_fd) * sizeof(struct xe_engine));
>> +			 num_eng * sizeof(struct xe_engine));
>>   	if (!engines)
>> -		return NULL;
>> +		goto err;
>>   
>>   	memset(engines, 0, sizeof(struct xe_pmu_device) +
>> -	       xe_number_engines(card_fd) * sizeof(struct xe_engine));
>> +	       num_eng * sizeof(struct xe_engine));
>>   
>>   	engines->num_engines = 0;
>>   	gt_shift = pmu_format_shift(card_fd, "gt");
>>   	engine_class = pmu_format_shift(card_fd, "engine_class");
>>   	engine_instance = pmu_format_shift(card_fd, "engine_instance");
>>   	xe_perf_device(card_fd, device, sizeof(device));
>> +	close(card_fd);
>> +	card_fd = -1;
>> +
>>   	engines->device = strdup(device);
>>   	ret = perf_event_config(device, "engine-active-ticks", &engine_active_config);
>>   	if (ret < 0)
>> -		return NULL;
>> +		goto err;
>>   
>>   	ret = perf_event_config(device, "engine-total-ticks", &engine_total_config);
>>   	if (ret < 0)
>> -		return NULL;
>> +		goto err;
>>   
>> -	xe_for_each_engine(card_fd, hwe) {
>> +	while (engines->num_engines < engine_q->num_engines) {
>>   		uint64_t  param_config;
>>   		struct xe_engine *engine;
>> +		struct drm_xe_engine_class_instance hwe;
>> +
>> +		hwe = engine_q->engines[engines->num_engines].instance;
>>   
>>   		engine = engine_ptr(engines, engines->num_engines);
>> -		param_config = (uint64_t)hwe->gt_id << gt_shift | hwe->engine_class << engine_class
>> -			| hwe->engine_instance << engine_instance;
>> -		engine->drm_xe_engine = *hwe;
>> +		param_config = (uint64_t)hwe.gt_id << gt_shift | hwe.engine_class << engine_class
>> +			| hwe.engine_instance << engine_instance;
>> +		engine->drm_xe_engine = hwe;
>>   		engine->engine_active_ticks.config = engine_active_config | param_config;
>>   		engine->engine_total_ticks.config = engine_total_config | param_config;
>>   
>> @@ -185,7 +234,7 @@ void *xe_populate_engines(const void *obj, int index)
>>   		}
>>   
>>   		ret = asprintf(&engine->display_name, "GT:%u %s/%u",
>> -			       hwe->gt_id,
>> +			       hwe.gt_id,
>>   			       class_display_name(engine->drm_xe_engine.engine_class),
>>   			       engine->drm_xe_engine.engine_instance);
>>   
>> @@ -197,9 +246,11 @@ void *xe_populate_engines(const void *obj, int index)
>>   		engines->num_engines++;
>>   	}
>>   
>> +	free(engine_q);
>> +
>>   	if (!ret) {
>
> Not part of this patch, but looks like ret is overloaded here for 
> multiple things. Usually ret = 0 is a good thing? Should use num_bytes 
> or something if we expect asprintf above to return a positive value. 
> We should fix this as a separate patch.
>
Sure.

Thanks,
Soham

> Thanks,
>
> Vinay.
>
>>   		errno = ret;
>> -		return NULL;
>> +		goto err;
>>   	}
>>   
>>   	qsort(engine_ptr(engines, 0), engines->num_engines,
>> @@ -208,6 +259,15 @@ void *xe_populate_engines(const void *obj, int index)
>>   	ptr->pmu_device_obj = engines;
>>   
>>   	return engines;
>> +
>> +err:
>> +	if (card_fd >= 0)
>> +		close(card_fd);
>> +	if (engines)
>> +		free(engines);
>> +	if (engine_q)
>> +		free(engine_q);
>> +	return NULL;
>>   }
>>   
>>   static uint64_t pmu_read_multi(int fd, unsigned int num, uint64_t *val)
>> @@ -263,11 +323,11 @@ int xe_pmu_init(const void *obj, int index)
>>   		fd = _open_pmu(type, &engines->num_counters, &engine->engine_active_ticks,
>>   			       &engines->fd);
>>   		if (fd < 0)
>> -			return -1;
>> +			return fd;
>>   		fd = _open_pmu(type, &engines->num_counters, &engine->engine_total_ticks,
>>   			       &engines->fd);
>>   		if (fd < 0)
>> -			return -1;
>> +			return fd;
>>   	}
>>   	return 0;
>>   }
>> diff --git a/tools/gputop.src/xe_gputop.h b/tools/gputop.src/xe_gputop.h
>> index 1e3856071..dcd82de7d 100644
>> --- a/tools/gputop.src/xe_gputop.h
>> +++ b/tools/gputop.src/xe_gputop.h
>> @@ -7,11 +7,10 @@
>>   #define __XE_GPUTOP_H__
>>   
>>   #include <dirent.h>
>> +#include <xe_drm.h>
>>   
>> -#include "igt_device_scan.h"
>>   #include "igt_perf.h"
>>   #include "utils.h"
>> -#include "xe/xe_query.h"
>>   
>>   struct xe_pmu_pair {
>>   	uint64_t cur;

[-- Attachment #2: Type: text/html, Size: 9943 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances
  2026-02-12 23:28   ` Belgaumkar, Vinay
@ 2026-02-18 16:23     ` Purkait, Soham
  2026-02-18 17:10     ` Purkait, Soham
  1 sibling, 0 replies; 11+ messages in thread
From: Purkait, Soham @ 2026-02-18 16:23 UTC (permalink / raw)
  To: Belgaumkar, Vinay, igt-dev, riana.tauro, badal.nilawar,
	kamil.konieczny, ashutosh.dixit
  Cc: anshuman.gupta, umesh.nerlige.ramappa

Hi Vinay,

On 13-02-2026 04:58, Belgaumkar, Vinay wrote:
>
> On 1/30/2026 1:53 AM, Soham Purkait wrote:
>> Introduce vendor-agnostic support for handling multiple GPUs and
>> instances in gputop. Improve the tool's adaptability to various GPU
>> configurations.
>>
>> v1:
>>   - Refactor GPUTOP into a vendor-agnostic tool. (Lucas)
>> v2:
>>   - Cosmetic changes. (Riana)
>>   - Avoid three level indentation. (Riana)
>> v3:
>>   - Add device filter to populate the array of cards for
>>     all supported drivers. (Zbigniew)
>> v4:
>>   - Add user message for running without root privileges. (Kamil)
>> v5:
>>   - Add support for GPU client-only busyness on unsupported
>>     drivers as a fallback mechanism. (Kamil)
>> v6:
>>   - Remove unused dependencies and headers. (Kamil)
>>
>> Signed-off-by: Soham Purkait <soham.purkait@intel.com>
>> ---
>>   tools/gputop.src/gputop.c    | 278 +++++++++++++++++++++++++++++------
>>   tools/gputop.src/meson.build |   2 +-
>>   tools/meson.build            |   3 +-
>>   3 files changed, 240 insertions(+), 43 deletions(-)
>>
>> diff --git a/tools/gputop.src/gputop.c b/tools/gputop.src/gputop.c
>> index f577a1750..7d4515f8f 100644
>> --- a/tools/gputop.src/gputop.c
>> +++ b/tools/gputop.src/gputop.c
>> @@ -1,6 +1,6 @@
>>   // SPDX-License-Identifier: MIT
>>   /*
>> - * Copyright © 2023 Intel Corporation
>> + * Copyright © 2023-2025 Intel Corporation
>>    */
>>     #include <assert.h>
>> @@ -14,66 +14,145 @@
>>   #include <math.h>
>>   #include <poll.h>
>>   #include <signal.h>
>> +#include <stdbool.h>
>>   #include <stdint.h>
>>   #include <stdio.h>
>>   #include <stdlib.h>
>>   #include <string.h>
>>   #include <sys/ioctl.h>
>>   #include <sys/stat.h>
>> +#include <sys/sysmacros.h>
>>   #include <sys/types.h>
>> -#include <unistd.h>
>>   #include <termios.h>
>> -#include <sys/sysmacros.h>
>> -#include <stdbool.h>
>> +#include <unistd.h>
>>   -#include "igt_core.h"
>>   #include "igt_drm_clients.h"
>>   #include "igt_drm_fdinfo.h"
>>   #include "igt_profiling.h"
>> -#include "drmtest.h"
>> +#include "xe_gputop.h"
>> +
>> +/**
>> + * Supported Drivers
>> + *
>> + * Adhere to the following requirements when implementing support 
>> for the
>> + * new driver:
>> + * @drivers: Update drivers[] with driver string.
>> + * @sizeof_gputop_obj: Update this function as per new driver 
>> support included.
>> + * @operations: Update the respective operations of the new driver:
>> + * gputop_init,
>> + * discover_engines,
>> + * pmu_init,
>> + * pmu_sample,
>> + * print_engines,
>> + * clean_up
>> + * @per_driver_contexts: Update per_driver_contexts[] array of type 
>> "struct gputop_driver" with the
>> + * initial values.
>> + */
>> +static const char * const drivers[] = {
>> +    "xe",
>> +    /* Keep the last one as NULL */
>> +    NULL
>> +};
>> +
>> +static size_t sizeof_gputop_obj(int driver_num)
>> +{
>> +    switch (driver_num) {
>> +    case 0:
> Might be worthwhile using an enum like INTEL_XE_DRIVER instead of 
> hardcoded values.
Sure.
>> +        return sizeof(struct xe_gputop);
>> +    default:
>> +        fprintf(stderr,
>> +            "Driver number does not exist.\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +}
>> +
>> +/**
>> + * Supported operations on driver instances. Update the ops[] array for
>> + * each individual driver specific function. Maintain the sequence 
>> as per
>> + * drivers[] array.
>> + */
>> +struct device_operations ops[] = {
>> +    {
>> +        xe_gputop_init,
>> +        xe_populate_engines,
>> +        xe_pmu_init,
>> +        xe_pmu_sample,
>> +        xe_print_engines,
>> +        xe_clean_up
>> +    }
>> +};
>> +
>> +/*
>> + * per_driver_contexts[] array of type struct gputop_driver which 
>> keeps track of the devices
>> + * and related info discovered per driver.
>> + */
>> +struct gputop_driver per_driver_contexts[] = {
> This is already per driver, no need to add that in the name? just 
> driver_context or pci_driver_context? Also, where is gputop_driver 
> struct defined? (Didn't find it in this patch set)
In my view this array is indexed by driver, keeping "per_driver_" avoids 
the confusion. gputop_driver is defined in tools/gputop/utils.h (not 
included in this patch).
>> +    {false, 0, NULL}
>> +};
>>     enum utilization_type {
>>       UTILIZATION_TYPE_ENGINE_TIME,
>>       UTILIZATION_TYPE_TOTAL_CYCLES,
>>   };
>>   -static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", 
>> "▉", "█" };
>> -
>> -#define ANSI_HEADER "\033[7m"
>> -#define ANSI_RESET "\033[0m"
>> -
>> -static void n_spaces(const unsigned int n)
>> +static void gputop_clean_up(void)
>>   {
>> -    unsigned int i;
>> -
>> -    for (i = 0; i < n; i++)
>> -        putchar(' ');
>> +    for (int i = 0; drivers[i]; i++) {
>> +        ops[i].clean_up(per_driver_contexts[i].instances, 
>> per_driver_contexts[i].len);
>> +        free(per_driver_contexts[i].instances);
>> +        per_driver_contexts[i].device_present = false;
>> +        per_driver_contexts[i].len = 0;
>> +    }
>>   }
>>   -static void print_percentage_bar(double percent, int max_len)
>> +static int find_driver(struct igt_device_card *card)
> should be find_pci_driver?
Seems like "find_driver" is more appropriate. This is finding the driver 
index but not using any "pci only" condition.
>>   {
>> -    int bar_len, i, len = max_len - 1;
>> -    const int w = 8;
>> -
>> -    len -= printf("|%5.1f%% ", percent);
>> -
>> -    /* no space left for bars, do what we can */
>> -    if (len < 0)
>> -        len = 0;
>> -
>> -    bar_len = ceil(w * percent * len / 100.0);
>> -    if (bar_len > w * len)
>> -        bar_len = w * len;
>> +    for (int i = 0; drivers[i]; i++) {
>> +        if (strcmp(drivers[i], card->driver) == 0)
>> +            return i;
>> +    }
>> +    return -1;
>> +}
>>   -    for (i = bar_len; i >= w; i -= w)
>> -        printf("%s", bars[w]);
>> -    if (i)
>> -        printf("%s", bars[i]);
>> +static int populate_device_instances(const char *filter)
>> +{
>> +    struct igt_device_card *cards = NULL;
>> +    struct igt_device_card *card_inplace = NULL;
>> +    struct gputop_driver *driver_entry =  NULL;
>> +    int driver_no;
> should this be driver_num or driver_id?
Sure.
>> +    int count, final_count = 0;
>> +
>> +    count = igt_device_card_match_all(filter, &cards);
>> +    for (int j = 0; j < count; j++) {
>> +        if (strcmp(cards[j].subsystem, "pci") != 0)
>> +            continue;
>>   -    len -= (bar_len + (w - 1)) / w;
>> -    n_spaces(len);
>> +        driver_no = find_driver(&cards[j]);
>> +        if (driver_no < 0)
>> +            continue;
>>   -    putchar('|');
>> +        driver_entry = &per_driver_contexts[driver_no];
>> +        if (!driver_entry->device_present)
>> +            driver_entry->device_present = true;
> is this check only for reassignment of device_present or should it 
> include the below statements as well?
This check  sets device_present to true only once if at least one device 
for that specific driver is found.
>> +        driver_entry->len++;
>> +        driver_entry->instances = realloc(driver_entry->instances,
>> +                          driver_entry->len * 
>> sizeof_gputop_obj(driver_no));
> should this be renamed to driver_entry->engine_instances so we are 
> clear about which instances this refers to?

This instance is not exactly the engine instance. It represents a device 
of a specific driver. If there are multiple devices of a specific driver 
then there should be equal number of such instances and so the function 
is named as "populate_device_instances". It is documented where 
gputop_driver is defined.

Thanks,
Soham

>> +        if (!driver_entry->instances) {
>> +            fprintf(stderr,
>> +                "Device instance realloc failed (%s)\n",
>> +                strerror(errno));
>> +            exit(EXIT_FAILURE);
>> +        }
>
> should these be under the conditional if check above as well? Or will 
> we end up incrementing len twice and duplicate realloc? Unless this 
> refers to driver instances?
>
> Thanks,
>
> Vinay.
>
>> +        card_inplace = (struct igt_device_card *)
>> +                calloc(1, sizeof(struct igt_device_card));
>> +        memcpy(card_inplace, &cards[j], sizeof(struct 
>> igt_device_card));
>> +        ops[driver_no].gputop_init(driver_entry->instances, 
>> (driver_entry->len - 1),
>> +            card_inplace);
>> +        final_count++;
>> +    }
>> +    if (count)
>> +        free(cards);
>> +    return final_count;
>>   }
>>     static int
>> @@ -335,6 +414,28 @@ struct gputop_args {
>>       unsigned long delay_usec;
>>   };
>>   +static void countdown(const char *msg, const int start_sec)
>> +{
>> +    struct pollfd pfd;
>> +    int i, ret;
>> +    char ch;
>> +
>> +    for (i = start_sec; i > 0; i--) {
>> +        printf("\r%s%d... second(s)", msg, i);
>> +        fflush(stdout);
>> +
>> +        pfd.fd = STDIN_FILENO;
>> +        pfd.events = POLLIN;
>> +
>> +        ret = poll(&pfd, 1, 1000);
>> +        if (ret > 0 && (pfd.revents & POLLIN)) {
>> +            while ((ch = getchar()) != '\n' && ch != EOF)
>> +                continue;
>> +            return;
>> +        }
>> +    }
>> +}
>> +
>>   static void help(char *full_path)
>>   {
>>       const char *short_program_name = strrchr(full_path, '/');
>> @@ -349,7 +450,32 @@ static void help(char *full_path)
>>              "Options:\n"
>>              "\t-h, --help                show this help\n"
>>              "\t-d, --delay =SEC[.TENTHS] iterative delay as SECS 
>> [.TENTHS]\n"
>> -           "\t-n, --iterations =NUMBER  number of executions\n"
>> +           "\t-n, --iterations =NUMBER  number of executions\n\n"
>> +           "Running without root:\n"
>> +           "\tAs a non-root user, CAP_PERFMON or perf_event_paranoid 
>> is required to\n"
>> +           "\taccess engine busyness\n"
>> +           "\t" ANSI_HEADER "Steps to run without root (using 
>> CAP_PERFMON):"
>> +        ANSI_RESET "\n"
>> +           "\tcd /path/to/igt-gpu-tools/\n"
>> +           "\tsudo setcap cap_perfmon=+ep $(pwd)/build/tools/gputop\n"
>> +           "\tsudo sh -c \"echo $(pwd)/build/lib > 
>> /etc/ld.so.conf.d/lib-igt.conf\"\n"
>> +           "\tsudo ldconfig\n"
>> +           "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET 
>> "\n"
>> +           "\tsudo setcap cap_perfmon=-ep $(pwd)/build/tools/gputop\n"
>> +           "\tsudo rm /etc/ld.so.conf.d/lib-igt.conf\n"
>> +           "\tsudo ldconfig\n"
>> +           "\n"
>> +           "\t" ANSI_HEADER "Steps to run without root (using 
>> perf_event_paranoid):"
>> +           ANSI_RESET "\n"
>> +           "\t\033[32m# Save current perf_event_paranoid 
>> value\033[0m\n"
>> +           "\torig_val=$(sysctl -n kernel.perf_event_paranoid)\n"
>> +           "\tsudo sysctl -w kernel.perf_event_paranoid=-1\n"
>> +           "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET 
>> "\n"
>> +           "\t\033[32m# Restore original value\033[0m\n"
>> +           "\tsudo sysctl -w kernel.perf_event_paranoid=$orig_val\n\n"
>> +           "\tFor details, see 'Perf events and tool security':\n"
>> +           "\thttps://www.kernel.org/doc/html/"
>> +           "latest/admin-guide/perf-security.html\n\n"
>>              , short_program_name);
>>   }
>>   @@ -417,9 +543,12 @@ int main(int argc, char **argv)
>>       struct igt_profiled_device *profiled_devices = NULL;
>>       struct igt_drm_clients *clients = NULL;
>>       int con_w = -1, con_h = -1;
>> +    bool is_root;
>>       int ret;
>>       long n;
>>   +    is_root = (geteuid() == 0);
>> +
>>       ret = parse_args(argc, argv, &args);
>>       if (ret < 0)
>>           return EXIT_FAILURE;
>> @@ -428,6 +557,53 @@ int main(int argc, char **argv)
>>         n = args.n_iter;
>>       period_us = args.delay_usec;
>> +    populate_device_instances("device:subsystem=pci,card=all");
>> +
>> +    for (int i = 0; drivers[i]; i++) {
>> +        if (!per_driver_contexts[i].device_present)
>> +            continue;
>> +
>> +        for (int j = 0; j < per_driver_contexts[i].len; j++) {
>> +            if 
>> (!ops[i].init_engines(per_driver_contexts[i].instances, j)) {
>> +                fprintf(stderr,
>> +                    "Failed to initialize engines! (%s)\n",
>> +                    strerror(errno));
>> +                    gputop_clean_up();
>> +                return EXIT_FAILURE;
>> +            }
>> +            ret = ops[i].pmu_init(per_driver_contexts[i].instances, j);
>> +
>> +            if (ret) {
>> +                if (errno == EACCES && !is_root) {
>> +                    fprintf(stderr,
>> +                        "\n"
>> +                        "Running without root privileges.\n"
>> +                        "Engine busyness may not be available "
>> +                        "without root privileges.\n"
>> +                        "See \"--help\" to enable engine "
>> +                        "busyness without root.\n\n");
>> +                    igt_devices_free();
>> +                    gputop_clean_up();
>> +                    countdown("Resuming with only gpu client "
>> +                          "busyness in ", 5);
>> +                } else {
>> +                    fprintf(stderr,
>> +                        "Failed to initialize PMU! (%s)\n",
>> +                        strerror(errno));
>> +                    igt_devices_free();
>> +                    gputop_clean_up();
>> +                    return EXIT_FAILURE;
>> +                }
>> +            }
>> +        }
>> +    }
>> +
>> +    for (int i = 0; drivers[i]; i++) {
>> +        for (int j = 0;
>> +             per_driver_contexts[i].device_present && j < 
>> per_driver_contexts[i].len;
>> +             j++)
>> +            ops[i].pmu_sample(per_driver_contexts[i].instances, j);
>> +    }
>>         clients = igt_drm_clients_init(NULL);
>>       if (!clients)
>> @@ -449,22 +625,42 @@ int main(int argc, char **argv)
>>         while ((n != 0) && !stop_top) {
>>           struct igt_drm_client *c, *prevc = NULL;
>> -        int i, engine_w = 0, lines = 0;
>> +        int k, engine_w = 0, lines = 0;
>>             igt_drm_clients_scan(clients, NULL, NULL, 0, NULL, 0);
>> +
>> +        for (int i = 0; drivers[i]; i++) {
>> +            for (int j = 0;
>> +                 per_driver_contexts[i].device_present &&
>> +                 j < per_driver_contexts[i].len;
>> +                 j++)
>> + ops[i].pmu_sample(per_driver_contexts[i].instances, j);
>> +        }
>> +
>>           igt_drm_clients_sort(clients, client_cmp);
>>             update_console_size(&con_w, &con_h);
>>           clrscr();
>>   +        for (int i = 0; drivers[i]; i++) {
>> +            for (int j = 0;
>> +                 per_driver_contexts[i].device_present &&
>> +                 j < per_driver_contexts[i].len;
>> +                 j++) {
>> +                lines = 
>> ops[i].print_engines(per_driver_contexts[i].instances, j,
>> +                             lines, con_w, con_h);
>> +            }
>> +        }
>> +
>>           if (!clients->num_clients) {
>> -            const char *msg = " (No GPU clients yet. Start workload 
>> to see stats)";
>> +            const char *msg;
>>   +            msg = " (No GPU clients yet. Start workload to see 
>> stats)";
>>               printf(ANSI_HEADER "%-*s" ANSI_RESET "\n",
>>                      (int)(con_w - strlen(msg) - 1), msg);
>>           }
>>   -        igt_for_each_drm_client(clients, c, i) {
>> +        igt_for_each_drm_client(clients, c, k) {
>>               assert(c->status != IGT_DRM_CLIENT_PROBE);
>>               if (c->status != IGT_DRM_CLIENT_ALIVE)
>>                   break; /* Active clients are first in the array. */
>> @@ -488,11 +684,11 @@ int main(int argc, char **argv)
>>       }
>>         igt_drm_clients_free(clients);
>> +    gputop_clean_up();
>>         if (profiled_devices != NULL) {
>>           igt_devices_configure_profiling(profiled_devices, false);
>>           igt_devices_free_profiling(profiled_devices);
>>       }
>> -
>>       return 0;
>>   }
>> diff --git a/tools/gputop.src/meson.build b/tools/gputop.src/meson.build
>> index ec39f4c7a..e95657fca 100644
>> --- a/tools/gputop.src/meson.build
>> +++ b/tools/gputop.src/meson.build
>> @@ -1 +1 @@
>> -gputop_src = files('gputop.c')
>> +gputop_src = files('gputop.c', 'utils.c', 'xe_gputop.c')
>> diff --git a/tools/meson.build b/tools/meson.build
>> index 521607a4c..caca57d0e 100644
>> --- a/tools/meson.build
>> +++ b/tools/meson.build
>> @@ -123,4 +123,5 @@ subdir('gputop.src')
>>   executable('gputop', sources : gputop_src,
>>          install : true,
>>          install_rpath : bindir_rpathdir,
>> -       dependencies : 
>> [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math],)
>> +       dependencies : 
>> [lib_igt_perf,lib_igt_device_scan,lib_igt_drm_clients,
>> +                  lib_igt_drm_fdinfo,lib_igt_profiling,math],)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances
  2026-02-12 23:28   ` Belgaumkar, Vinay
  2026-02-18 16:23     ` Purkait, Soham
@ 2026-02-18 17:10     ` Purkait, Soham
  1 sibling, 0 replies; 11+ messages in thread
From: Purkait, Soham @ 2026-02-18 17:10 UTC (permalink / raw)
  To: Belgaumkar, Vinay, igt-dev, riana.tauro, badal.nilawar,
	kamil.konieczny, ashutosh.dixit
  Cc: anshuman.gupta, umesh.nerlige.ramappa

Hi Vinay,

On 13-02-2026 04:58, Belgaumkar, Vinay wrote:
>
> On 1/30/2026 1:53 AM, Soham Purkait wrote:
>> Introduce vendor-agnostic support for handling multiple GPUs and
>> instances in gputop. Improve the tool's adaptability to various GPU
>> configurations.
>>
>> v1:
>>   - Refactor GPUTOP into a vendor-agnostic tool. (Lucas)
>> v2:
>>   - Cosmetic changes. (Riana)
>>   - Avoid three level indentation. (Riana)
>> v3:
>>   - Add device filter to populate the array of cards for
>>     all supported drivers. (Zbigniew)
>> v4:
>>   - Add user message for running without root privileges. (Kamil)
>> v5:
>>   - Add support for GPU client-only busyness on unsupported
>>     drivers as a fallback mechanism. (Kamil)
>> v6:
>>   - Remove unused dependencies and headers. (Kamil)
>>
>> Signed-off-by: Soham Purkait <soham.purkait@intel.com>
>> ---
>>   tools/gputop.src/gputop.c    | 278 +++++++++++++++++++++++++++++------
>>   tools/gputop.src/meson.build |   2 +-
>>   tools/meson.build            |   3 +-
>>   3 files changed, 240 insertions(+), 43 deletions(-)
>>
>> diff --git a/tools/gputop.src/gputop.c b/tools/gputop.src/gputop.c
>> index f577a1750..7d4515f8f 100644
>> --- a/tools/gputop.src/gputop.c
>> +++ b/tools/gputop.src/gputop.c
>> @@ -1,6 +1,6 @@
>>   // SPDX-License-Identifier: MIT
>>   /*
>> - * Copyright © 2023 Intel Corporation
>> + * Copyright © 2023-2025 Intel Corporation
>>    */
>>     #include <assert.h>
>> @@ -14,66 +14,145 @@
>>   #include <math.h>
>>   #include <poll.h>
>>   #include <signal.h>
>> +#include <stdbool.h>
>>   #include <stdint.h>
>>   #include <stdio.h>
>>   #include <stdlib.h>
>>   #include <string.h>
>>   #include <sys/ioctl.h>
>>   #include <sys/stat.h>
>> +#include <sys/sysmacros.h>
>>   #include <sys/types.h>
>> -#include <unistd.h>
>>   #include <termios.h>
>> -#include <sys/sysmacros.h>
>> -#include <stdbool.h>
>> +#include <unistd.h>
>>   -#include "igt_core.h"
>>   #include "igt_drm_clients.h"
>>   #include "igt_drm_fdinfo.h"
>>   #include "igt_profiling.h"
>> -#include "drmtest.h"
>> +#include "xe_gputop.h"
>> +
>> +/**
>> + * Supported Drivers
>> + *
>> + * Adhere to the following requirements when implementing support 
>> for the
>> + * new driver:
>> + * @drivers: Update drivers[] with driver string.
>> + * @sizeof_gputop_obj: Update this function as per new driver 
>> support included.
>> + * @operations: Update the respective operations of the new driver:
>> + * gputop_init,
>> + * discover_engines,
>> + * pmu_init,
>> + * pmu_sample,
>> + * print_engines,
>> + * clean_up
>> + * @per_driver_contexts: Update per_driver_contexts[] array of type 
>> "struct gputop_driver" with the
>> + * initial values.
>> + */
>> +static const char * const drivers[] = {
>> +    "xe",
>> +    /* Keep the last one as NULL */
>> +    NULL
>> +};
>> +
>> +static size_t sizeof_gputop_obj(int driver_num)
>> +{
>> +    switch (driver_num) {
>> +    case 0:
> Might be worthwhile using an enum like INTEL_XE_DRIVER instead of 
> hardcoded values.
>> +        return sizeof(struct xe_gputop);
>> +    default:
>> +        fprintf(stderr,
>> +            "Driver number does not exist.\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +}
>> +
>> +/**
>> + * Supported operations on driver instances. Update the ops[] array for
>> + * each individual driver specific function. Maintain the sequence 
>> as per
>> + * drivers[] array.
>> + */
>> +struct device_operations ops[] = {
>> +    {
>> +        xe_gputop_init,
>> +        xe_populate_engines,
>> +        xe_pmu_init,
>> +        xe_pmu_sample,
>> +        xe_print_engines,
>> +        xe_clean_up
>> +    }
>> +};
>> +
>> +/*
>> + * per_driver_contexts[] array of type struct gputop_driver which 
>> keeps track of the devices
>> + * and related info discovered per driver.
>> + */
>> +struct gputop_driver per_driver_contexts[] = {
> This is already per driver, no need to add that in the name? just 
> driver_context or pci_driver_context? Also, where is gputop_driver 
> struct defined? (Didn't find it in this patch set)
>> +    {false, 0, NULL}
>> +};
>>     enum utilization_type {
>>       UTILIZATION_TYPE_ENGINE_TIME,
>>       UTILIZATION_TYPE_TOTAL_CYCLES,
>>   };
>>   -static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", 
>> "▉", "█" };
>> -
>> -#define ANSI_HEADER "\033[7m"
>> -#define ANSI_RESET "\033[0m"
>> -
>> -static void n_spaces(const unsigned int n)
>> +static void gputop_clean_up(void)
>>   {
>> -    unsigned int i;
>> -
>> -    for (i = 0; i < n; i++)
>> -        putchar(' ');
>> +    for (int i = 0; drivers[i]; i++) {
>> +        ops[i].clean_up(per_driver_contexts[i].instances, 
>> per_driver_contexts[i].len);
>> +        free(per_driver_contexts[i].instances);
>> +        per_driver_contexts[i].device_present = false;
>> +        per_driver_contexts[i].len = 0;
>> +    }
>>   }
>>   -static void print_percentage_bar(double percent, int max_len)
>> +static int find_driver(struct igt_device_card *card)
> should be find_pci_driver?
>>   {
>> -    int bar_len, i, len = max_len - 1;
>> -    const int w = 8;
>> -
>> -    len -= printf("|%5.1f%% ", percent);
>> -
>> -    /* no space left for bars, do what we can */
>> -    if (len < 0)
>> -        len = 0;
>> -
>> -    bar_len = ceil(w * percent * len / 100.0);
>> -    if (bar_len > w * len)
>> -        bar_len = w * len;
>> +    for (int i = 0; drivers[i]; i++) {
>> +        if (strcmp(drivers[i], card->driver) == 0)
>> +            return i;
>> +    }
>> +    return -1;
>> +}
>>   -    for (i = bar_len; i >= w; i -= w)
>> -        printf("%s", bars[w]);
>> -    if (i)
>> -        printf("%s", bars[i]);
>> +static int populate_device_instances(const char *filter)
>> +{
>> +    struct igt_device_card *cards = NULL;
>> +    struct igt_device_card *card_inplace = NULL;
>> +    struct gputop_driver *driver_entry =  NULL;
>> +    int driver_no;
> should this be driver_num or driver_id?
>> +    int count, final_count = 0;
>> +
>> +    count = igt_device_card_match_all(filter, &cards);
>> +    for (int j = 0; j < count; j++) {
>> +        if (strcmp(cards[j].subsystem, "pci") != 0)
>> +            continue;
>>   -    len -= (bar_len + (w - 1)) / w;
>> -    n_spaces(len);
>> +        driver_no = find_driver(&cards[j]);
>> +        if (driver_no < 0)
>> +            continue;
>>   -    putchar('|');
>> +        driver_entry = &per_driver_contexts[driver_no];
>> +        if (!driver_entry->device_present)
>> +            driver_entry->device_present = true;
> is this check only for reassignment of device_present or should it 
> include the below statements as well?
>> +        driver_entry->len++;
>> +        driver_entry->instances = realloc(driver_entry->instances,
>> +                          driver_entry->len * 
>> sizeof_gputop_obj(driver_no));
> should this be renamed to driver_entry->engine_instances so we are 
> clear about which instances this refers to?
>> +        if (!driver_entry->instances) {
>> +            fprintf(stderr,
>> +                "Device instance realloc failed (%s)\n",
>> +                strerror(errno));
>> +            exit(EXIT_FAILURE);
>> +        }
>
> should these be under the conditional if check above as well? Or will 
> we end up incrementing len twice and duplicate realloc? Unless this 
> refers to driver instances?

I guess no. The  device_present (for a specific driver) is set to true 
only once during the 1st device discovery of that specific driver. After 
that if any more device of that driver is present it is not touched but 
len does increases accordingly.

Thanks,
Soham

>
> Thanks,
>
> Vinay.
>
>> +        card_inplace = (struct igt_device_card *)
>> +                calloc(1, sizeof(struct igt_device_card));
>> +        memcpy(card_inplace, &cards[j], sizeof(struct 
>> igt_device_card));
>> +        ops[driver_no].gputop_init(driver_entry->instances, 
>> (driver_entry->len - 1),
>> +            card_inplace);
>> +        final_count++;
>> +    }
>> +    if (count)
>> +        free(cards);
>> +    return final_count;
>>   }
>>     static int
>> @@ -335,6 +414,28 @@ struct gputop_args {
>>       unsigned long delay_usec;
>>   };
>>   +static void countdown(const char *msg, const int start_sec)
>> +{
>> +    struct pollfd pfd;
>> +    int i, ret;
>> +    char ch;
>> +
>> +    for (i = start_sec; i > 0; i--) {
>> +        printf("\r%s%d... second(s)", msg, i);
>> +        fflush(stdout);
>> +
>> +        pfd.fd = STDIN_FILENO;
>> +        pfd.events = POLLIN;
>> +
>> +        ret = poll(&pfd, 1, 1000);
>> +        if (ret > 0 && (pfd.revents & POLLIN)) {
>> +            while ((ch = getchar()) != '\n' && ch != EOF)
>> +                continue;
>> +            return;
>> +        }
>> +    }
>> +}
>> +
>>   static void help(char *full_path)
>>   {
>>       const char *short_program_name = strrchr(full_path, '/');
>> @@ -349,7 +450,32 @@ static void help(char *full_path)
>>              "Options:\n"
>>              "\t-h, --help                show this help\n"
>>              "\t-d, --delay =SEC[.TENTHS] iterative delay as SECS 
>> [.TENTHS]\n"
>> -           "\t-n, --iterations =NUMBER  number of executions\n"
>> +           "\t-n, --iterations =NUMBER  number of executions\n\n"
>> +           "Running without root:\n"
>> +           "\tAs a non-root user, CAP_PERFMON or perf_event_paranoid 
>> is required to\n"
>> +           "\taccess engine busyness\n"
>> +           "\t" ANSI_HEADER "Steps to run without root (using 
>> CAP_PERFMON):"
>> +        ANSI_RESET "\n"
>> +           "\tcd /path/to/igt-gpu-tools/\n"
>> +           "\tsudo setcap cap_perfmon=+ep $(pwd)/build/tools/gputop\n"
>> +           "\tsudo sh -c \"echo $(pwd)/build/lib > 
>> /etc/ld.so.conf.d/lib-igt.conf\"\n"
>> +           "\tsudo ldconfig\n"
>> +           "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET 
>> "\n"
>> +           "\tsudo setcap cap_perfmon=-ep $(pwd)/build/tools/gputop\n"
>> +           "\tsudo rm /etc/ld.so.conf.d/lib-igt.conf\n"
>> +           "\tsudo ldconfig\n"
>> +           "\n"
>> +           "\t" ANSI_HEADER "Steps to run without root (using 
>> perf_event_paranoid):"
>> +           ANSI_RESET "\n"
>> +           "\t\033[32m# Save current perf_event_paranoid 
>> value\033[0m\n"
>> +           "\torig_val=$(sysctl -n kernel.perf_event_paranoid)\n"
>> +           "\tsudo sysctl -w kernel.perf_event_paranoid=-1\n"
>> +           "\t" ANSI_HEADER "Steps to revert once done:" ANSI_RESET 
>> "\n"
>> +           "\t\033[32m# Restore original value\033[0m\n"
>> +           "\tsudo sysctl -w kernel.perf_event_paranoid=$orig_val\n\n"
>> +           "\tFor details, see 'Perf events and tool security':\n"
>> +           "\thttps://www.kernel.org/doc/html/"
>> +           "latest/admin-guide/perf-security.html\n\n"
>>              , short_program_name);
>>   }
>>   @@ -417,9 +543,12 @@ int main(int argc, char **argv)
>>       struct igt_profiled_device *profiled_devices = NULL;
>>       struct igt_drm_clients *clients = NULL;
>>       int con_w = -1, con_h = -1;
>> +    bool is_root;
>>       int ret;
>>       long n;
>>   +    is_root = (geteuid() == 0);
>> +
>>       ret = parse_args(argc, argv, &args);
>>       if (ret < 0)
>>           return EXIT_FAILURE;
>> @@ -428,6 +557,53 @@ int main(int argc, char **argv)
>>         n = args.n_iter;
>>       period_us = args.delay_usec;
>> +    populate_device_instances("device:subsystem=pci,card=all");
>> +
>> +    for (int i = 0; drivers[i]; i++) {
>> +        if (!per_driver_contexts[i].device_present)
>> +            continue;
>> +
>> +        for (int j = 0; j < per_driver_contexts[i].len; j++) {
>> +            if 
>> (!ops[i].init_engines(per_driver_contexts[i].instances, j)) {
>> +                fprintf(stderr,
>> +                    "Failed to initialize engines! (%s)\n",
>> +                    strerror(errno));
>> +                    gputop_clean_up();
>> +                return EXIT_FAILURE;
>> +            }
>> +            ret = ops[i].pmu_init(per_driver_contexts[i].instances, j);
>> +
>> +            if (ret) {
>> +                if (errno == EACCES && !is_root) {
>> +                    fprintf(stderr,
>> +                        "\n"
>> +                        "Running without root privileges.\n"
>> +                        "Engine busyness may not be available "
>> +                        "without root privileges.\n"
>> +                        "See \"--help\" to enable engine "
>> +                        "busyness without root.\n\n");
>> +                    igt_devices_free();
>> +                    gputop_clean_up();
>> +                    countdown("Resuming with only gpu client "
>> +                          "busyness in ", 5);
>> +                } else {
>> +                    fprintf(stderr,
>> +                        "Failed to initialize PMU! (%s)\n",
>> +                        strerror(errno));
>> +                    igt_devices_free();
>> +                    gputop_clean_up();
>> +                    return EXIT_FAILURE;
>> +                }
>> +            }
>> +        }
>> +    }
>> +
>> +    for (int i = 0; drivers[i]; i++) {
>> +        for (int j = 0;
>> +             per_driver_contexts[i].device_present && j < 
>> per_driver_contexts[i].len;
>> +             j++)
>> +            ops[i].pmu_sample(per_driver_contexts[i].instances, j);
>> +    }
>>         clients = igt_drm_clients_init(NULL);
>>       if (!clients)
>> @@ -449,22 +625,42 @@ int main(int argc, char **argv)
>>         while ((n != 0) && !stop_top) {
>>           struct igt_drm_client *c, *prevc = NULL;
>> -        int i, engine_w = 0, lines = 0;
>> +        int k, engine_w = 0, lines = 0;
>>             igt_drm_clients_scan(clients, NULL, NULL, 0, NULL, 0);
>> +
>> +        for (int i = 0; drivers[i]; i++) {
>> +            for (int j = 0;
>> +                 per_driver_contexts[i].device_present &&
>> +                 j < per_driver_contexts[i].len;
>> +                 j++)
>> + ops[i].pmu_sample(per_driver_contexts[i].instances, j);
>> +        }
>> +
>>           igt_drm_clients_sort(clients, client_cmp);
>>             update_console_size(&con_w, &con_h);
>>           clrscr();
>>   +        for (int i = 0; drivers[i]; i++) {
>> +            for (int j = 0;
>> +                 per_driver_contexts[i].device_present &&
>> +                 j < per_driver_contexts[i].len;
>> +                 j++) {
>> +                lines = 
>> ops[i].print_engines(per_driver_contexts[i].instances, j,
>> +                             lines, con_w, con_h);
>> +            }
>> +        }
>> +
>>           if (!clients->num_clients) {
>> -            const char *msg = " (No GPU clients yet. Start workload 
>> to see stats)";
>> +            const char *msg;
>>   +            msg = " (No GPU clients yet. Start workload to see 
>> stats)";
>>               printf(ANSI_HEADER "%-*s" ANSI_RESET "\n",
>>                      (int)(con_w - strlen(msg) - 1), msg);
>>           }
>>   -        igt_for_each_drm_client(clients, c, i) {
>> +        igt_for_each_drm_client(clients, c, k) {
>>               assert(c->status != IGT_DRM_CLIENT_PROBE);
>>               if (c->status != IGT_DRM_CLIENT_ALIVE)
>>                   break; /* Active clients are first in the array. */
>> @@ -488,11 +684,11 @@ int main(int argc, char **argv)
>>       }
>>         igt_drm_clients_free(clients);
>> +    gputop_clean_up();
>>         if (profiled_devices != NULL) {
>>           igt_devices_configure_profiling(profiled_devices, false);
>>           igt_devices_free_profiling(profiled_devices);
>>       }
>> -
>>       return 0;
>>   }
>> diff --git a/tools/gputop.src/meson.build b/tools/gputop.src/meson.build
>> index ec39f4c7a..e95657fca 100644
>> --- a/tools/gputop.src/meson.build
>> +++ b/tools/gputop.src/meson.build
>> @@ -1 +1 @@
>> -gputop_src = files('gputop.c')
>> +gputop_src = files('gputop.c', 'utils.c', 'xe_gputop.c')
>> diff --git a/tools/meson.build b/tools/meson.build
>> index 521607a4c..caca57d0e 100644
>> --- a/tools/meson.build
>> +++ b/tools/meson.build
>> @@ -123,4 +123,5 @@ subdir('gputop.src')
>>   executable('gputop', sources : gputop_src,
>>          install : true,
>>          install_rpath : bindir_rpathdir,
>> -       dependencies : 
>> [lib_igt_drm_clients,lib_igt_drm_fdinfo,lib_igt_profiling,math],)
>> +       dependencies : 
>> [lib_igt_perf,lib_igt_device_scan,lib_igt_drm_clients,
>> +                  lib_igt_drm_fdinfo,lib_igt_profiling,math],)

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-02-18 17:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-30  9:53 [PATCH i-g-t v7 0/5] Close any open drm device after engine initialization in GPUTOP Soham Purkait
2026-01-30  9:53 ` [PATCH i-g-t v7 1/5] tools: Rename tools/gputop to tools/gputop.src Soham Purkait
2026-01-30  9:53 ` [PATCH i-g-t v7 2/5] tools/gputop.src/utils: Add clamp macro to remove dependency on lib/xe/* Soham Purkait
2026-01-30  9:53 ` [PATCH i-g-t v7 3/5] tools/gputop.src/xe_gputop: Refactor xe_populate_engines to close card_fd and use direct ioctl calls Soham Purkait
2026-02-12  1:41   ` Belgaumkar, Vinay
2026-02-18 15:27     ` Purkait, Soham
2026-01-30  9:53 ` [PATCH i-g-t v7 4/5] tools/gputop.src/gputop: Enable support for multiple GPUs and instances Soham Purkait
2026-02-12 23:28   ` Belgaumkar, Vinay
2026-02-18 16:23     ` Purkait, Soham
2026-02-18 17:10     ` Purkait, Soham
2026-01-30  9:53 ` [PATCH i-g-t v7 5/5] tools/gputop.src/gputop: Add command line option for device filter Soham Purkait

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox