* [PATCH v14 1/7] perf/core: Add PMU_EVENT_ATTR_ID_STRING
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-02-22 12:42 ` Ingo Molnar
2025-01-22 6:23 ` [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface Lucas De Marchi
` (5 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
struct perf_pmu_events_attr has both id and event_str, but
PMU_EVENT_ATTR_STRING zeroes the id and only sets event_str. Add another
macro that allows setting both so drivers can make use of them. The id is
useful for determining the visibility of the attributes without resorting
to creating separate groups passed via attr_update, while the event_str is
still useful for attributes like *.unit or *.scale.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
include/linux/perf_event.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index cb99ec8c9e96f..423f21b51cb0f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1923,13 +1923,16 @@ static struct perf_pmu_events_attr _var = { \
.id = _id, \
};
-#define PMU_EVENT_ATTR_STRING(_name, _var, _str) \
+#define PMU_EVENT_ATTR_ID_STRING(_name, _var, _id, _str) \
static struct perf_pmu_events_attr _var = { \
.attr = __ATTR(_name, 0444, perf_event_sysfs_show, NULL), \
- .id = 0, \
+ .id = _id, \
.event_str = _str, \
};
+#define PMU_EVENT_ATTR_STRING(_name, _var, _str) \
+ PMU_EVENT_ATTR_ID_STRING(_name, _var, 0, _str)
+
#define PMU_EVENT_ATTR_ID(_name, _show, _id) \
(&((struct perf_pmu_events_attr[]) { \
{ .attr = __ATTR(_name, 0444, _show, NULL), \
--
2.48.0
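The delegation pattern in the patch above can be tried outside the kernel.
What follows is only a userspace sketch: struct demo_events_attr is a
simplified stand-in for struct perf_pmu_events_attr (the embedded device
attribute is left out), every DEMO_/demo_ name is made up for illustration,
and demo_attr_visible() is a hypothetical visibility rule, not something
taken from the series.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for struct perf_pmu_events_attr (assumption). */
struct demo_events_attr {
	const char *name;
	unsigned long long id;
	const char *event_str;
};

/* General form: the caller supplies both the id and the string. */
#define DEMO_EVENT_ATTR_ID_STRING(_name, _var, _id, _str)	\
static struct demo_events_attr _var = {				\
	.name		= #_name,				\
	.id		= _id,					\
	.event_str	= _str,					\
};

/* Legacy form keeps its signature and forwards with id 0. */
#define DEMO_EVENT_ATTR_STRING(_name, _var, _str)		\
	DEMO_EVENT_ATTR_ID_STRING(_name, _var, 0, _str)

/* One attribute through each macro; names and values are illustrative. */
DEMO_EVENT_ATTR_STRING(energy.unit, attr_unit, "Joules")
DEMO_EVENT_ATTR_ID_STRING(gt-c6.unit, attr_c6_unit, 0x01, "ms")

/*
 * Hypothetical visibility rule keyed on the id: id 0 is always shown,
 * any other id is shown only if its bit is set in @supported.
 */
static int demo_attr_visible(const struct demo_events_attr *attr,
			     unsigned long long supported)
{
	assert(attr->id < 64);
	return attr->id == 0 || (supported & (1ULL << attr->id)) != 0;
}
```

Existing users of the three-argument macro keep compiling unchanged and
still get id 0, while new users can pass an id that a single visibility
callback can inspect instead of splitting attributes into separate groups.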
* Re: [PATCH v14 1/7] perf/core: Add PMU_EVENT_ATTR_ID_STRING
2025-01-22 6:23 ` [PATCH v14 1/7] perf/core: Add PMU_EVENT_ATTR_ID_STRING Lucas De Marchi
@ 2025-02-22 12:42 ` Ingo Molnar
0 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2025-02-22 12:42 UTC
To: Lucas De Marchi
Cc: intel-xe, Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro,
Peter Zijlstra, linux-perf-users
* Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> struct perf_pmu_events_attr has both id and event_str however zeroes
> the id and only set event_str. Add another macro that allows to set both
> so drivers can make use of them. The id is useful for determining the
> visibility of the attributes without resorting to creating separate
> groups passed via update_attr, while the event_str is still useful for
> attributes like *.unit or *.scale.
>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> include/linux/perf_event.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
Acked-by: Ingo Molnar <mingo@kernel.org>
Thanks,
Ingo
* [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 1/7] perf/core: Add PMU_EVENT_ATTR_ID_STRING Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 10:51 ` Riana Tauro
2025-01-22 6:23 ` [PATCH v14 3/7] drm/xe/pmu: Assert max gt Lucas De Marchi
` (4 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Basic PMU enabling patch. Set up the basic framework
for adding events/timers. This patch was previously
reviewed here:
https://patchwork.freedesktop.org/series/119504/
Based on previous versions by Bommu Krishnaiah, Aravind Iddamsetty and
Riana Tauro, using i915 and rapl as reference implementation.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_device.c | 3 +
drivers/gpu/drm/xe/xe_device_types.h | 4 +
drivers/gpu/drm/xe/xe_pmu.c | 299 +++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_pmu.h | 20 ++
drivers/gpu/drm/xe/xe_pmu_types.h | 43 ++++
6 files changed, 371 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 68861db5f27ce..aa0d981663e4c 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -305,6 +305,8 @@ endif
xe-$(CONFIG_DRM_XE_DP_TUNNEL) += \
i915-display/intel_dp_tunnel.o
+xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
+
obj-$(CONFIG_DRM_XE) += xe.o
obj-$(CONFIG_DRM_XE_KUNIT_TEST) += tests/
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index bd6191e1ed3e7..f3f754beb812b 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -49,6 +49,7 @@
#include "xe_pat.h"
#include "xe_pcode.h"
#include "xe_pm.h"
+#include "xe_pmu.h"
#include "xe_query.h"
#include "xe_sriov.h"
#include "xe_tile.h"
@@ -871,6 +872,8 @@ int xe_device_probe(struct xe_device *xe)
xe_oa_register(xe);
+ xe_pmu_register(&xe->pmu);
+
xe_debugfs_register(xe);
xe_hwmon_register(xe);
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 16ebb2859877f..58e79e19deaad 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -18,6 +18,7 @@
#include "xe_memirq_types.h"
#include "xe_oa_types.h"
#include "xe_platform_types.h"
+#include "xe_pmu_types.h"
#include "xe_pt_types.h"
#include "xe_sriov_types.h"
#include "xe_step_types.h"
@@ -514,6 +515,9 @@ struct xe_device {
int mode;
} wedged;
+ /** @pmu: performance monitoring unit */
+ struct xe_pmu pmu;
+
#ifdef TEST_VM_OPS_ERROR
/**
* @vm_inject_error_position: inject errors at different places in VM
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
new file mode 100644
index 0000000000000..c5641af6e9a91
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -0,0 +1,299 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <drm/drm_drv.h>
+#include <drm/drm_managed.h>
+#include <drm/xe_drm.h>
+
+#include "regs/xe_gt_regs.h"
+#include "xe_device.h"
+#include "xe_force_wake.h"
+#include "xe_gt_clock.h"
+#include "xe_gt_printk.h"
+#include "xe_mmio.h"
+#include "xe_macros.h"
+#include "xe_pm.h"
+#include "xe_pmu.h"
+
+/**
+ * DOC: Xe PMU (Performance Monitoring Unit)
+ *
+ * Expose events/counters like GT-C6 residency and GT frequency to user land.
+ * Events are per device. The GT can be selected with an extra config sub-field
+ * (bits 60-63).
+ *
+ * All events are listed in sysfs:
+ *
+ * $ ls -ld /sys/bus/event_source/devices/xe_*
+ * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/events/
+ * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/format/
+ *
+ * The format directory has info regarding the configs that can be used.
+ * The standard perf tool can be used to grep for a certain event as well.
+ * Example:
+ *
+ * $ perf list | grep gt-c6
+ *
+ * To sample a specific event for a GT at regular intervals:
+ *
+ * $ perf stat -e <event_name,gt=> -I <interval>
+ */
+
+#define XE_PMU_EVENT_GT_MASK GENMASK_ULL(63, 60)
+#define XE_PMU_EVENT_ID_MASK GENMASK_ULL(11, 0)
+
+static unsigned int config_to_event_id(u64 config)
+{
+ return FIELD_GET(XE_PMU_EVENT_ID_MASK, config);
+}
+
+static unsigned int config_to_gt_id(u64 config)
+{
+ return FIELD_GET(XE_PMU_EVENT_GT_MASK, config);
+}
+
+static struct xe_gt *event_to_gt(struct perf_event *event)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ u64 gt = config_to_gt_id(event->attr.config);
+
+ return xe_device_get_gt(xe, gt);
+}
+
+static bool event_supported(struct xe_pmu *pmu, unsigned int gt,
+ unsigned int id)
+{
+ if (gt >= XE_MAX_GT_PER_TILE)
+ return false;
+
+ return false;
+}
+
+static void xe_pmu_event_destroy(struct perf_event *event)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+
+ drm_WARN_ON(&xe->drm, event->parent);
+ drm_dev_put(&xe->drm);
+}
+
+static int xe_pmu_event_init(struct perf_event *event)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct xe_pmu *pmu = &xe->pmu;
+ unsigned int id, gt;
+
+ if (!pmu->registered)
+ return -ENODEV;
+
+ if (event->attr.type != event->pmu->type)
+ return -ENOENT;
+
+ /* unsupported modes and filters */
+ if (event->attr.sample_period) /* no sampling */
+ return -EINVAL;
+
+ if (event->cpu < 0)
+ return -EINVAL;
+
+ gt = config_to_gt_id(event->attr.config);
+ id = config_to_event_id(event->attr.config);
+ if (!event_supported(pmu, gt, id))
+ return -ENOENT;
+
+ if (has_branch_stack(event))
+ return -EOPNOTSUPP;
+
+ if (!event->parent) {
+ drm_dev_get(&xe->drm);
+ event->destroy = xe_pmu_event_destroy;
+ }
+
+ return 0;
+}
+
+static u64 __xe_pmu_event_read(struct perf_event *event)
+{
+ struct xe_gt *gt = event_to_gt(event);
+ u64 val = 0;
+
+ if (!gt)
+ return 0;
+
+ return val;
+}
+
+static void xe_pmu_event_read(struct perf_event *event)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct hw_perf_event *hwc = &event->hw;
+ struct xe_pmu *pmu = &xe->pmu;
+ u64 prev, new;
+
+ if (!pmu->registered) {
+ event->hw.state = PERF_HES_STOPPED;
+ return;
+ }
+
+ prev = local64_read(&hwc->prev_count);
+ do {
+ new = __xe_pmu_event_read(event);
+ } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));
+
+ local64_add(new - prev, &event->count);
+}
+
+static void xe_pmu_enable(struct perf_event *event)
+{
+ /*
+ * Store the current counter value so we can report the correct delta
+ * for all listeners. Even when the event was already enabled and has
+ * an existing non-zero value.
+ */
+ local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
+}
+
+static void xe_pmu_event_start(struct perf_event *event, int flags)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct xe_pmu *pmu = &xe->pmu;
+
+ if (!pmu->registered)
+ return;
+
+ xe_pmu_enable(event);
+ event->hw.state = 0;
+}
+
+static void xe_pmu_event_stop(struct perf_event *event, int flags)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct xe_pmu *pmu = &xe->pmu;
+
+ if (pmu->registered)
+ if (flags & PERF_EF_UPDATE)
+ xe_pmu_event_read(event);
+
+ event->hw.state = PERF_HES_STOPPED;
+}
+
+static int xe_pmu_event_add(struct perf_event *event, int flags)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct xe_pmu *pmu = &xe->pmu;
+
+ if (!pmu->registered)
+ return -ENODEV;
+
+ if (flags & PERF_EF_START)
+ xe_pmu_event_start(event, flags);
+
+ return 0;
+}
+
+static void xe_pmu_event_del(struct perf_event *event, int flags)
+{
+ xe_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+PMU_FORMAT_ATTR(gt, "config:60-63");
+PMU_FORMAT_ATTR(event, "config:0-11");
+
+static struct attribute *pmu_format_attrs[] = {
+ &format_attr_event.attr,
+ &format_attr_gt.attr,
+ NULL,
+};
+
+static const struct attribute_group pmu_format_attr_group = {
+ .name = "format",
+ .attrs = pmu_format_attrs,
+};
+
+static struct attribute *pmu_event_attrs[] = {
+ /* No events yet */
+ NULL,
+};
+
+static const struct attribute_group pmu_events_attr_group = {
+ .name = "events",
+ .attrs = pmu_event_attrs,
+};
+
+/**
+ * xe_pmu_unregister() - Remove/cleanup PMU registration
+ * @arg: Ptr to pmu
+ */
+static void xe_pmu_unregister(void *arg)
+{
+ struct xe_pmu *pmu = arg;
+ struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+
+ if (!pmu->registered)
+ return;
+
+ pmu->registered = false;
+
+ perf_pmu_unregister(&pmu->base);
+ kfree(pmu->name);
+}
+
+/**
+ * xe_pmu_register() - Define basic PMU properties for Xe and add event callbacks.
+ * @pmu: the PMU object
+ *
+ * Returns 0 on success and an appropriate error code otherwise
+ */
+int xe_pmu_register(struct xe_pmu *pmu)
+{
+ struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+ static const struct attribute_group *attr_groups[] = {
+ &pmu_format_attr_group,
+ &pmu_events_attr_group,
+ NULL
+ };
+ int ret = -ENOMEM;
+ char *name;
+
+ if (IS_SRIOV_VF(xe))
+ return 0;
+
+ raw_spin_lock_init(&pmu->lock);
+
+ name = kasprintf(GFP_KERNEL, "xe_%s",
+ dev_name(xe->drm.dev));
+ if (!name)
+ goto err;
+
+ /* tools/perf reserves colons as special. */
+ strreplace(name, ':', '_');
+
+ pmu->name = name;
+ pmu->base.attr_groups = attr_groups;
+ pmu->base.scope = PERF_PMU_SCOPE_SYS_WIDE;
+ pmu->base.module = THIS_MODULE;
+ pmu->base.task_ctx_nr = perf_invalid_context;
+ pmu->base.event_init = xe_pmu_event_init;
+ pmu->base.add = xe_pmu_event_add;
+ pmu->base.del = xe_pmu_event_del;
+ pmu->base.start = xe_pmu_event_start;
+ pmu->base.stop = xe_pmu_event_stop;
+ pmu->base.read = xe_pmu_event_read;
+
+ ret = perf_pmu_register(&pmu->base, pmu->name, -1);
+ if (ret)
+ goto err_name;
+
+ pmu->registered = true;
+
+ return devm_add_action_or_reset(xe->drm.dev, xe_pmu_unregister, pmu);
+
+err_name:
+ kfree(name);
+err:
+ drm_err(&xe->drm, "Failed to register PMU (ret=%d)!\n", ret);
+
+ return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
new file mode 100644
index 0000000000000..f9dfe77d00cb6
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PMU_H_
+#define _XE_PMU_H_
+
+#include "xe_pmu_types.h"
+
+struct xe_gt;
+
+#if IS_ENABLED(CONFIG_PERF_EVENTS)
+int xe_pmu_register(struct xe_pmu *pmu);
+#else
+static inline int xe_pmu_register(struct xe_pmu *pmu) { return 0; }
+#endif
+
+#endif
+
diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
new file mode 100644
index 0000000000000..e0cf7169f4fda
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pmu_types.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PMU_TYPES_H_
+#define _XE_PMU_TYPES_H_
+
+#include <linux/perf_event.h>
+#include <linux/spinlock_types.h>
+
+enum {
+ __XE_NUM_PMU_SAMPLERS
+};
+
+#define XE_PMU_MAX_GT 2
+
+/**
+ * struct xe_pmu - PMU related data per Xe device
+ *
+ * Stores per device PMU info that includes event/perf attributes and
+ * sampling counters across all GTs for this device.
+ */
+struct xe_pmu {
+ /**
+ * @base: PMU base.
+ */
+ struct pmu base;
+ /**
+ * @registered: PMU is registered and not in the unregistering process.
+ */
+ bool registered;
+ /**
+ * @name: Name as registered with perf core.
+ */
+ const char *name;
+ /**
+ * @lock: Lock protecting enable mask and ref count handling.
+ */
+ raw_spinlock_t lock;
+};
+
+#endif
--
2.48.0
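Two mechanisms in the patch above are easy to exercise in isolation: the
split of the perf config value into an event id (bits 0-11) and a GT index
(bits 60-63), and the read path that publishes the newest raw value with a
compare-and-exchange loop before adding the delta to the event count. The
following is a userspace sketch only. The DEMO_ macros are stand-ins for
GENMASK_ULL() and FIELD_GET(), C11 atomics replace the kernel's local64_t
helpers, and demo_hw_read() fakes a hardware counter; all of these names
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's GENMASK_ULL() and FIELD_GET(). */
#define DEMO_GENMASK_ULL(h, l)	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define DEMO_LOW_BIT(mask)	((mask) & (~(mask) + 1))
#define DEMO_FIELD_GET(mask, val) (((val) & (mask)) / DEMO_LOW_BIT(mask))

#define DEMO_EVENT_GT_MASK	DEMO_GENMASK_ULL(63, 60) /* "config:60-63" */
#define DEMO_EVENT_ID_MASK	DEMO_GENMASK_ULL(11, 0)  /* "config:0-11" */

static unsigned int demo_config_to_gt_id(uint64_t config)
{
	return (unsigned int)DEMO_FIELD_GET(DEMO_EVENT_GT_MASK, config);
}

static unsigned int demo_config_to_event_id(uint64_t config)
{
	return (unsigned int)DEMO_FIELD_GET(DEMO_EVENT_ID_MASK, config);
}

/* Hypothetical helper: pack a GT index and event id into one config. */
static uint64_t demo_make_config(unsigned int gt, unsigned int id)
{
	assert(gt < 16 && id < 4096);
	return ((uint64_t)gt << 60) | (uint64_t)id;
}

struct demo_event {
	_Atomic uint64_t prev_count;	/* last raw value seen */
	_Atomic uint64_t count;		/* accumulated delta */
};

/* Fake hardware counter; a real driver would read a register here. */
static uint64_t demo_hw_value;
static uint64_t demo_hw_read(void) { return demo_hw_value; }

/* On start, snapshot the raw value so later reads report only deltas. */
static void demo_event_start(struct demo_event *ev)
{
	atomic_store(&ev->prev_count, demo_hw_read());
}

/* Publish the newest raw value, then add (now - prev) to the count. */
static void demo_event_read(struct demo_event *ev)
{
	uint64_t prev = atomic_load(&ev->prev_count);
	uint64_t now;

	do {
		now = demo_hw_read();
		/* On failure, prev is refreshed with the stored value. */
	} while (!atomic_compare_exchange_weak(&ev->prev_count, &prev, now));

	atomic_fetch_add(&ev->count, now - prev);
}

/* Usage: start at one raw value, read at another, return the count. */
static uint64_t demo_count_between(uint64_t at_start, uint64_t at_read)
{
	struct demo_event ev;

	atomic_init(&ev.prev_count, 0);
	atomic_init(&ev.count, 0);
	demo_hw_value = at_start;
	demo_event_start(&ev);
	demo_hw_value = at_read;
	demo_event_read(&ev);
	return atomic_load(&ev.count);
}
```

Taking the snapshot at start time is what keeps a counter that already
holds a non-zero raw value from being reported as activity in the first
interval; only the difference between successive reads is accumulated.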
* Re: [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface
2025-01-22 6:23 ` [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface Lucas De Marchi
@ 2025-01-22 10:51 ` Riana Tauro
2025-01-22 15:31 ` Lucas De Marchi
0 siblings, 1 reply; 15+ messages in thread
From: Riana Tauro @ 2025-01-22 10:51 UTC
To: Lucas De Marchi, intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Peter Zijlstra, linux-perf-users
Hi Lucas
On 1/22/2025 11:53 AM, Lucas De Marchi wrote:
> From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>
> Basic PMU enabling patch. Setup the basic framework
> for adding events/timers. This patch was previously
> reviewed here -
> https://patchwork.freedesktop.org/series/119504/
>
> Based on previous versions by Bommu Krishnaiah, Aravind Iddamsetty and
> Riana Tauro, using i915 and rapl as reference implementation.
>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 2 +
> drivers/gpu/drm/xe/xe_device.c | 3 +
> drivers/gpu/drm/xe/xe_device_types.h | 4 +
> drivers/gpu/drm/xe/xe_pmu.c | 299 +++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_pmu.h | 20 ++
> drivers/gpu/drm/xe/xe_pmu_types.h | 43 ++++
> 6 files changed, 371 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
> create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
> create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 68861db5f27ce..aa0d981663e4c 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -305,6 +305,8 @@ endif
> xe-$(CONFIG_DRM_XE_DP_TUNNEL) += \
> i915-display/intel_dp_tunnel.o
>
> +xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
> +
> obj-$(CONFIG_DRM_XE) += xe.o
> obj-$(CONFIG_DRM_XE_KUNIT_TEST) += tests/
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index bd6191e1ed3e7..f3f754beb812b 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -49,6 +49,7 @@
> #include "xe_pat.h"
> #include "xe_pcode.h"
> #include "xe_pm.h"
> +#include "xe_pmu.h"
> #include "xe_query.h"
> #include "xe_sriov.h"
> #include "xe_tile.h"
> @@ -871,6 +872,8 @@ int xe_device_probe(struct xe_device *xe)
>
> xe_oa_register(xe);
>
> + xe_pmu_register(&xe->pmu);
> +
> xe_debugfs_register(xe);
>
> xe_hwmon_register(xe);
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 16ebb2859877f..58e79e19deaad 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -18,6 +18,7 @@
> #include "xe_memirq_types.h"
> #include "xe_oa_types.h"
> #include "xe_platform_types.h"
> +#include "xe_pmu_types.h"
> #include "xe_pt_types.h"
> #include "xe_sriov_types.h"
> #include "xe_step_types.h"
> @@ -514,6 +515,9 @@ struct xe_device {
> int mode;
> } wedged;
>
> + /** @pmu: performance monitoring unit */
> + struct xe_pmu pmu;
> +
> #ifdef TEST_VM_OPS_ERROR
> /**
> * @vm_inject_error_position: inject errors at different places in VM
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> new file mode 100644
> index 0000000000000..c5641af6e9a91
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -0,0 +1,299 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
Some of these headers are unused
> +#include <drm/drm_drv.h>
> +#include <drm/drm_managed.h>
> +#include <drm/xe_drm.h>
above 2 unused
> +
> +#include "regs/xe_gt_regs.h"
unused
> +#include "xe_device.h"
> +#include "xe_force_wake.h"
> +#include "xe_gt_clock.h"
> +#include "xe_gt_printk.h"
> +#include "xe_mmio.h"
> +#include "xe_macros.h"
unused
> +#include "xe_pm.h"
> +#include "xe_pmu.h"
> +
> +/**
> + * DOC: Xe PMU (Performance Monitoring Unit)
> + *
> + * Expose events/counters like GT-C6 residency and GT frequency to user land.
> + * Events are per device. The GT can be selected with an extra config sub-field
> + * (bits 60-63).
> + *
> + * All events are listed in sysfs:
> + *
> + * $ ls -ld /sys/bus/event_source/devices/xe_*
> + * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/events/
> + * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/format/
> + *
> + * The format directory has info regarding the configs that can be used.
> + * The standard perf tool can be used to grep for a certain event as well.
> + * Example:
> + *
> + * $ perf list | grep gt-c6
> + *
> + * To sample a specific event for a GT at regular intervals:
> + *
> + * $ perf stat -e <event_name,gt=> -I <interval>
> + */
> +
> +#define XE_PMU_EVENT_GT_MASK GENMASK_ULL(63, 60)
> +#define XE_PMU_EVENT_ID_MASK GENMASK_ULL(11, 0)
> +
> +static unsigned int config_to_event_id(u64 config)
> +{
> + return FIELD_GET(XE_PMU_EVENT_ID_MASK, config);
> +}
> +
> +static unsigned int config_to_gt_id(u64 config)
> +{
> + return FIELD_GET(XE_PMU_EVENT_GT_MASK, config);
> +}
> +
> +static struct xe_gt *event_to_gt(struct perf_event *event)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + u64 gt = config_to_gt_id(event->attr.config);
> +
> + return xe_device_get_gt(xe, gt);
> +}
> +
> +static bool event_supported(struct xe_pmu *pmu, unsigned int gt,
> + unsigned int id)
> +{
> + if (gt >= XE_MAX_GT_PER_TILE)
> + return false;
> +
> + return false;
> +}
> +
> +static void xe_pmu_event_destroy(struct perf_event *event)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> +
> + drm_WARN_ON(&xe->drm, event->parent);
> + drm_dev_put(&xe->drm);
> +}
> +
> +static int xe_pmu_event_init(struct perf_event *event)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + struct xe_pmu *pmu = &xe->pmu;
> + unsigned int id, gt;
> +
> + if (!pmu->registered)
> + return -ENODEV;
> +
> + if (event->attr.type != event->pmu->type)
> + return -ENOENT;
> +
> + /* unsupported modes and filters */
> + if (event->attr.sample_period) /* no sampling */
> + return -EINVAL;
> +
> + if (event->cpu < 0)
> + return -EINVAL;
> +
> + gt = config_to_gt_id(event->attr.config);
> + id = config_to_event_id(event->attr.config);
> + if (!event_supported(pmu, gt, id))
> + return -ENOENT;
> +
> + if (has_branch_stack(event))
> + return -EOPNOTSUPP;
> +
> + if (!event->parent) {
> + drm_dev_get(&xe->drm);
> + event->destroy = xe_pmu_event_destroy;
> + }
> +
> + return 0;
> +}
> +
> +static u64 __xe_pmu_event_read(struct perf_event *event)
> +{
> + struct xe_gt *gt = event_to_gt(event);
> + u64 val = 0;
> +
> + if (!gt)
> + return 0;
> +
> + return val;
> +}
> +
> +static void xe_pmu_event_read(struct perf_event *event)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + struct hw_perf_event *hwc = &event->hw;
> + struct xe_pmu *pmu = &xe->pmu;
> + u64 prev, new;
> +
> + if (!pmu->registered) {
> + event->hw.state = PERF_HES_STOPPED;
> + return;
> + }
> +
> + prev = local64_read(&hwc->prev_count);
> + do {
> + new = __xe_pmu_event_read(event);
> + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));
> +
> + local64_add(new - prev, &event->count);
> +}
> +
> +static void xe_pmu_enable(struct perf_event *event)
> +{
> + /*
> + * Store the current counter value so we can report the correct delta
> + * for all listeners. Even when the event was already enabled and has
> + * an existing non-zero value.
> + */
> + local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
> +}
> +
> +static void xe_pmu_event_start(struct perf_event *event, int flags)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + struct xe_pmu *pmu = &xe->pmu;
> +
> + if (!pmu->registered)
> + return;
> +
> + xe_pmu_enable(event);
> + event->hw.state = 0;
> +}
> +
> +static void xe_pmu_event_stop(struct perf_event *event, int flags)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + struct xe_pmu *pmu = &xe->pmu;
> +
> + if (pmu->registered)
> + if (flags & PERF_EF_UPDATE)
> + xe_pmu_event_read(event);
> +
> + event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int xe_pmu_event_add(struct perf_event *event, int flags)
> +{
> + struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
> + struct xe_pmu *pmu = &xe->pmu;
> +
> + if (!pmu->registered)
> + return -ENODEV;
> +
> + if (flags & PERF_EF_START)
> + xe_pmu_event_start(event, flags);
> +
> + return 0;
> +}
> +
> +static void xe_pmu_event_del(struct perf_event *event, int flags)
> +{
> + xe_pmu_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +PMU_FORMAT_ATTR(gt, "config:60-63");
> +PMU_FORMAT_ATTR(event, "config:0-11");
> +
> +static struct attribute *pmu_format_attrs[] = {
> + &format_attr_event.attr,
> + &format_attr_gt.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group pmu_format_attr_group = {
> + .name = "format",
> + .attrs = pmu_format_attrs,
> +};
> +
> +static struct attribute *pmu_event_attrs[] = {
> + /* No events yet */
> + NULL,
> +};
> +
> +static const struct attribute_group pmu_events_attr_group = {
> + .name = "events",
> + .attrs = pmu_event_attrs,
> +};
> +
> +/**
> + * xe_pmu_unregister() - Remove/cleanup PMU registration
> + * @arg: Ptr to pmu
> + */
> +static void xe_pmu_unregister(void *arg)
> +{
> + struct xe_pmu *pmu = arg;
> + struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +
> + if (!pmu->registered)
> + return;
> +
> + pmu->registered = false;
> +
> + perf_pmu_unregister(&pmu->base);
> + kfree(pmu->name);
> +}
> +
> +/**
> + * xe_pmu_register() - Define basic PMU properties for Xe and add event callbacks.
> + * @pmu: the PMU object
> + *
> + * Returns 0 on success and an appropriate error code otherwise
> + */
> +int xe_pmu_register(struct xe_pmu *pmu)
> +{
> + struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> + static const struct attribute_group *attr_groups[] = {
> + &pmu_format_attr_group,
> + &pmu_events_attr_group,
> + NULL
> + };
> + int ret = -ENOMEM;
> + char *name;
> +
> + if (IS_SRIOV_VF(xe))
> + return 0;
> +
> + raw_spin_lock_init(&pmu->lock);
> +
> + name = kasprintf(GFP_KERNEL, "xe_%s",
> + dev_name(xe->drm.dev));
> + if (!name)
> + goto err;
> +
> + /* tools/perf reserves colons as special. */
> + strreplace(name, ':', '_');
> +
> + pmu->name = name;
> + pmu->base.attr_groups = attr_groups;
> + pmu->base.scope = PERF_PMU_SCOPE_SYS_WIDE;
> + pmu->base.module = THIS_MODULE;
> + pmu->base.task_ctx_nr = perf_invalid_context;
> + pmu->base.event_init = xe_pmu_event_init;
> + pmu->base.add = xe_pmu_event_add;
> + pmu->base.del = xe_pmu_event_del;
> + pmu->base.start = xe_pmu_event_start;
> + pmu->base.stop = xe_pmu_event_stop;
> + pmu->base.read = xe_pmu_event_read;
> +
> + ret = perf_pmu_register(&pmu->base, pmu->name, -1);
> + if (ret)
> + goto err_name;
> +
> + pmu->registered = true;
> +
> + return devm_add_action_or_reset(xe->drm.dev, xe_pmu_unregister, pmu);
> +
> +err_name:
> + kfree(name);
> +err:
> + drm_err(&xe->drm, "Failed to register PMU (ret=%d)!\n", ret);
> +
> + return ret;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
> new file mode 100644
> index 0000000000000..f9dfe77d00cb6
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_H_
> +#define _XE_PMU_H_
> +
> +#include "xe_pmu_types.h"
> +
> +struct xe_gt;
unused
> +
> +#if IS_ENABLED(CONFIG_PERF_EVENTS)
> +int xe_pmu_register(struct xe_pmu *pmu);
> +#else
> +static inline void xe_pmu_register(struct xe_pmu *pmu) {}
> +#endif
> +
> +#endif
> +
> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
> new file mode 100644
> index 0000000000000..e0cf7169f4fda
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PMU_TYPES_H_
> +#define _XE_PMU_TYPES_H_
> +
> +#include <linux/perf_event.h>
> +#include <linux/spinlock_types.h>
> +
> +enum {
> + __XE_NUM_PMU_SAMPLERS
> +};
This enum is not used and can be removed.
Thanks,
Riana
> +
> +#define XE_PMU_MAX_GT 2
> +
> +/**
> + * struct xe_pmu - PMU related data per Xe device
> + *
> + * Stores per device PMU info that includes event/perf attributes and
> + * sampling counters across all GTs for this device.
> + */
> +struct xe_pmu {
> + /**
> + * @base: PMU base.
> + */
> + struct pmu base;
> + /**
> + * @registered: PMU is registered and not in the unregistering process.
> + */
> + bool registered;
> + /**
> + * @name: Name as registered with perf core.
> + */
> + const char *name;
> + /**
> + * @lock: Lock protecting enable mask and ref count handling.
> + */
> + raw_spinlock_t lock;
> +};
> +
> +#endif
* Re: [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface
2025-01-22 10:51 ` Riana Tauro
@ 2025-01-22 15:31 ` Lucas De Marchi
0 siblings, 0 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 15:31 UTC
To: Riana Tauro
Cc: intel-xe, Rodrigo Vivi, Vinay Belgaumkar, Peter Zijlstra,
linux-perf-users
On Wed, Jan 22, 2025 at 04:21:55PM +0530, Riana Tauro wrote:
>Hi Lucas
>
>On 1/22/2025 11:53 AM, Lucas De Marchi wrote:
>>From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>>
>>Basic PMU enabling patch. Setup the basic framework
>>for adding events/timers. This patch was previously
>>reviewed here -
>>https://patchwork.freedesktop.org/series/119504/
>>
>>Based on previous versions by Bommu Krishnaiah, Aravind Iddamsetty and
>>Riana Tauro, using i915 and rapl as reference implementation.
>>
>>Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>> drivers/gpu/drm/xe/Makefile | 2 +
>> drivers/gpu/drm/xe/xe_device.c | 3 +
>> drivers/gpu/drm/xe/xe_device_types.h | 4 +
>> drivers/gpu/drm/xe/xe_pmu.c | 299 +++++++++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_pmu.h | 20 ++
>> drivers/gpu/drm/xe/xe_pmu_types.h | 43 ++++
>> 6 files changed, 371 insertions(+)
>> create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
>> create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
>> create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h
>>
>>diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>>index 68861db5f27ce..aa0d981663e4c 100644
>>--- a/drivers/gpu/drm/xe/Makefile
>>+++ b/drivers/gpu/drm/xe/Makefile
>>@@ -305,6 +305,8 @@ endif
>> xe-$(CONFIG_DRM_XE_DP_TUNNEL) += \
>> i915-display/intel_dp_tunnel.o
>>+xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>>+
>> obj-$(CONFIG_DRM_XE) += xe.o
>> obj-$(CONFIG_DRM_XE_KUNIT_TEST) += tests/
>>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>index bd6191e1ed3e7..f3f754beb812b 100644
>>--- a/drivers/gpu/drm/xe/xe_device.c
>>+++ b/drivers/gpu/drm/xe/xe_device.c
>>@@ -49,6 +49,7 @@
>> #include "xe_pat.h"
>> #include "xe_pcode.h"
>> #include "xe_pm.h"
>>+#include "xe_pmu.h"
>> #include "xe_query.h"
>> #include "xe_sriov.h"
>> #include "xe_tile.h"
>>@@ -871,6 +872,8 @@ int xe_device_probe(struct xe_device *xe)
>> xe_oa_register(xe);
>>+ xe_pmu_register(&xe->pmu);
>>+
>> xe_debugfs_register(xe);
>> xe_hwmon_register(xe);
>>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>index 16ebb2859877f..58e79e19deaad 100644
>>--- a/drivers/gpu/drm/xe/xe_device_types.h
>>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>>@@ -18,6 +18,7 @@
>> #include "xe_memirq_types.h"
>> #include "xe_oa_types.h"
>> #include "xe_platform_types.h"
>>+#include "xe_pmu_types.h"
>> #include "xe_pt_types.h"
>> #include "xe_sriov_types.h"
>> #include "xe_step_types.h"
>>@@ -514,6 +515,9 @@ struct xe_device {
>> int mode;
>> } wedged;
>>+ /** @pmu: performance monitoring unit */
>>+ struct xe_pmu pmu;
>>+
>> #ifdef TEST_VM_OPS_ERROR
>> /**
>> * @vm_inject_error_position: inject errors at different places in VM
>>diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>>new file mode 100644
>>index 0000000000000..c5641af6e9a91
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_pmu.c
>>@@ -0,0 +1,299 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>
>Some of these headers are unused
>>+#include <drm/drm_drv.h>
>>+#include <drm/drm_managed.h>
>>+#include <drm/xe_drm.h>
>above 2 unused
>>+
>>+#include "regs/xe_gt_regs.h"
>unused
>>+#include "xe_device.h"
>
>>+#include "xe_force_wake.h"
>>+#include "xe_gt_clock.h"
>>+#include "xe_gt_printk.h"
>>+#include "xe_mmio.h"
>>+#include "xe_macros.h"
>unused
>>+#include "xe_pm.h"
>>+#include "xe_pmu.h"
>>+
>>+/**
>>+ * DOC: Xe PMU (Performance Monitoring Unit)
>>+ *
>>+ * Expose events/counters like GT-C6 residency and GT frequency to user land.
>>+ * Events are per device. The GT can be selected with an extra config sub-field
>>+ * (bits 60-63).
>>+ *
>>+ * All events are listed in sysfs:
>>+ *
>>+ * $ ls -ld /sys/bus/event_source/devices/xe_*
>>+ * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/events/
>>+ * $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/format/
>>+ *
>>+ * The format directory has info regarding the configs that can be used.
>>+ * The standard perf tool can be used to grep for a certain event as well.
>>+ * Example:
>>+ *
>>+ * $ perf list | grep gt-c6
>>+ *
>>+ * To sample a specific event for a GT at regular intervals:
>>+ *
>>+ * $ perf stat -e <event_name,gt=> -I <interval>
>>+ */
>>+
>>+#define XE_PMU_EVENT_GT_MASK GENMASK_ULL(63, 60)
>>+#define XE_PMU_EVENT_ID_MASK GENMASK_ULL(11, 0)
>>+
>>+static unsigned int config_to_event_id(u64 config)
>>+{
>>+ return FIELD_GET(XE_PMU_EVENT_ID_MASK, config);
>>+}
>>+
>>+static unsigned int config_to_gt_id(u64 config)
>>+{
>>+ return FIELD_GET(XE_PMU_EVENT_GT_MASK, config);
>>+}
>>+
>>+static struct xe_gt *event_to_gt(struct perf_event *event)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ u64 gt = config_to_gt_id(event->attr.config);
>>+
>>+ return xe_device_get_gt(xe, gt);
>>+}
>>+
>>+static bool event_supported(struct xe_pmu *pmu, unsigned int gt,
>>+ unsigned int id)
>>+{
>>+ if (gt >= XE_MAX_GT_PER_TILE)
>>+ return false;
>>+
>>+ return false;
>>+}
>>+
>>+static void xe_pmu_event_destroy(struct perf_event *event)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+
>>+ drm_WARN_ON(&xe->drm, event->parent);
>>+ drm_dev_put(&xe->drm);
>>+}
>>+
>>+static int xe_pmu_event_init(struct perf_event *event)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ struct xe_pmu *pmu = &xe->pmu;
>>+ unsigned int id, gt;
>>+
>>+ if (!pmu->registered)
>>+ return -ENODEV;
>>+
>>+ if (event->attr.type != event->pmu->type)
>>+ return -ENOENT;
>>+
>>+ /* unsupported modes and filters */
>>+ if (event->attr.sample_period) /* no sampling */
>>+ return -EINVAL;
>>+
>>+ if (event->cpu < 0)
>>+ return -EINVAL;
>>+
>>+ gt = config_to_gt_id(event->attr.config);
>>+ id = config_to_event_id(event->attr.config);
>>+ if (!event_supported(pmu, gt, id))
>>+ return -ENOENT;
>>+
>>+ if (has_branch_stack(event))
>>+ return -EOPNOTSUPP;
>>+
>>+ if (!event->parent) {
>>+ drm_dev_get(&xe->drm);
>>+ event->destroy = xe_pmu_event_destroy;
>>+ }
>>+
>>+ return 0;
>>+}
>>+
>>+static u64 __xe_pmu_event_read(struct perf_event *event)
>>+{
>>+ struct xe_gt *gt = event_to_gt(event);
>>+ u64 val = 0;
>>+
>>+ if (!gt)
>>+ return 0;
>>+
>>+ return val;
>>+}
>>+
>>+static void xe_pmu_event_read(struct perf_event *event)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ struct hw_perf_event *hwc = &event->hw;
>>+ struct xe_pmu *pmu = &xe->pmu;
>>+ u64 prev, new;
>>+
>>+ if (!pmu->registered) {
>>+ event->hw.state = PERF_HES_STOPPED;
>>+ return;
>>+ }
>>+
>>+ prev = local64_read(&hwc->prev_count);
>>+ do {
>>+ new = __xe_pmu_event_read(event);
>>+ } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));
>>+
>>+ local64_add(new - prev, &event->count);
>>+}
>>+
>>+static void xe_pmu_enable(struct perf_event *event)
>>+{
>>+ /*
>>+ * Store the current counter value so we can report the correct delta
>>+ * for all listeners. Even when the event was already enabled and has
>>+ * an existing non-zero value.
>>+ */
>>+ local64_set(&event->hw.prev_count, __xe_pmu_event_read(event));
>>+}
>>+
>>+static void xe_pmu_event_start(struct perf_event *event, int flags)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ struct xe_pmu *pmu = &xe->pmu;
>>+
>>+ if (!pmu->registered)
>>+ return;
>>+
>>+ xe_pmu_enable(event);
>>+ event->hw.state = 0;
>>+}
>>+
>>+static void xe_pmu_event_stop(struct perf_event *event, int flags)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ struct xe_pmu *pmu = &xe->pmu;
>>+
>>+ if (pmu->registered)
>>+ if (flags & PERF_EF_UPDATE)
>>+ xe_pmu_event_read(event);
>>+
>>+ event->hw.state = PERF_HES_STOPPED;
>>+}
>>+
>>+static int xe_pmu_event_add(struct perf_event *event, int flags)
>>+{
>>+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>+ struct xe_pmu *pmu = &xe->pmu;
>>+
>>+ if (!pmu->registered)
>>+ return -ENODEV;
>>+
>>+ if (flags & PERF_EF_START)
>>+ xe_pmu_event_start(event, flags);
>>+
>>+ return 0;
>>+}
>>+
>>+static void xe_pmu_event_del(struct perf_event *event, int flags)
>>+{
>>+ xe_pmu_event_stop(event, PERF_EF_UPDATE);
>>+}
>>+
>>+PMU_FORMAT_ATTR(gt, "config:60-63");
>>+PMU_FORMAT_ATTR(event, "config:0-11");
>>+
>>+static struct attribute *pmu_format_attrs[] = {
>>+ &format_attr_event.attr,
>>+ &format_attr_gt.attr,
>>+ NULL,
>>+};
>>+
>>+static const struct attribute_group pmu_format_attr_group = {
>>+ .name = "format",
>>+ .attrs = pmu_format_attrs,
>>+};
>>+
>>+static struct attribute *pmu_event_attrs[] = {
>>+ /* No events yet */
>>+ NULL,
>>+};
>>+
>>+static const struct attribute_group pmu_events_attr_group = {
>>+ .name = "events",
>>+ .attrs = pmu_event_attrs,
>>+};
>>+
>>+/**
>>+ * xe_pmu_unregister() - Remove/cleanup PMU registration
>>+ * @arg: Ptr to pmu
>>+ */
>>+static void xe_pmu_unregister(void *arg)
>>+{
>>+ struct xe_pmu *pmu = arg;
>>+ struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>+
>>+ if (!pmu->registered)
>>+ return;
>>+
>>+ pmu->registered = false;
>>+
>>+ perf_pmu_unregister(&pmu->base);
>>+ kfree(pmu->name);
>>+}
>>+
>>+/**
>>+ * xe_pmu_register() - Define basic PMU properties for Xe and add event callbacks.
>>+ * @pmu: the PMU object
>>+ *
>>+ * Returns 0 on success and an appropriate error code otherwise
>>+ */
>>+int xe_pmu_register(struct xe_pmu *pmu)
>>+{
>>+ struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>>+ static const struct attribute_group *attr_groups[] = {
>>+ &pmu_format_attr_group,
>>+ &pmu_events_attr_group,
>>+ NULL
>>+ };
>>+ int ret = -ENOMEM;
>>+ char *name;
>>+
>>+ if (IS_SRIOV_VF(xe))
>>+ return 0;
>>+
>>+ raw_spin_lock_init(&pmu->lock);
>>+
>>+ name = kasprintf(GFP_KERNEL, "xe_%s",
>>+ dev_name(xe->drm.dev));
>>+ if (!name)
>>+ goto err;
>>+
>>+ /* tools/perf reserves colons as special. */
>>+ strreplace(name, ':', '_');
>>+
>>+ pmu->name = name;
>>+ pmu->base.attr_groups = attr_groups;
>>+ pmu->base.scope = PERF_PMU_SCOPE_SYS_WIDE;
>>+ pmu->base.module = THIS_MODULE;
>>+ pmu->base.task_ctx_nr = perf_invalid_context;
>>+ pmu->base.event_init = xe_pmu_event_init;
>>+ pmu->base.add = xe_pmu_event_add;
>>+ pmu->base.del = xe_pmu_event_del;
>>+ pmu->base.start = xe_pmu_event_start;
>>+ pmu->base.stop = xe_pmu_event_stop;
>>+ pmu->base.read = xe_pmu_event_read;
>>+
>>+ ret = perf_pmu_register(&pmu->base, pmu->name, -1);
>>+ if (ret)
>>+ goto err_name;
>>+
>>+ pmu->registered = true;
>>+
>>+ return devm_add_action_or_reset(xe->drm.dev, xe_pmu_unregister, pmu);
>>+
>>+err_name:
>>+ kfree(name);
>>+err:
>>+ drm_err(&xe->drm, "Failed to register PMU (ret=%d)!\n", ret);
>>+
>>+ return ret;
>>+}
>>diff --git a/drivers/gpu/drm/xe/xe_pmu.h b/drivers/gpu/drm/xe/xe_pmu.h
>>new file mode 100644
>>index 0000000000000..f9dfe77d00cb6
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_pmu.h
>>@@ -0,0 +1,20 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>>+#ifndef _XE_PMU_H_
>>+#define _XE_PMU_H_
>>+
>>+#include "xe_pmu_types.h"
>>+
>>+struct xe_gt;
>unused
>>+
>>+#if IS_ENABLED(CONFIG_PERF_EVENTS)
>>+int xe_pmu_register(struct xe_pmu *pmu);
>>+#else
>>+static inline void xe_pmu_register(struct xe_pmu *pmu) {}
this also missed the update to return int.
>>+#endif
>>+
>>+#endif
>>+
>>diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
>>new file mode 100644
>>index 0000000000000..e0cf7169f4fda
>>--- /dev/null
>>+++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>>@@ -0,0 +1,43 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2025 Intel Corporation
>>+ */
>>+
>>+#ifndef _XE_PMU_TYPES_H_
>>+#define _XE_PMU_TYPES_H_
>>+
>>+#include <linux/perf_event.h>
>>+#include <linux/spinlock_types.h>
>>+
>>+enum {
>>+ __XE_NUM_PMU_SAMPLERS
>>+};
>This enum is not used. can be removed
Thanks. After all the cleanups and the re-split of the patches, some of the
headers/structs don't make sense anymore in this initial patch. I
cleaned them up and have things lined up for the next version.
thanks
Lucas De Marchi
>
>Thanks,
>Riana
>>+
>>+#define XE_PMU_MAX_GT 2
>>+
>>+/**
>>+ * struct xe_pmu - PMU related data per Xe device
>>+ *
>>+ * Stores per device PMU info that includes event/perf attributes and
>>+ * sampling counters across all GTs for this device.
>>+ */
>>+struct xe_pmu {
>>+ /**
>>+ * @base: PMU base.
>>+ */
>>+ struct pmu base;
>>+ /**
>>+ * @registered: PMU is registered and not in the unregistering process.
>>+ */
>>+ bool registered;
>>+ /**
>>+ * @name: Name as registered with perf core.
>>+ */
>>+ const char *name;
>>+ /**
>>+ * @lock: Lock protecting enable mask and ref count handling.
>>+ */
>>+ raw_spinlock_t lock;
>>+};
>>+
>>+#endif
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v14 3/7] drm/xe/pmu: Assert max gt
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 1/7] perf/core: Add PMU_EVENT_ATTR_ID_STRING Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 2/7] drm/xe/pmu: Enable PMU interface Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 4/7] drm/xe/pmu: Extract xe_pmu_event_update() Lucas De Marchi
` (3 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC (permalink / raw)
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
XE_PMU_MAX_GT needs to be used due to a circular dependency, but we
should make sure it doesn't go out of sync with XE_MAX_GT_PER_TILE. Add a
compile check for that.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/xe_pmu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index c5641af6e9a91..bd2d709453257 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -257,6 +257,8 @@ int xe_pmu_register(struct xe_pmu *pmu)
int ret = -ENOMEM;
char *name;
+ BUILD_BUG_ON(XE_MAX_GT_PER_TILE != XE_PMU_MAX_GT);
+
if (IS_SRIOV_VF(xe))
return 0;
--
2.48.0
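For readers following along outside the kernel tree, the compile-time check added by this patch can be sketched in plain C11. This is only an illustration under assumptions: the DEMO_* names and the helper below are hypothetical stand-ins, and static_assert from <assert.h> takes the place of the kernel's BUILD_BUG_ON().

```c
#include <assert.h>  /* static_assert (C11) */

/* Hypothetical stand-ins for the two constants that must stay in sync. */
#define DEMO_MAX_GT_PER_TILE 2  /* plays the role of XE_MAX_GT_PER_TILE */
#define DEMO_PMU_MAX_GT      2  /* plays the role of XE_PMU_MAX_GT */

/* Compilation fails here if the two values ever diverge. */
static_assert(DEMO_MAX_GT_PER_TILE == DEMO_PMU_MAX_GT,
	      "PMU GT limit out of sync with per-tile GT limit");

/* Returns the number of per-GT slots a PMU-side array would need. */
static inline unsigned int demo_pmu_gt_slots(void)
{
	return DEMO_PMU_MAX_GT;
}
```

Changing only one of the two defines makes the build fail at the assertion, which is the property the patch relies on.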
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v14 4/7] drm/xe/pmu: Extract xe_pmu_event_update()
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
` (2 preceding siblings ...)
2025-01-22 6:23 ` [PATCH v14 3/7] drm/xe/pmu: Assert max gt Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 5/7] drm/xe/pmu: Add attribute skeleton Lucas De Marchi
` (2 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC (permalink / raw)
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
Like other pmu drivers, keep the update separate from the read so it can
be called from other methods (like stop()) without side effects.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/xe_pmu.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index bd2d709453257..5a93634f17a2b 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -125,18 +125,11 @@ static u64 __xe_pmu_event_read(struct perf_event *event)
return val;
}
-static void xe_pmu_event_read(struct perf_event *event)
+static void xe_pmu_event_update(struct perf_event *event)
{
- struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
struct hw_perf_event *hwc = &event->hw;
- struct xe_pmu *pmu = &xe->pmu;
u64 prev, new;
- if (!pmu->registered) {
- event->hw.state = PERF_HES_STOPPED;
- return;
- }
-
prev = local64_read(&hwc->prev_count);
do {
new = __xe_pmu_event_read(event);
@@ -145,6 +138,19 @@ static void xe_pmu_event_read(struct perf_event *event)
local64_add(new - prev, &event->count);
}
+static void xe_pmu_event_read(struct perf_event *event)
+{
+ struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
+ struct xe_pmu *pmu = &xe->pmu;
+
+ if (!pmu->registered) {
+ event->hw.state = PERF_HES_STOPPED;
+ return;
+ }
+
+ xe_pmu_event_update(event);
+}
+
static void xe_pmu_enable(struct perf_event *event)
{
/*
@@ -174,7 +180,7 @@ static void xe_pmu_event_stop(struct perf_event *event, int flags)
if (pmu->registered)
if (flags & PERF_EF_UPDATE)
- xe_pmu_event_read(event);
+ xe_pmu_event_update(event);
event->hw.state = PERF_HES_STOPPED;
}
--
2.48.0
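The update path extracted by this patch follows a common pattern: snapshot the raw counter, swap it into prev_count with a compare-and-exchange, then add the delta to the accumulated count. A minimal userspace sketch of that pattern, assuming C11 atomics in place of the kernel's local64_* helpers, could look like the following. The struct, the function names and the fake counter are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields used from struct perf_event. */
struct demo_event {
	_Atomic uint64_t prev_count; /* last raw counter value observed */
	_Atomic uint64_t count;      /* accumulated delta reported to the user */
	uint64_t (*read_raw)(void);  /* plays the role of __xe_pmu_event_read() */
};

/* Fake hardware counter so the sketch can be exercised in userspace. */
static uint64_t demo_raw_counter;

static uint64_t demo_read_raw(void)
{
	return demo_raw_counter;
}

/* Fold the change since the previous snapshot into ev->count. */
static void demo_event_update(struct demo_event *ev)
{
	uint64_t prev = atomic_load(&ev->prev_count);
	uint64_t new;

	assert(ev->read_raw);
	do {
		new = ev->read_raw();
		/* On failure, prev is refreshed with the stored snapshot. */
	} while (!atomic_compare_exchange_weak(&ev->prev_count, &prev, new));

	atomic_fetch_add(&ev->count, new - prev);
}
```

Because the update has no registration check and does not touch the event state, it can be called from a stop()-like path as well as from a read()-like path, which is the separation the commit message describes.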
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v14 5/7] drm/xe/pmu: Add attribute skeleton
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
` (3 preceding siblings ...)
2025-01-22 6:23 ` [PATCH v14 4/7] drm/xe/pmu: Extract xe_pmu_event_update() Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 6/7] drm/xe/pmu: Get/put runtime pm on event init Lucas De Marchi
2025-01-22 6:23 ` [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events Lucas De Marchi
6 siblings, 0 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC (permalink / raw)
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
Add the generic support for defining new attributes. This uses
gt-c6-residency as first attribute to bootstrap it, but its
implementation will be added by a follow up commit: until proper support
is added, it will always be invisible in sysfs since the corresponding
bit is not set in the supported_events bitmap.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/xe_pmu.c | 46 +++++++++++++++++++++++++++++--
drivers/gpu/drm/xe/xe_pmu_types.h | 4 +++
2 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index 5a93634f17a2b..68ebec1746a53 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -54,6 +54,8 @@ static unsigned int config_to_gt_id(u64 config)
return FIELD_GET(XE_PMU_EVENT_GT_MASK, config);
}
+#define XE_PMU_EVENT_GT_C6_RESIDENCY 0x01
+
static struct xe_gt *event_to_gt(struct perf_event *event)
{
struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
@@ -68,7 +70,8 @@ static bool event_supported(struct xe_pmu *pmu, unsigned int gt,
if (gt >= XE_MAX_GT_PER_TILE)
return false;
- return false;
+ return id < sizeof(pmu->supported_events) * BITS_PER_BYTE &&
+ pmu->supported_events & BIT_ULL(id);
}
static void xe_pmu_event_destroy(struct perf_event *event)
@@ -218,16 +221,53 @@ static const struct attribute_group pmu_format_attr_group = {
.attrs = pmu_format_attrs,
};
+static ssize_t event_attr_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct perf_pmu_events_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_events_attr, attr);
+
+ return sprintf(buf, "event=%#04llx\n", pmu_attr->id);
+}
+
+static umode_t event_attr_is_visible(struct kobject *kobj,
+ struct attribute *attr, int idx)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct perf_pmu_events_attr *pmu_attr =
+ container_of(attr, typeof(*pmu_attr), attr.attr);
+ struct xe_pmu *pmu =
+ container_of(dev_get_drvdata(dev), typeof(*pmu), base);
+
+ if (event_supported(pmu, 0, pmu_attr->id))
+ return attr->mode;
+
+ return 0;
+}
+
+#define XE_EVENT_ATTR(name_, v_, id_, unit_) \
+ PMU_EVENT_ATTR(name_, pmu_event_ ## v_, id_, event_attr_show) \
+ PMU_EVENT_ATTR_ID_STRING(name_.unit, pmu_event_unit_ ## v_, id_, unit_)
+
+XE_EVENT_ATTR(gt-c6-residency, gt_c6_residency, XE_PMU_EVENT_GT_C6_RESIDENCY, "ms")
+
static struct attribute *pmu_event_attrs[] = {
- /* No events yet */
+ &pmu_event_gt_c6_residency.attr.attr,
+ &pmu_event_unit_gt_c6_residency.attr.attr,
+
NULL,
};
static const struct attribute_group pmu_events_attr_group = {
.name = "events",
.attrs = pmu_event_attrs,
+ .is_visible = event_attr_is_visible,
};
+static void set_supported_events(struct xe_pmu *pmu)
+{
+}
+
/**
* xe_pmu_unregister() - Remove/cleanup PMU registration
* @arg: Ptr to pmu
@@ -290,6 +330,8 @@ int xe_pmu_register(struct xe_pmu *pmu)
pmu->base.stop = xe_pmu_event_stop;
pmu->base.read = xe_pmu_event_read;
+ set_supported_events(pmu);
+
ret = perf_pmu_register(&pmu->base, pmu->name, -1);
if (ret)
goto err_name;
diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
index e0cf7169f4fda..64a1ca881c233 100644
--- a/drivers/gpu/drm/xe/xe_pmu_types.h
+++ b/drivers/gpu/drm/xe/xe_pmu_types.h
@@ -38,6 +38,10 @@ struct xe_pmu {
* @lock: Lock protecting enable mask and ref count handling.
*/
raw_spinlock_t lock;
+ /**
+ * @supported_events: Bitmap of supported events, indexed by event id
+ */
+ u64 supported_events;
};
#endif
--
2.48.0
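The visibility logic in this patch boils down to a range check on the GT followed by a bit test in a 64-bit bitmap indexed by event id. A hedged userspace sketch of that check is shown below. The DEMO_* names are hypothetical, and the bitmap is passed by value instead of being read from struct xe_pmu.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_GT                  2     /* stand-in for XE_MAX_GT_PER_TILE */
#define DEMO_EVENT_GT_C6_RESIDENCY   0x01  /* same id the patch uses */

/* Every event id must have a bit available in the 64-bit bitmap. */
static_assert(DEMO_EVENT_GT_C6_RESIDENCY < 64,
	      "event id must fit in the 64-bit bitmap");

/* An event is usable only if gt is in range and its id bit is set. */
static bool demo_event_supported(uint64_t supported_events,
				 unsigned int gt, unsigned int id)
{
	if (gt >= DEMO_MAX_GT)
		return false;

	/* Range-check id first: shifting a 64-bit value by >= 64 is undefined. */
	return id < sizeof(supported_events) * CHAR_BIT &&
	       (supported_events & (UINT64_C(1) << id));
}
```

With an empty bitmap every event is reported as unsupported, which matches the commit message: the attribute stays invisible in sysfs until a later patch sets the corresponding bit.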
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v14 6/7] drm/xe/pmu: Get/put runtime pm on event init
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
` (4 preceding siblings ...)
2025-01-22 6:23 ` [PATCH v14 5/7] drm/xe/pmu: Add attribute skeleton Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 10:26 ` Rodrigo Vivi
2025-01-22 6:23 ` [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events Lucas De Marchi
6 siblings, 1 reply; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC (permalink / raw)
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
When the event is created, make sure runtime pm is taken and later put:
in order to read an event counter the GPU needs to remain accessible, and
doing a get/put during perf's read is not possible since it's holding a
raw_spinlock.
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/xe_pmu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index 68ebec1746a53..8d938d67c1f2c 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -79,6 +79,7 @@ static void xe_pmu_event_destroy(struct perf_event *event)
struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
drm_WARN_ON(&xe->drm, event->parent);
+ xe_pm_runtime_put(xe);
drm_dev_put(&xe->drm);
}
@@ -111,6 +112,7 @@ static int xe_pmu_event_init(struct perf_event *event)
if (!event->parent) {
drm_dev_get(&xe->drm);
+ xe_pm_runtime_get(xe);
event->destroy = xe_pmu_event_destroy;
}
--
2.48.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v14 6/7] drm/xe/pmu: Get/put runtime pm on event init
2025-01-22 6:23 ` [PATCH v14 6/7] drm/xe/pmu: Get/put runtime pm on event init Lucas De Marchi
@ 2025-01-22 10:26 ` Rodrigo Vivi
0 siblings, 0 replies; 15+ messages in thread
From: Rodrigo Vivi @ 2025-01-22 10:26 UTC (permalink / raw)
To: Lucas De Marchi
Cc: intel-xe, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users
On Tue, Jan 21, 2025 at 10:23:40PM -0800, Lucas De Marchi wrote:
> When the event is created, make sure runtime pm is taken and later put:
> in order to read an event counter the GPU needs to remain accessible and
> doing a get/put during perf's read is not possible it's holding a
> raw_spinlock.
>
> Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/xe/xe_pmu.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> index 68ebec1746a53..8d938d67c1f2c 100644
> --- a/drivers/gpu/drm/xe/xe_pmu.c
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -79,6 +79,7 @@ static void xe_pmu_event_destroy(struct perf_event *event)
> struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>
> drm_WARN_ON(&xe->drm, event->parent);
> + xe_pm_runtime_put(xe);
> drm_dev_put(&xe->drm);
> }
>
> @@ -111,6 +112,7 @@ static int xe_pmu_event_init(struct perf_event *event)
>
> if (!event->parent) {
> drm_dev_get(&xe->drm);
> + xe_pm_runtime_get(xe);
> event->destroy = xe_pmu_event_destroy;
> }
>
> --
> 2.48.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events
2025-01-22 6:23 [PATCH v14 0/7] drm/xe/pmu: PMU interface for Xe Lucas De Marchi
` (5 preceding siblings ...)
2025-01-22 6:23 ` [PATCH v14 6/7] drm/xe/pmu: Get/put runtime pm on event init Lucas De Marchi
@ 2025-01-22 6:23 ` Lucas De Marchi
2025-01-22 10:23 ` Riana Tauro
2025-01-22 10:32 ` Rodrigo Vivi
6 siblings, 2 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-22 6:23 UTC (permalink / raw)
To: intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users, Lucas De Marchi
From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Provide a PMU interface for GT C6 residency counters. The implementation
is ported over from the i915 PMU code. Residency is provided in units of
ms (like the sysfs entry in /sys/class/drm/card0/device/tile0/gt0/gtidle).
Sample usage and output:
$ perf list | grep gt-c6
xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
$ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
event=0x01
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
ms
$ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
# time counts unit events
1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/xe_pmu.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index 8d938d67c1f2c..a2e4addd3dd7e 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -11,6 +11,7 @@
#include "xe_device.h"
#include "xe_force_wake.h"
#include "xe_gt_clock.h"
+#include "xe_gt_idle.h"
#include "xe_gt_printk.h"
#include "xe_mmio.h"
#include "xe_macros.h"
@@ -122,12 +123,16 @@ static int xe_pmu_event_init(struct perf_event *event)
static u64 __xe_pmu_event_read(struct perf_event *event)
{
struct xe_gt *gt = event_to_gt(event);
- u64 val = 0;
if (!gt)
return 0;
- return val;
+ switch (config_to_event_id(event->attr.config)) {
+ case XE_PMU_EVENT_GT_C6_RESIDENCY:
+ return xe_gt_idle_residency_msec(&gt->gtidle);
+ }
+
+ return 0;
}
static void xe_pmu_event_update(struct perf_event *event)
@@ -268,6 +273,10 @@ static const struct attribute_group pmu_events_attr_group = {
static void set_supported_events(struct xe_pmu *pmu)
{
+ struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
+
+ if (!xe->info.skip_guc_pc)
+ pmu->supported_events |= BIT_ULL(XE_PMU_EVENT_GT_C6_RESIDENCY);
}
/**
--
2.48.0
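The `gt-c6-residency,gt=0` string in the perf invocation above ends up as a single 64-bit config value, with the event id in bits 0-11 and the GT in bits 60-63, as declared by the format attributes in the earlier patch. A rough userspace sketch of that packing and unpacking, using hypothetical DEMO_* names and plain shifts instead of the kernel's GENMASK_ULL()/FIELD_GET(), might be:

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from the format attributes: event=config:0-11, gt=config:60-63. */
#define DEMO_EVENT_ID_MASK  UINT64_C(0x0000000000000fff)
#define DEMO_GT_SHIFT       60
#define DEMO_GT_MASK        (UINT64_C(0xf) << DEMO_GT_SHIFT)

/* Pack an event id and a GT index the way the perf tool would. */
static uint64_t demo_make_config(unsigned int event_id, unsigned int gt)
{
	assert(event_id <= 0xfff);
	assert(gt <= 0xf);
	return ((uint64_t)gt << DEMO_GT_SHIFT) | event_id;
}

/* Counterpart of config_to_event_id() in the driver. */
static unsigned int demo_config_to_event_id(uint64_t config)
{
	return (unsigned int)(config & DEMO_EVENT_ID_MASK);
}

/* Counterpart of config_to_gt_id() in the driver. */
static unsigned int demo_config_to_gt_id(uint64_t config)
{
	return (unsigned int)((config & DEMO_GT_MASK) >> DEMO_GT_SHIFT);
}
```

For event=0x01 and gt=0 the config is simply 0x1. Selecting gt=1 only sets bit 60, so the event id decoded from the low bits is unchanged.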
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events
2025-01-22 6:23 ` [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events Lucas De Marchi
@ 2025-01-22 10:23 ` Riana Tauro
2025-01-22 10:32 ` Rodrigo Vivi
1 sibling, 0 replies; 15+ messages in thread
From: Riana Tauro @ 2025-01-22 10:23 UTC (permalink / raw)
To: Lucas De Marchi, intel-xe
Cc: Rodrigo Vivi, Vinay Belgaumkar, Peter Zijlstra, linux-perf-users
On 1/22/2025 11:53 AM, Lucas De Marchi wrote:
> From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>
> Provide a PMU interface for GT C6 residency counters. The implementation
> is ported over from the i915 PMU code. Residency is provided in units of
> ms(like sysfs entry in - /sys/class/drm/card0/device/tile0/gt0/gtidle).
>
> Sample usage and output:
>
> $ perf list | grep gt-c6
> xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
>
> $ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
> event=0x01
>
> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
> ms
>
> $ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
> # time counts unit events
> 1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
> 2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Looks good to me
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
> ---
> drivers/gpu/drm/xe/xe_pmu.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> index 8d938d67c1f2c..a2e4addd3dd7e 100644
> --- a/drivers/gpu/drm/xe/xe_pmu.c
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -11,6 +11,7 @@
> #include "xe_device.h"
> #include "xe_force_wake.h"
> #include "xe_gt_clock.h"
> +#include "xe_gt_idle.h"
> #include "xe_gt_printk.h"
> #include "xe_mmio.h"
> #include "xe_macros.h"
> @@ -122,12 +123,16 @@ static int xe_pmu_event_init(struct perf_event *event)
> static u64 __xe_pmu_event_read(struct perf_event *event)
> {
> struct xe_gt *gt = event_to_gt(event);
> - u64 val = 0;
>
> if (!gt)
> return 0;
>
> - return val;
> + switch (config_to_event_id(event->attr.config)) {
> + case XE_PMU_EVENT_GT_C6_RESIDENCY:
> + return xe_gt_idle_residency_msec(&gt->gtidle);
> + }
> +
> + return 0;
> }
>
> static void xe_pmu_event_update(struct perf_event *event)
> @@ -268,6 +273,10 @@ static const struct attribute_group pmu_events_attr_group = {
>
> static void set_supported_events(struct xe_pmu *pmu)
> {
> + struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +
> + if (!xe->info.skip_guc_pc)
> + pmu->supported_events |= BIT_ULL(XE_PMU_EVENT_GT_C6_RESIDENCY);
> }
>
> /**
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events
2025-01-22 6:23 ` [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events Lucas De Marchi
2025-01-22 10:23 ` Riana Tauro
@ 2025-01-22 10:32 ` Rodrigo Vivi
2025-01-23 0:11 ` Lucas De Marchi
1 sibling, 1 reply; 15+ messages in thread
From: Rodrigo Vivi @ 2025-01-22 10:32 UTC (permalink / raw)
To: Lucas De Marchi
Cc: intel-xe, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users
On Tue, Jan 21, 2025 at 10:23:41PM -0800, Lucas De Marchi wrote:
> From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>
> Provide a PMU interface for GT C6 residency counters. The implementation
> is ported over from the i915 PMU code.
This is not valid anymore right?! Perhaps rephrase to show that the API
design itself was taken from there?
> Residency is provided in units of
> ms(like sysfs entry in - /sys/class/drm/card0/device/tile0/gt0/gtidle).
>
> Sample usage and output:
>
> $ perf list | grep gt-c6
> xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
>
> $ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
> event=0x01
>
> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
> ms
>
> $ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
> # time counts unit events
> 1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
> 2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> drivers/gpu/drm/xe/xe_pmu.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
> index 8d938d67c1f2c..a2e4addd3dd7e 100644
> --- a/drivers/gpu/drm/xe/xe_pmu.c
> +++ b/drivers/gpu/drm/xe/xe_pmu.c
> @@ -11,6 +11,7 @@
> #include "xe_device.h"
> #include "xe_force_wake.h"
> #include "xe_gt_clock.h"
> +#include "xe_gt_idle.h"
> #include "xe_gt_printk.h"
> #include "xe_mmio.h"
> #include "xe_macros.h"
> @@ -122,12 +123,16 @@ static int xe_pmu_event_init(struct perf_event *event)
> static u64 __xe_pmu_event_read(struct perf_event *event)
> {
> struct xe_gt *gt = event_to_gt(event);
> - u64 val = 0;
>
> if (!gt)
> return 0;
>
> - return val;
> + switch (config_to_event_id(event->attr.config)) {
> + case XE_PMU_EVENT_GT_C6_RESIDENCY:
> + return xe_gt_idle_residency_msec(&gt->gtidle);
> + }
> +
> + return 0;
> }
>
> static void xe_pmu_event_update(struct perf_event *event)
> @@ -268,6 +273,10 @@ static const struct attribute_group pmu_events_attr_group = {
>
> static void set_supported_events(struct xe_pmu *pmu)
> {
> + struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
> +
> + if (!xe->info.skip_guc_pc)
> + pmu->supported_events |= BIT_ULL(XE_PMU_EVENT_GT_C6_RESIDENCY);
> }
A feeling that it would be better to squash this to the other attribute patch,
but I understand the reasons...
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>
> /**
> --
> 2.48.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v14 7/7] drm/xe/pmu: Add GT C6 events
2025-01-22 10:32 ` Rodrigo Vivi
@ 2025-01-23 0:11 ` Lucas De Marchi
0 siblings, 0 replies; 15+ messages in thread
From: Lucas De Marchi @ 2025-01-23 0:11 UTC (permalink / raw)
To: Rodrigo Vivi
Cc: intel-xe, Vinay Belgaumkar, Riana Tauro, Peter Zijlstra,
linux-perf-users
On Wed, Jan 22, 2025 at 05:32:59AM -0500, Rodrigo Vivi wrote:
>On Tue, Jan 21, 2025 at 10:23:41PM -0800, Lucas De Marchi wrote:
>> From: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>>
>> Provide a PMU interface for GT C6 residency counters. The implementation
>> is ported over from the i915 PMU code.
>
>This is not valid anymore right?! Perhaps rephrase to show that the API
>design itself was taken from there?
yep, re-phrased:
Provide a PMU interface for GT C6 residency counters. The interface is
similar to the one available for i915, but gt is passed in the config
when creating the event.
>
>> Residency is provided in units of
>> ms(like sysfs entry in - /sys/class/drm/card0/device/tile0/gt0/gtidle).
>>
>> Sample usage and output:
>>
>> $ perf list | grep gt-c6
>> xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
>>
>> $ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
>> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
>> event=0x01
>>
>> ==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
>> ms
>>
>> $ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
>> # time counts unit events
>> 1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
>> 2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
>>
>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_pmu.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>> index 8d938d67c1f2c..a2e4addd3dd7e 100644
>> --- a/drivers/gpu/drm/xe/xe_pmu.c
>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>> @@ -11,6 +11,7 @@
>> #include "xe_device.h"
>> #include "xe_force_wake.h"
>> #include "xe_gt_clock.h"
>> +#include "xe_gt_idle.h"
>> #include "xe_gt_printk.h"
>> #include "xe_mmio.h"
>> #include "xe_macros.h"
>> @@ -122,12 +123,16 @@ static int xe_pmu_event_init(struct perf_event *event)
>> static u64 __xe_pmu_event_read(struct perf_event *event)
>> {
>> struct xe_gt *gt = event_to_gt(event);
>> - u64 val = 0;
>>
>> if (!gt)
>> return 0;
>>
>> - return val;
>> + switch (config_to_event_id(event->attr.config)) {
>> + case XE_PMU_EVENT_GT_C6_RESIDENCY:
>> + return xe_gt_idle_residency_msec(&gt->gtidle);
>> + }
>> +
>> + return 0;
>> }
>>
>> static void xe_pmu_event_update(struct perf_event *event)
>> @@ -268,6 +273,10 @@ static const struct attribute_group pmu_events_attr_group = {
>>
>> static void set_supported_events(struct xe_pmu *pmu)
>> {
>> + struct xe_device *xe = container_of(pmu, typeof(*xe), pmu);
>> +
>> + if (!xe->info.skip_guc_pc)
>> + pmu->supported_events |= BIT_ULL(XE_PMU_EVENT_GT_C6_RESIDENCY);
>> }
>
>A feeling that it would be better to squash this to the other attribute patch,
>but I understand the reasons...
well... yeah. That part alone is now bigger than this patch, with all the
simplifications, and the previous code with dynamic attributes is simply
gone. I'm not comfortable adding others' sign-off on completely new
code. It's based more on other drivers already in the tree than on the
previous implementation for xe (or i915).
>
>Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
thanks
Lucas De Marchi
>
>>
>> /**
>> --
>> 2.48.0
>>
^ permalink raw reply [flat|nested] 15+ messages in thread