* [PATCH 0/2] Add a few tracepoints to panthor
@ 2025-12-03 13:56 Nicolas Frattaroli
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2025-12-03 13:56 ` [PATCH 2/2] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
0 siblings, 2 replies; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-03 13:56 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli
This series adds two tracepoints to panthor.
The first tracepoint allows for inspecting utilisation of the hardware
subdivisions, e.g. how many shader cores are active. This is done by
reading three hardware registers when a certain IRQ fires.
The second tracepoint instruments panthor's job IRQ handler. This is
more useful than the generic interrupt tracing functionality, as the
tracepoint has the events bit mask included, which indicates which
command stream group interfaces triggered the interrupt.
To test the tracepoints, the following can be used:
:~# echo 1 > /sys/kernel/tracing/events/panthor/gpu_power_active/enable
:~# echo 1 > /sys/kernel/tracing/events/panthor/gpu_job_irq/enable
:~# echo 1 > /sys/kernel/tracing/tracing_on
:~# cat /sys/kernel/tracing/trace_pipe
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
Nicolas Frattaroli (2):
drm/panthor: Add tracepoint for hardware utilisation changes
drm/panthor: Add gpu_job_irq tracepoint
drivers/gpu/drm/panthor/panthor_device.c | 1 +
drivers/gpu/drm/panthor/panthor_fw.c | 13 +++++++
drivers/gpu/drm/panthor/panthor_gpu.c | 9 +++++
drivers/gpu/drm/panthor/panthor_trace.h | 62 ++++++++++++++++++++++++++++++++
4 files changed, 85 insertions(+)
---
base-commit: 6ef847703ac6da2deaaf735ce95369ba25c2c432
change-id: 20251203-panthor-tracepoints-488af09d46e7
Best regards,
--
Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-03 13:56 [PATCH 0/2] Add a few tracepoints to panthor Nicolas Frattaroli
@ 2025-12-03 13:56 ` Nicolas Frattaroli
2025-12-04 20:21 ` Chia-I Wu
` (2 more replies)
2025-12-03 13:56 ` [PATCH 2/2] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
1 sibling, 3 replies; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-03 13:56 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli
Mali GPUs have three registers that indicate which parts of the hardware
are powered and active at any moment. These take the form of bitmaps. In
the case of SHADER_PWRACTIVE for example, a high bit indicates that the
shader core corresponding to that bit index is active. These bitmaps
aren't solely contiguous bits, as it's common to have holes in the
sequence of shader core indices, and the actual set of which cores are
present is defined by the "shader present" register.
When the GPU finishes a power state transition, it fires a
GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
received, the PWRACTIVE registers will likely contain interesting new
information.
This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
which is something related to Mali v14+'s power control logic. The
PWRACTIVE registers and corresponding interrupts are already available
in v9 and onwards.
Expose this as a tracepoint to userspace. This allows users to debug
various scenarios and gather interesting information, such as: knowing
how much hardware is lit up at any given time, correlating graphics
corruption with a specific active shader core, measuring when hardware
is allowed to go to an inactive state again, and so on.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.c | 1 +
drivers/gpu/drm/panthor/panthor_gpu.c | 9 ++++++++
drivers/gpu/drm/panthor/panthor_trace.h | 38 ++++++++++++++++++++++++++++++++
3 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index e133b1e0ad6d..a3cb934104b8 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
mutex_unlock(&ptdev->pm.mmio_lock);
+
return 0;
err_suspend_devfreq:
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 9cb5dee93212..8830aa9a5c4b 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -22,6 +22,9 @@
#include "panthor_hw.h"
#include "panthor_regs.h"
+#define CREATE_TRACE_POINTS
+#include "panthor_trace.h"
+
/**
* struct panthor_gpu - GPU block management data.
*/
@@ -46,6 +49,7 @@ struct panthor_gpu {
(GPU_IRQ_FAULT | \
GPU_IRQ_PROTM_FAULT | \
GPU_IRQ_RESET_COMPLETED | \
+ GPU_IRQ_POWER_CHANGED_ALL | \
GPU_IRQ_CLEAN_CACHES_COMPLETED)
static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
@@ -97,6 +101,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
wake_up_all(&ptdev->gpu->reqs_acked);
}
spin_unlock(&ptdev->gpu->reqs_lock);
+
+ if (status & GPU_IRQ_POWER_CHANGED_ALL)
+ trace_gpu_power_active(gpu_read64(ptdev, SHADER_PWRACTIVE),
+ gpu_read64(ptdev, TILER_PWRACTIVE),
+ gpu_read64(ptdev, L2_PWRACTIVE));
}
PANTHOR_IRQ_HANDLER(gpu, GPU, panthor_gpu_irq_handler);
diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
new file mode 100644
index 000000000000..01013f81e68a
--- /dev/null
+++ b/drivers/gpu/drm/panthor/panthor_trace.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 or MIT */
+/* Copyright 2025 Collabora ltd. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM panthor
+
+#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __PANTHOR_TRACE_H__
+
+#include <linux/tracepoint.h>
+#include <linux/types.h>
+
+TRACE_EVENT(gpu_power_active,
+ TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
+ TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
+ TP_STRUCT__entry(
+ __field(u64, shader_bitmap)
+ __field(u64, tiler_bitmap)
+ __field(u64, l2_bitmap)
+ ),
+ TP_fast_assign(
+ __entry->shader_bitmap = shader_bitmap;
+ __entry->tiler_bitmap = tiler_bitmap;
+ __entry->l2_bitmap = l2_bitmap;
+ ),
+ TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
+ __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
+ )
+);
+
+#endif /* __PANTHOR_TRACE_H__ */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE panthor_trace
+
+#include <trace/define_trace.h>
--
2.52.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] drm/panthor: Add gpu_job_irq tracepoint
2025-12-03 13:56 [PATCH 0/2] Add a few tracepoints to panthor Nicolas Frattaroli
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
@ 2025-12-03 13:56 ` Nicolas Frattaroli
1 sibling, 0 replies; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-03 13:56 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli
Mali's CSF firmware triggers the job IRQ whenever there's new firmware
events for processing. While this can be a global event (BIT(31) of the
status register), it's usually an event relating to a command stream
group (the other bit indices).
Panthor throws these events onto a workqueue for processing outside the
IRQ handler. It's therefore useful to have an instrumented tracepoint
that goes beyond the generic IRQ tracepoint for this specific case, as
it can be augmented with additional data, namely the events bit mask.
This can then be used to debug problems relating to GPU jobs events not
being processed quickly enough. The duration_ns field can be used to
work backwards from when the tracepoint fires (at the end of the IRQ
handler) to figure out when the interrupt itself landed, providing not
just information on how long the work queueing took, but also when the
actual interrupt itself arrived.
With this information in hand, the IRQ handler itself being slow can be
excluded as a possible source of problems, and attention can be directed
to the workqueue processing instead.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/panthor/panthor_fw.c | 13 +++++++++++++
drivers/gpu/drm/panthor/panthor_trace.h | 24 ++++++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 94a3cd6dfa6d..df07f6435cda 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -26,6 +26,7 @@
#include "panthor_mmu.h"
#include "panthor_regs.h"
#include "panthor_sched.h"
+#include "panthor_trace.h"
#define CSF_FW_NAME "mali_csffw.bin"
@@ -1059,6 +1060,12 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
{
+ u32 duration;
+ u64 start;
+
+ if (tracepoint_enabled(gpu_job_irq))
+ start = ktime_get_ns();
+
gpu_write(ptdev, JOB_INT_CLEAR, status);
if (!ptdev->fw->booted && (status & JOB_INT_GLOBAL_IF))
@@ -1071,6 +1078,12 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
return;
panthor_sched_report_fw_events(ptdev, status);
+
+ if (tracepoint_enabled(gpu_job_irq)) {
+ if (check_sub_overflow(ktime_get_ns(), start, &duration))
+ duration = U32_MAX;
+ trace_gpu_job_irq(status, duration);
+ }
}
PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
index 01013f81e68a..fcddfdb6ffef 100644
--- a/drivers/gpu/drm/panthor/panthor_trace.h
+++ b/drivers/gpu/drm/panthor/panthor_trace.h
@@ -28,6 +28,30 @@ TRACE_EVENT(gpu_power_active,
)
);
+/**
+ * gpu_job_irq - called after a job interrupt from firmware completes
+ * @events: bitmask of BIT(CSG id) | BIT(31) for a global event
+ * @duration_ns: Nanoseconds between job IRQ handler entry and exit
+ *
+ * The panthor_job_irq_handler() function instrumented by this tracepoint exits
+ * once it has queued the firmware interrupts for processing, not when the
+ * firmware interrupts are fully processed. This tracepoint allows for debugging
+ * issues with delays in the workqueue's processing of events.
+ */
+TRACE_EVENT(gpu_job_irq,
+ TP_PROTO(u32 events, u32 duration_ns),
+ TP_ARGS(events, duration_ns),
+ TP_STRUCT__entry(
+ __field(u32, events)
+ __field(u32, duration_ns)
+ ),
+ TP_fast_assign(
+ __entry->events = events;
+ __entry->duration_ns = duration_ns;
+ ),
+ TP_printk("events=0x%x duration_ns=%d", __entry->events, __entry->duration_ns)
+);
+
#endif /* __PANTHOR_TRACE_H__ */
#undef TRACE_INCLUDE_PATH
--
2.52.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
@ 2025-12-04 20:21 ` Chia-I Wu
2025-12-05 10:47 ` Nicolas Frattaroli
2025-12-08 17:14 ` Karunika Choo
2025-12-08 17:21 ` Karunika Choo
2 siblings, 1 reply; 15+ messages in thread
From: Chia-I Wu @ 2025-12-04 20:21 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
<nicolas.frattaroli@collabora.com> wrote:
>
> Mali GPUs have three registers that indicate which parts of the hardware
> are powered and active at any moment. These take the form of bitmaps. In
> the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> shader core corresponding to that bit index is active. These bitmaps
> aren't solely contiguous bits, as it's common to have holes in the
> sequence of shader core indices, and the actual set of which cores are
> present is defined by the "shader present" register.
>
> When the GPU finishes a power state transition, it fires a
> GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> received, the PWRACTIVE registers will likely contain interesting new
> information.
I am seeing
irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
on a gpu-bound test. It does not look like collecting samples on
GPU_IRQ_POWER_CHANGED_ALL gives too much info.
I think they are more useful to be collected periodically, such that
we know that in the past X seconds, Y out of a total of Z samples
indicates activities. That's best done in userspace, and panthor's
role should be to provide an uapi such as
https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
>
> This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> which is something related to Mali v14+'s power control logic. The
> PWRACTIVE registers and corresponding interrupts are already available
> in v9 and onwards.
>
> Expose this as a tracepoint to userspace. This allows users to debug
> various scenarios and gather interesting information, such as: knowing
> how much hardware is lit up at any given time, correlating graphics
> corruption with a specific active shader core, measuring when hardware
> is allowed to go to an inactive state again, and so on.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 1 +
> drivers/gpu/drm/panthor/panthor_gpu.c | 9 ++++++++
> drivers/gpu/drm/panthor/panthor_trace.h | 38 ++++++++++++++++++++++++++++++++
> 3 files changed, 48 insertions(+)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index e133b1e0ad6d..a3cb934104b8 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
> DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
> atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
> mutex_unlock(&ptdev->pm.mmio_lock);
> +
> return 0;
>
> err_suspend_devfreq:
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 9cb5dee93212..8830aa9a5c4b 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -22,6 +22,9 @@
> #include "panthor_hw.h"
> #include "panthor_regs.h"
>
> +#define CREATE_TRACE_POINTS
> +#include "panthor_trace.h"
> +
> /**
> * struct panthor_gpu - GPU block management data.
> */
> @@ -46,6 +49,7 @@ struct panthor_gpu {
> (GPU_IRQ_FAULT | \
> GPU_IRQ_PROTM_FAULT | \
> GPU_IRQ_RESET_COMPLETED | \
> + GPU_IRQ_POWER_CHANGED_ALL | \
> GPU_IRQ_CLEAN_CACHES_COMPLETED)
>
> static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> @@ -97,6 +101,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
> wake_up_all(&ptdev->gpu->reqs_acked);
> }
> spin_unlock(&ptdev->gpu->reqs_lock);
> +
> + if (status & GPU_IRQ_POWER_CHANGED_ALL)
> + trace_gpu_power_active(gpu_read64(ptdev, SHADER_PWRACTIVE),
> + gpu_read64(ptdev, TILER_PWRACTIVE),
> + gpu_read64(ptdev, L2_PWRACTIVE));
> }
> PANTHOR_IRQ_HANDLER(gpu, GPU, panthor_gpu_irq_handler);
>
> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> new file mode 100644
> index 000000000000..01013f81e68a
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> +/* Copyright 2025 Collabora ltd. */
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM panthor
> +
> +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> +#define __PANTHOR_TRACE_H__
> +
> +#include <linux/tracepoint.h>
> +#include <linux/types.h>
> +
> +TRACE_EVENT(gpu_power_active,
> + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
> + TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
> + TP_STRUCT__entry(
> + __field(u64, shader_bitmap)
> + __field(u64, tiler_bitmap)
> + __field(u64, l2_bitmap)
> + ),
> + TP_fast_assign(
> + __entry->shader_bitmap = shader_bitmap;
> + __entry->tiler_bitmap = tiler_bitmap;
> + __entry->l2_bitmap = l2_bitmap;
> + ),
> + TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
> + __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
> + )
> +);
> +
> +#endif /* __PANTHOR_TRACE_H__ */
> +
> +#undef TRACE_INCLUDE_PATH
> +#define TRACE_INCLUDE_PATH .
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE panthor_trace
> +
> +#include <trace/define_trace.h>
>
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-04 20:21 ` Chia-I Wu
@ 2025-12-05 10:47 ` Nicolas Frattaroli
2025-12-05 21:16 ` Chia-I Wu
0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-05 10:47 UTC (permalink / raw)
To: Chia-I Wu
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
> On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
> <nicolas.frattaroli@collabora.com> wrote:
> >
> > Mali GPUs have three registers that indicate which parts of the hardware
> > are powered and active at any moment. These take the form of bitmaps. In
> > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > shader core corresponding to that bit index is active. These bitmaps
> > aren't solely contiguous bits, as it's common to have holes in the
> > sequence of shader core indices, and the actual set of which cores are
> > present is defined by the "shader present" register.
> >
> > When the GPU finishes a power state transition, it fires a
> > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > received, the PWRACTIVE registers will likely contain interesting new
> > information.
> I am seeing
>
> irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
> shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
> shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
>
> on a gpu-bound test. It does not look like collecting samples on
> GPU_IRQ_POWER_CHANGED_ALL gives too much info.
On what GPU and SoC is that? If it's MT8196 then I wouldn't be
surprised if it just broke that hardware register, considering
what it did to the SHADER_PRESENT register.
On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
there is new information available in those registers. I haven't
tried on MT8196 (v13) yet because that still doesn't boot with
mainline so testing anything is a pain.
I don't have any v12 or v11 hardware to test with. From what I
understand, there's no open enough platform to do v11 testing on,
just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
though some day, but I don't own one at the moment.
>
> I think they are more useful to be collected periodically, such that
> we know that in the past X seconds, Y out of a total of Z samples
> indicates activities. That's best done in userspace, and panthor's
> role should be to provide an uapi such as
> https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
This wouldn't give you information on the time a power transition has
completed, which is one of the motivations. A periodically collected
PWRACTIVE would just be roughly correlated to how busy the GPU is,
which isn't very useful additional information as the performance
counters themselves are likely a better source of that kind of info.
What I need to do is restrict this to <= v13 in the next revision
however, because v14 reworks this stuff.
Kind regards,
Nicolas Frattaroli
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-05 10:47 ` Nicolas Frattaroli
@ 2025-12-05 21:16 ` Chia-I Wu
2025-12-08 7:48 ` Nicolas Frattaroli
0 siblings, 1 reply; 15+ messages in thread
From: Chia-I Wu @ 2025-12-05 21:16 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On Fri, Dec 5, 2025 at 2:48 AM Nicolas Frattaroli
<nicolas.frattaroli@collabora.com> wrote:
>
> On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
> > On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
> > <nicolas.frattaroli@collabora.com> wrote:
> > >
> > > Mali GPUs have three registers that indicate which parts of the hardware
> > > are powered and active at any moment. These take the form of bitmaps. In
> > > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > > shader core corresponding to that bit index is active. These bitmaps
> > > aren't solely contiguous bits, as it's common to have holes in the
> > > sequence of shader core indices, and the actual set of which cores are
> > > present is defined by the "shader present" register.
> > >
> > > When the GPU finishes a power state transition, it fires a
> > > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > > received, the PWRACTIVE registers will likely contain interesting new
> > > information.
> > I am seeing
> >
> > irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
> > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
> > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> >
> > on a gpu-bound test. It does not look like collecting samples on
> > GPU_IRQ_POWER_CHANGED_ALL gives too much info.
>
> On what GPU and SoC is that? If it's MT8196 then I wouldn't be
> surprised if it just broke that hardware register, considering
> what it did to the SHADER_PRESENT register.
Indeed I was on mt8196.
>
> On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
> there is new information available in those registers. I haven't
> tried on MT8196 (v13) yet because that still doesn't boot with
> mainline so testing anything is a pain.
>
> I don't have any v12 or v11 hardware to test with. From what I
> understand, there's no open enough platform to do v11 testing on,
> just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
> though some day, but I don't own one at the moment.
>
> >
> > I think they are more useful to be collected periodically, such that
> > we know that in the past X seconds, Y out of a total of Z samples
> > indicates activities. That's best done in userspace, and panthor's
> > role should be to provide an uapi such as
> > https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
>
> This wouldn't give you information on the time a power transition has
> completed, which is one of the motivations. A periodically collected
> PWRACTIVE would just be roughly correlated to how busy the GPU is,
> which isn't very useful additional information as the performance
> counters themselves are likely a better source of that kind of info.
{SHADER,TILER,L2}_READY might be more appropriate if you want to trace
power transitions?
>
> What I need to do is restrict this to <= v13 in the next revision
> however, because v14 reworks this stuff.
>
> Kind regards,
> Nicolas Frattaroli
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-05 21:16 ` Chia-I Wu
@ 2025-12-08 7:48 ` Nicolas Frattaroli
2025-12-08 18:28 ` Chia-I Wu
0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-08 7:48 UTC (permalink / raw)
To: Chia-I Wu
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On Friday, 5 December 2025 22:16:44 Central European Standard Time Chia-I Wu wrote:
> On Fri, Dec 5, 2025 at 2:48 AM Nicolas Frattaroli
> <nicolas.frattaroli@collabora.com> wrote:
> >
> > On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
> > > On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
> > > <nicolas.frattaroli@collabora.com> wrote:
> > > >
> > > > Mali GPUs have three registers that indicate which parts of the hardware
> > > > are powered and active at any moment. These take the form of bitmaps. In
> > > > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > > > shader core corresponding to that bit index is active. These bitmaps
> > > > aren't solely contiguous bits, as it's common to have holes in the
> > > > sequence of shader core indices, and the actual set of which cores are
> > > > present is defined by the "shader present" register.
> > > >
> > > > When the GPU finishes a power state transition, it fires a
> > > > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > > > received, the PWRACTIVE registers will likely contain interesting new
> > > > information.
> > > I am seeing
> > >
> > > irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
> > > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > > irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
> > > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > >
> > > on a gpu-bound test. It does not look like collecting samples on
> > > GPU_IRQ_POWER_CHANGED_ALL gives too much info.
> >
> > On what GPU and SoC is that? If it's MT8196 then I wouldn't be
> > surprised if it just broke that hardware register, considering
> > what it did to the SHADER_PRESENT register.
> Indeed I was on mt8196.
I don't have much faith in the Mali integration of that SoC being
representative of how the Mali hardware is supposed to work. The
SHADER_PRESENT thing is just the tip of the iceberg, I've also
noticed while developing mtk-mfg-pmdomain that it seemingly messes
with the Mali GPU's internal MCU from the GPUEB depending on the
commands you send it, and can get it into a broken state with
enough luck.
Check if the registers ever read anything but 0, e.g. by dumping
them from sysfs like this:
---
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index d1d4c50da5bf..b0e67dc17c92 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1678,8 +1678,69 @@ static ssize_t profiling_store(struct device *dev,
static DEVICE_ATTR_RW(profiling);
+static ssize_t print_active_bitmask(char *buf, ssize_t len, u64 present, u64 active)
+{
+ unsigned int i = 0;
+ u64 bit;
+
+ while (present) {
+ bit = BIT(i);
+ if (present & bit) {
+ present &= ~bit;
+ len += sysfs_emit_at(buf, len, "%s", (active & bit) ? "1" : "0");
+ } else {
+ len += sysfs_emit_at(buf, len, "_");
+ }
+ i++;
+ }
+
+ return len;
+}
+
+static ssize_t power_active_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct panthor_device *ptdev = dev_get_drvdata(dev);
+ ssize_t len = 0;
+ u64 present;
+ int ret;
+
+ if (pm_runtime_suspended(ptdev->base.dev))
+ return sysfs_emit(buf, "Shader:\t0\nTiler:\t0\nL2:\t0\n");
+
+ ret = pm_runtime_resume_and_get(ptdev->base.dev);
+ if (ret)
+ return ret;
+
+ len += sysfs_emit_at(buf, len, "Shader:\t");
+ len += print_active_bitmask(buf, len, gpu_read64(ptdev, GPU_SHADER_PRESENT),
+ gpu_read64(ptdev, SHADER_PWRACTIVE));
+ len += sysfs_emit_at(buf, len, "\n");
+
+ present = gpu_read64(ptdev, GPU_TILER_PRESENT);
+ if (present == 0x1) /* "Implementation defined", just try to dump all */
+ present = U64_MAX;
+ len += sysfs_emit_at(buf, len, "Tiler:\t");
+ len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, TILER_PWRACTIVE));
+ len += sysfs_emit_at(buf, len, "\n");
+
+ present = gpu_read64(ptdev, GPU_L2_PRESENT);
+ if (present == 0x1) /* "Implementation defined", just try to dump all */
+ present = U64_MAX;
+ len += sysfs_emit_at(buf, len, "L2:\t");
+ len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, L2_PWRACTIVE));
+ len += sysfs_emit_at(buf, len, "\n");
+
+ pm_runtime_put(ptdev->base.dev);
+
+ return len;
+}
+
+static DEVICE_ATTR_RO(power_active);
+
static struct attribute *panthor_attrs[] = {
&dev_attr_profiling.attr,
+ &dev_attr_power_active.attr,
NULL,
};
---
If they always read 0 regardless of whether you're running a GPU
workload or not, then it's just not properly wired up.
> >
> > On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
> > there is new information available in those registers. I haven't
> > tried on MT8196 (v13) yet because that still doesn't boot with
> > mainline so testing anything is a pain.
> >
> > I don't have any v12 or v11 hardware to test with. From what I
> > understand, there's no open enough platform to do v11 testing on,
> > just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
> > though some day, but I don't own one at the moment.
> >
> > >
> > > I think they are more useful to be collected periodically, such that
> > > we know that in the past X seconds, Y out of a total of Z samples
> > > indicates activities. That's best done in userspace, and panthor's
> > > role should be to provide an uapi such as
> > > https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
> >
> > This wouldn't give you information on the time a power transition has
> > completed, which is one of the motivations. A periodically collected
> > PWRACTIVE would just be roughly correlated to how busy the GPU is,
> > which isn't very useful additional information as the performance
> > counters themselves are likely a better source of that kind of info.
> {SHADER,TILER,L2}_READY might be more appropriate if you want to trace
> power transitions?
Depends, the documentation I have access to isn't explicit about
what "READY" means. Is a busy core non-ready? Is there ever a case
where a significant number of cores are READY but not PWRACTIVE?
I can answer the first question with some more poking on RK3588,
but for the latter a simple experiment on one piece of hardware
isn't going to answer it. Plus, the core being active will probably
be more interesting than it either sitting idle but powered or
actually doing work.
>
> >
> > What I need to do is restrict this to <= v13 in the next revision
> > however, because v14 reworks this stuff.
> >
> > Kind regards,
> > Nicolas Frattaroli
> >
> >
>
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2025-12-04 20:21 ` Chia-I Wu
@ 2025-12-08 17:14 ` Karunika Choo
2025-12-09 13:01 ` Nicolas Frattaroli
2025-12-08 17:21 ` Karunika Choo
2 siblings, 1 reply; 15+ messages in thread
From: Karunika Choo @ 2025-12-08 17:14 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: kernel, linux-kernel, dri-devel
On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> Mali GPUs have three registers that indicate which parts of the hardware
> are powered and active at any moment. These take the form of bitmaps. In
> the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> shader core corresponding to that bit index is active. These bitmaps
> aren't solely contiguous bits, as it's common to have holes in the
> sequence of shader core indices, and the actual set of which cores are
> present is defined by the "shader present" register.
>
> When the GPU finishes a power state transition, it fires a
> GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> received, the PWRACTIVE registers will likely contain interesting new
> information.
>
> This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> which is something related to Mali v14+'s power control logic. The
> PWRACTIVE registers and corresponding interrupts are already available
> in v9 and onwards.
>
> Expose this as a tracepoint to userspace. This allows users to debug
> various scenarios and gather interesting information, such as: knowing
> how much hardware is lit up at any given time, correlating graphics
> corruption with a specific active shader core, measuring when hardware
> is allowed to go to an inactive state again, and so on.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 1 +
> drivers/gpu/drm/panthor/panthor_gpu.c | 9 ++++++++
> drivers/gpu/drm/panthor/panthor_trace.h | 38 ++++++++++++++++++++++++++++++++
> 3 files changed, 48 insertions(+)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index e133b1e0ad6d..a3cb934104b8 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
> DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
> atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
> mutex_unlock(&ptdev->pm.mmio_lock);
> +
> return 0;
>
> err_suspend_devfreq:
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 9cb5dee93212..8830aa9a5c4b 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -22,6 +22,9 @@
> #include "panthor_hw.h"
> #include "panthor_regs.h"
>
> +#define CREATE_TRACE_POINTS
> +#include "panthor_trace.h"
> +
> /**
> * struct panthor_gpu - GPU block management data.
> */
> @@ -46,6 +49,7 @@ struct panthor_gpu {
> (GPU_IRQ_FAULT | \
> GPU_IRQ_PROTM_FAULT | \
> GPU_IRQ_RESET_COMPLETED | \
> + GPU_IRQ_POWER_CHANGED_ALL | \
> GPU_IRQ_CLEAN_CACHES_COMPLETED)
>
> static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> @@ -97,6 +101,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
> wake_up_all(&ptdev->gpu->reqs_acked);
> }
> spin_unlock(&ptdev->gpu->reqs_lock);
> +
> + if (status & GPU_IRQ_POWER_CHANGED_ALL)
> + trace_gpu_power_active(gpu_read64(ptdev, SHADER_PWRACTIVE),
> + gpu_read64(ptdev, TILER_PWRACTIVE),
> + gpu_read64(ptdev, L2_PWRACTIVE));
> }
> PANTHOR_IRQ_HANDLER(gpu, GPU, panthor_gpu_irq_handler);
>
> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> new file mode 100644
> index 000000000000..01013f81e68a
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> +/* Copyright 2025 Collabora ltd. */
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM panthor
> +
> +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> +#define __PANTHOR_TRACE_H__
> +
> +#include <linux/tracepoint.h>
> +#include <linux/types.h>
> +
> +TRACE_EVENT(gpu_power_active,
> + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
nit: if you want to add tracing can we also add the device name as
well? Something like:
TP_PROTO(struct device *dev, ...),
TP_ARGS(dev, ...),
TP_STRUCT__ENTRY(
__string(dev_name, dev_name(dev))
...
).
...
This will help differentiate the device it is originating from in
a multi GPU situation.
Kind regards,
Karunika
> + TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
> + TP_STRUCT__entry(
> + __field(u64, shader_bitmap)
> + __field(u64, tiler_bitmap)
> + __field(u64, l2_bitmap)
> + ),
> + TP_fast_assign(
> + __entry->shader_bitmap = shader_bitmap;
> + __entry->tiler_bitmap = tiler_bitmap;
> + __entry->l2_bitmap = l2_bitmap;
> + ),
> + TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
> + __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
> + )
> +);
> +
> +#endif /* __PANTHOR_TRACE_H__ */
> +
> +#undef TRACE_INCLUDE_PATH
> +#define TRACE_INCLUDE_PATH .
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE panthor_trace
> +
> +#include <trace/define_trace.h>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2025-12-04 20:21 ` Chia-I Wu
2025-12-08 17:14 ` Karunika Choo
@ 2025-12-08 17:21 ` Karunika Choo
2025-12-09 12:55 ` Nicolas Frattaroli
2 siblings, 1 reply; 15+ messages in thread
From: Karunika Choo @ 2025-12-08 17:21 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: kernel, linux-kernel, dri-devel
On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> Mali GPUs have three registers that indicate which parts of the hardware
> are powered and active at any moment. These take the form of bitmaps. In
> the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> shader core corresponding to that bit index is active. These bitmaps
> aren't solely contiguous bits, as it's common to have holes in the
> sequence of shader core indices, and the actual set of which cores are
> present is defined by the "shader present" register.
>
> When the GPU finishes a power state transition, it fires a
> GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> received, the PWRACTIVE registers will likely contain interesting new
> information.
>
> This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> which is something related to Mali v14+'s power control logic. The
> PWRACTIVE registers and corresponding interrupts are already available
> in v9 and onwards.
>
> Expose this as a tracepoint to userspace. This allows users to debug
> various scenarios and gather interesting information, such as: knowing
> how much hardware is lit up at any given time, correlating graphics
> corruption with a specific active shader core, measuring when hardware
> is allowed to go to an inactive state again, and so on.
>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 1 +
> drivers/gpu/drm/panthor/panthor_gpu.c | 9 ++++++++
> drivers/gpu/drm/panthor/panthor_trace.h | 38 ++++++++++++++++++++++++++++++++
> 3 files changed, 48 insertions(+)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index e133b1e0ad6d..a3cb934104b8 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
> DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
> atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
> mutex_unlock(&ptdev->pm.mmio_lock);
> +
> return 0;
>
> err_suspend_devfreq:
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 9cb5dee93212..8830aa9a5c4b 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -22,6 +22,9 @@
> #include "panthor_hw.h"
> #include "panthor_regs.h"
>
> +#define CREATE_TRACE_POINTS
> +#include "panthor_trace.h"
> +
> /**
> * struct panthor_gpu - GPU block management data.
> */
> @@ -46,6 +49,7 @@ struct panthor_gpu {
> (GPU_IRQ_FAULT | \
> GPU_IRQ_PROTM_FAULT | \
> GPU_IRQ_RESET_COMPLETED | \
> + GPU_IRQ_POWER_CHANGED_ALL | \
Also, we've seen customers complain about too many IRQs originating
from this event, is there any chance we can enable this conditionally
i.e. only when the trace point is enabled?
Kind regards,
Karunika
> GPU_IRQ_CLEAN_CACHES_COMPLETED)
>
> static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
> @@ -97,6 +101,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
> wake_up_all(&ptdev->gpu->reqs_acked);
> }
> spin_unlock(&ptdev->gpu->reqs_lock);
> +
> + if (status & GPU_IRQ_POWER_CHANGED_ALL)
> + trace_gpu_power_active(gpu_read64(ptdev, SHADER_PWRACTIVE),
> + gpu_read64(ptdev, TILER_PWRACTIVE),
> + gpu_read64(ptdev, L2_PWRACTIVE));
> }
> PANTHOR_IRQ_HANDLER(gpu, GPU, panthor_gpu_irq_handler);
>
> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> new file mode 100644
> index 000000000000..01013f81e68a
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> +/* Copyright 2025 Collabora ltd. */
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM panthor
> +
> +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> +#define __PANTHOR_TRACE_H__
> +
> +#include <linux/tracepoint.h>
> +#include <linux/types.h>
> +
> +TRACE_EVENT(gpu_power_active,
> + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
> + TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
> + TP_STRUCT__entry(
> + __field(u64, shader_bitmap)
> + __field(u64, tiler_bitmap)
> + __field(u64, l2_bitmap)
> + ),
> + TP_fast_assign(
> + __entry->shader_bitmap = shader_bitmap;
> + __entry->tiler_bitmap = tiler_bitmap;
> + __entry->l2_bitmap = l2_bitmap;
> + ),
> + TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
> + __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
> + )
> +);
> +
> +#endif /* __PANTHOR_TRACE_H__ */
> +
> +#undef TRACE_INCLUDE_PATH
> +#define TRACE_INCLUDE_PATH .
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE panthor_trace
> +
> +#include <trace/define_trace.h>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-08 7:48 ` Nicolas Frattaroli
@ 2025-12-08 18:28 ` Chia-I Wu
2025-12-09 10:32 ` Karunika Choo
0 siblings, 1 reply; 15+ messages in thread
From: Chia-I Wu @ 2025-12-08 18:28 UTC (permalink / raw)
To: Nicolas Frattaroli
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On Sun, Dec 7, 2025 at 11:49 PM Nicolas Frattaroli
<nicolas.frattaroli@collabora.com> wrote:
>
> On Friday, 5 December 2025 22:16:44 Central European Standard Time Chia-I Wu wrote:
> > On Fri, Dec 5, 2025 at 2:48 AM Nicolas Frattaroli
> > <nicolas.frattaroli@collabora.com> wrote:
> > >
> > > On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
> > > > On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
> > > > <nicolas.frattaroli@collabora.com> wrote:
> > > > >
> > > > > Mali GPUs have three registers that indicate which parts of the hardware
> > > > > are powered and active at any moment. These take the form of bitmaps. In
> > > > > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > > > > shader core corresponding to that bit index is active. These bitmaps
> > > > > aren't solely contiguous bits, as it's common to have holes in the
> > > > > sequence of shader core indices, and the actual set of which cores are
> > > > > present is defined by the "shader present" register.
> > > > >
> > > > > When the GPU finishes a power state transition, it fires a
> > > > > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > > > > received, the PWRACTIVE registers will likely contain interesting new
> > > > > information.
> > > > I am seeing
> > > >
> > > > irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
> > > > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > > > irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
> > > > shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > > >
> > > > on a gpu-bound test. It does not look like collecting samples on
> > > > GPU_IRQ_POWER_CHANGED_ALL gives too much info.
> > >
> > > On what GPU and SoC is that? If it's MT8196 then I wouldn't be
> > > surprised if it just broke that hardware register, considering
> > > what it did to the SHADER_PRESENT register.
> > Indeed I was on mt8196.
>
> I don't have much faith in the Mali integration of that SoC being
> representative of how the Mali hardware is supposed to work. The
> SHADER_PRESENT thing is just the tip of the iceberg, I've also
> noticed while developing mtk-mfg-pmdomain that it seemingly messes
> with the Mali GPU's internal MCU from the GPUEB depending on the
> commands you send it, and can get it into a broken state with
> enough luck.
>
> Check if the registers ever read anything but 0, e.g. by dumping
> them from sysfs like this:
>
> ---
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index d1d4c50da5bf..b0e67dc17c92 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1678,8 +1678,69 @@ static ssize_t profiling_store(struct device *dev,
>
> static DEVICE_ATTR_RW(profiling);
>
> +static ssize_t print_active_bitmask(char *buf, ssize_t len, u64 present, u64 active)
> +{
> + unsigned int i = 0;
> + u64 bit;
> +
> + while (present) {
> + bit = BIT(i);
> + if (present & bit) {
> + present &= ~bit;
> + len += sysfs_emit_at(buf, len, "%s", (active & bit) ? "1" : "0");
> + } else {
> + len += sysfs_emit_at(buf, len, "_");
> + }
> + i++;
> + }
> +
> + return len;
> +}
> +
> +static ssize_t power_active_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct panthor_device *ptdev = dev_get_drvdata(dev);
> + ssize_t len = 0;
> + u64 present;
> + int ret;
> +
> + if (pm_runtime_suspended(ptdev->base.dev))
> + return sysfs_emit(buf, "Shader:\t0\nTiler:\t0\nL2:\t0\n");
> +
> + ret = pm_runtime_resume_and_get(ptdev->base.dev);
> + if (ret)
> + return ret;
> +
> + len += sysfs_emit_at(buf, len, "Shader:\t");
> + len += print_active_bitmask(buf, len, gpu_read64(ptdev, GPU_SHADER_PRESENT),
> + gpu_read64(ptdev, SHADER_PWRACTIVE));
> + len += sysfs_emit_at(buf, len, "\n");
> +
> + present = gpu_read64(ptdev, GPU_TILER_PRESENT);
> + if (present == 0x1) /* "Implementation defined", just try to dump all */
> + present = U64_MAX;
> + len += sysfs_emit_at(buf, len, "Tiler:\t");
> + len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, TILER_PWRACTIVE));
> + len += sysfs_emit_at(buf, len, "\n");
> +
> + present = gpu_read64(ptdev, GPU_L2_PRESENT);
> + if (present == 0x1) /* "Implementation defined", just try to dump all */
> + present = U64_MAX;
> + len += sysfs_emit_at(buf, len, "L2:\t");
> + len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, L2_PWRACTIVE));
> + len += sysfs_emit_at(buf, len, "\n");
> +
> + pm_runtime_put(ptdev->base.dev);
> +
> + return len;
> +}
> +
> +static DEVICE_ATTR_RO(power_active);
> +
> static struct attribute *panthor_attrs[] = {
> &dev_attr_profiling.attr,
> + &dev_attr_power_active.attr,
> NULL,
> };
> ---
>
> If they always read 0 regardless of whether you're running a GPU
> workload or not, then it's just not properly wired up.
They can be non-zero.
>
> > >
> > > On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
> > > there is new information available in those registers. I haven't
> > > tried on MT8196 (v13) yet because that still doesn't boot with
> > > mainline so testing anything is a pain.
> > >
> > > I don't have any v12 or v11 hardware to test with. From what I
> > > understand, there's no open enough platform to do v11 testing on,
> > > just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
> > > though some day, but I don't own one at the moment.
> > >
> > > >
> > > > I think they are more useful to be collected periodically, such that
> > > > we know that in the past X seconds, Y out of a total of Z samples
> > > > indicates activities. That's best done in userspace, and panthor's
> > > > role should be to provide an uapi such as
> > > > https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
> > >
> > > This wouldn't give you information on the time a power transition has
> > > completed, which is one of the motivations. A periodically collected
> > > PWRACTIVE would just be roughly correlated to how busy the GPU is,
> > > which isn't very useful additional information as the performance
> > > counters themselves are likely a better source of that kind of info.
> > {SHADER,TILER,L2}_READY might be more appropriate if you want to trace
> > power transitions?
>
> Depends, the documentation I have access to isn't explicit about
> what "READY" means. Is a busy core non-ready? Is there ever a case
> where a significant number of cores are READY but not PWRACTIVE?
>
> I can answer the first question with some more poking on RK3588,
> but for the latter a simple experiment on one piece of hardware
> isn't going to answer it. Plus, the core being active will probably
> be more interesting than it either sitting idle but powered or
> actually doing work.
From what I can see, *_READY are non-zero when powered and *_PWRACTIVE
are non-zero when powered and busy on mt8196.
If you want to generate a trace event upon GPU_IRQ_POWER_CHANGED_ALL,
*_READY seems more appropriate at least on mt8196. If you want to
track busyness with *_PWRACTIVE, you probably need to sample
periodically.
>
> >
> > >
> > > What I need to do is restrict this to <= v13 in the next revision
> > > however, because v14 reworks this stuff.
> > >
> > > Kind regards,
> > > Nicolas Frattaroli
> > >
> > >
> >
>
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-08 18:28 ` Chia-I Wu
@ 2025-12-09 10:32 ` Karunika Choo
0 siblings, 0 replies; 15+ messages in thread
From: Karunika Choo @ 2025-12-09 10:32 UTC (permalink / raw)
To: Chia-I Wu, Nicolas Frattaroli
Cc: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
kernel, linux-kernel, dri-devel
On 08/12/2025 18:28, Chia-I Wu wrote:
> On Sun, Dec 7, 2025 at 11:49 PM Nicolas Frattaroli
> <nicolas.frattaroli@collabora.com> wrote:
>>
>> On Friday, 5 December 2025 22:16:44 Central European Standard Time Chia-I Wu wrote:
>>> On Fri, Dec 5, 2025 at 2:48 AM Nicolas Frattaroli
>>> <nicolas.frattaroli@collabora.com> wrote:
>>>>
>>>> On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
>>>>> On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
>>>>> <nicolas.frattaroli@collabora.com> wrote:
>>>>>>
>>>>>> Mali GPUs have three registers that indicate which parts of the hardware
>>>>>> are powered and active at any moment. These take the form of bitmaps. In
>>>>>> the case of SHADER_PWRACTIVE for example, a high bit indicates that the
>>>>>> shader core corresponding to that bit index is active. These bitmaps
>>>>>> aren't solely contiguous bits, as it's common to have holes in the
>>>>>> sequence of shader core indices, and the actual set of which cores are
>>>>>> present is defined by the "shader present" register.
>>>>>>
>>>>>> When the GPU finishes a power state transition, it fires a
>>>>>> GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
>>>>>> received, the PWRACTIVE registers will likely contain interesting new
>>>>>> information.
>>>>> I am seeing
>>>>>
>>>>> irq/342-panthor-412 [000] ..... 934.526754: gpu_power_active:
>>>>> shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
>>>>> irq/342-panthor-412 [000] ..... 936.640356: gpu_power_active:
>>>>> shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
>>>>>
>>>>> on a gpu-bound test. It does not look like collecting samples on
>>>>> GPU_IRQ_POWER_CHANGED_ALL gives too much info.
>>>>
>>>> On what GPU and SoC is that? If it's MT8196 then I wouldn't be
>>>> surprised if it just broke that hardware register, considering
>>>> what it did to the SHADER_PRESENT register.
>>> Indeed I was on mt8196.
>>
>> I don't have much faith in the Mali integration of that SoC being
>> representative of how the Mali hardware is supposed to work. The
>> SHADER_PRESENT thing is just the tip of the iceberg, I've also
>> noticed while developing mtk-mfg-pmdomain that it seemingly messes
>> with the Mali GPU's internal MCU from the GPUEB depending on the
>> commands you send it, and can get it into a broken state with
>> enough luck.
>>
>> Check if the registers ever read anything but 0, e.g. by dumping
>> them from sysfs like this:
>>
>> ---
>> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
>> index d1d4c50da5bf..b0e67dc17c92 100644
>> --- a/drivers/gpu/drm/panthor/panthor_drv.c
>> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
>> @@ -1678,8 +1678,69 @@ static ssize_t profiling_store(struct device *dev,
>>
>> static DEVICE_ATTR_RW(profiling);
>>
>> +static ssize_t print_active_bitmask(char *buf, ssize_t len, u64 present, u64 active)
>> +{
>> + unsigned int i = 0;
>> + u64 bit;
>> +
>> + while (present) {
>> + bit = BIT(i);
>> + if (present & bit) {
>> + present &= ~bit;
>> + len += sysfs_emit_at(buf, len, "%s", (active & bit) ? "1" : "0");
>> + } else {
>> + len += sysfs_emit_at(buf, len, "_");
>> + }
>> + i++;
>> + }
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t power_active_show(struct device *dev, struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct panthor_device *ptdev = dev_get_drvdata(dev);
>> + ssize_t len = 0;
>> + u64 present;
>> + int ret;
>> +
>> + if (pm_runtime_suspended(ptdev->base.dev))
>> + return sysfs_emit(buf, "Shader:\t0\nTiler:\t0\nL2:\t0\n");
>> +
>> + ret = pm_runtime_resume_and_get(ptdev->base.dev);
>> + if (ret)
>> + return ret;
>> +
>> + len += sysfs_emit_at(buf, len, "Shader:\t");
>> + len += print_active_bitmask(buf, len, gpu_read64(ptdev, GPU_SHADER_PRESENT),
>> + gpu_read64(ptdev, SHADER_PWRACTIVE));
>> + len += sysfs_emit_at(buf, len, "\n");
>> +
>> + present = gpu_read64(ptdev, GPU_TILER_PRESENT);
>> + if (present == 0x1) /* "Implementation defined", just try to dump all */
>> + present = U64_MAX;
>> + len += sysfs_emit_at(buf, len, "Tiler:\t");
>> + len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, TILER_PWRACTIVE));
>> + len += sysfs_emit_at(buf, len, "\n");
>> +
>> + present = gpu_read64(ptdev, GPU_L2_PRESENT);
>> + if (present == 0x1) /* "Implementation defined", just try to dump all */
>> + present = U64_MAX;
>> + len += sysfs_emit_at(buf, len, "L2:\t");
>> + len += print_active_bitmask(buf, len, present, gpu_read64(ptdev, L2_PWRACTIVE));
>> + len += sysfs_emit_at(buf, len, "\n");
>> +
>> + pm_runtime_put(ptdev->base.dev);
>> +
>> + return len;
>> +}
>> +
>> +static DEVICE_ATTR_RO(power_active);
>> +
>> static struct attribute *panthor_attrs[] = {
>> &dev_attr_profiling.attr,
>> + &dev_attr_power_active.attr,
>> NULL,
>> };
>> ---
>>
>> If they always read 0 regardless of whether you're running a GPU
>> workload or not, then it's just not properly wired up.
> They can be non-zero.
>>
>>>>
>>>> On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
>>>> there is new information available in those registers. I haven't
>>>> tried on MT8196 (v13) yet because that still doesn't boot with
>>>> mainline so testing anything is a pain.
>>>>
>>>> I don't have any v12 or v11 hardware to test with. From what I
>>>> understand, there's no open enough platform to do v11 testing on,
>>>> just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
>>>> though some day, but I don't own one at the moment.
>>>>
>>>>>
>>>>> I think they are more useful to be collected periodically, such that
>>>>> we know that in the past X seconds, Y out of a total of Z samples
>>>>> indicates activities. That's best done in userspace, and panthor's
>>>>> role should be to provide an uapi such as
>>>>> https://lore.kernel.org/all/cover.1743517880.git.lukas.zapolskas@arm.com/.
>>>>
>>>> This wouldn't give you information on the time a power transition has
>>>> completed, which is one of the motivations. A periodically collected
>>>> PWRACTIVE would just be roughly correlated to how busy the GPU is,
>>>> which isn't very useful additional information as the performance
>>>> counters themselves are likely a better source of that kind of info.
>>> {SHADER,TILER,L2}_READY might be more appropriate if you want to trace
>>> power transitions?
>>
>> Depends, the documentation I have access to isn't explicit about
>> what "READY" means. Is a busy core non-ready? Is there ever a case
>> where a significant number of cores are READY but not PWRACTIVE?
>>
>> I can answer the first question with some more poking on RK3588,
>> but for the latter a simple experiment on one piece of hardware
>> isn't going to answer it. Plus, the core being active will probably
>> be more interesting than it either sitting idle but powered or
>> actually doing work.
> From what I can see, *_READY are non-zero when powered and *_PWRACTIVE
> are non-zero when powered and busy on mt8196.
>
> If you want to generate a trace event upon GPU_IRQ_POWER_CHANGED_ALL,
> *_READY seems more appropriate at least on mt8196. If you want to
> track busyness with *_PWRACTIVE, you probably need to sample
> periodically.
Hello,
Just chiming in from the architecture perspective, *_PWRACTIVE indicates
which cores are currently active and processing data, while *_READY show
which cores are powered up.
So in essence, *_READY might be more suitable in this case as
*_PWRACTIVE can be zero if there is no work running.
Kind regards,
Karunika
>
>>
>>>
>>>>
>>>> What I need to do is restrict this to <= v13 in the next revision
>>>> however, because v14 reworks this stuff.
>>>>
>>>> Kind regards,
>>>> Nicolas Frattaroli
>>>>
>>>>
>>>
>>
>>
>>
>>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-08 17:21 ` Karunika Choo
@ 2025-12-09 12:55 ` Nicolas Frattaroli
0 siblings, 0 replies; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-09 12:55 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Karunika Choo
Cc: kernel, linux-kernel, dri-devel
On Monday, 8 December 2025 18:21:06 Central European Standard Time Karunika Choo wrote:
> On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> > Mali GPUs have three registers that indicate which parts of the hardware
> > are powered and active at any moment. These take the form of bitmaps. In
> > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > shader core corresponding to that bit index is active. These bitmaps
> > aren't solely contiguous bits, as it's common to have holes in the
> > sequence of shader core indices, and the actual set of which cores are
> > present is defined by the "shader present" register.
> >
> > When the GPU finishes a power state transition, it fires a
> > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > received, the PWRACTIVE registers will likely contain interesting new
> > information.
> >
> > This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> > which is something related to Mali v14+'s power control logic. The
> > PWRACTIVE registers and corresponding interrupts are already available
> > in v9 and onwards.
> >
> > Expose this as a tracepoint to userspace. This allows users to debug
> > various scenarios and gather interesting information, such as: knowing
> > how much hardware is lit up at any given time, correlating graphics
> > corruption with a specific active shader core, measuring when hardware
> > is allowed to go to an inactive state again, and so on.
> >
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> > drivers/gpu/drm/panthor/panthor_device.c | 1 +
> > drivers/gpu/drm/panthor/panthor_gpu.c | 9 ++++++++
> > drivers/gpu/drm/panthor/panthor_trace.h | 38 ++++++++++++++++++++++++++++++++
> > 3 files changed, 48 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> > index e133b1e0ad6d..a3cb934104b8 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.c
> > +++ b/drivers/gpu/drm/panthor/panthor_device.c
> > @@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
> > DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
> > atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
> > mutex_unlock(&ptdev->pm.mmio_lock);
> > +
> > return 0;
> >
> > err_suspend_devfreq:
> > diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> > index 9cb5dee93212..8830aa9a5c4b 100644
> > --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> > @@ -22,6 +22,9 @@
> > #include "panthor_hw.h"
> > #include "panthor_regs.h"
> >
> > +#define CREATE_TRACE_POINTS
> > +#include "panthor_trace.h"
> > +
> > /**
> > * struct panthor_gpu - GPU block management data.
> > */
> > @@ -46,6 +49,7 @@ struct panthor_gpu {
> > (GPU_IRQ_FAULT | \
> > GPU_IRQ_PROTM_FAULT | \
> > GPU_IRQ_RESET_COMPLETED | \
> > + GPU_IRQ_POWER_CHANGED_ALL | \
>
> Also, we've seen customers complain about too many IRQs originating
> from this event, is there any chance we can enable this conditionally
> i.e. only when the trace point is enabled?
Yeah, that's something I've been trying to look into. I'll need to
do some more digging to see if there's a way to run a callback when
an event tracepoint is enabled. That'd be the ideal way to do this,
because then we can just modify the interrupt mask in the callback.
For what it's worth, it doesn't fire very often for me, magnitudes
less often than the job interrupt fires at least. But I assume this
is highly implementation dependent, e.g. on bigger designs that have
more complex power setups and more reasons to enable only part of the
hardware, it'll fire way more often.
Kind regards,
Nicolas Frattaroli
>
> Kind regards,
> Karunika
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-08 17:14 ` Karunika Choo
@ 2025-12-09 13:01 ` Nicolas Frattaroli
2025-12-09 16:22 ` Karunika Choo
0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Frattaroli @ 2025-12-09 13:01 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Karunika Choo
Cc: kernel, linux-kernel, dri-devel
On Monday, 8 December 2025 18:14:53 Central European Standard Time Karunika Choo wrote:
> On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> > [... snip ...]
> > diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> > new file mode 100644
> > index 000000000000..01013f81e68a
> > --- /dev/null
> > +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> > +/* Copyright 2025 Collabora ltd. */
> > +
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM panthor
> > +
> > +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> > +#define __PANTHOR_TRACE_H__
> > +
> > +#include <linux/tracepoint.h>
> > +#include <linux/types.h>
> > +
> > +TRACE_EVENT(gpu_power_active,
> > + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
>
> nit: if you want to add tracing can we also add the device name as
> well? Something like:
>
> TP_PROTO(struct device *dev, ...),
> TP_ARGS(dev, ...),
> TP_STRUCT__ENTRY(
> __string(dev_name, dev_name(dev))
> ...
> ).
> ...
This is a great idea, will do. Any specific reason to pass the
device in the tracepoint rather than a const char*?
>
> This will help differentiate the device it is originating from in
> a multi GPU situation.
I'll try not to get too excited at the prospect of systems using
multiple Mali GPUs because I know the likeliest case this happens
on is Arm evaluation systems with a hard IP and a soft IP loaded to
the FPGA core. :)
Kind regards,
Nicolas Frattaroli
>
> Kind regards,
> Karunika
>
> > + TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
> > + TP_STRUCT__entry(
> > + __field(u64, shader_bitmap)
> > + __field(u64, tiler_bitmap)
> > + __field(u64, l2_bitmap)
> > + ),
> > + TP_fast_assign(
> > + __entry->shader_bitmap = shader_bitmap;
> > + __entry->tiler_bitmap = tiler_bitmap;
> > + __entry->l2_bitmap = l2_bitmap;
> > + ),
> > + TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
> > + __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
> > + )
> > +);
> > +
> > +#endif /* __PANTHOR_TRACE_H__ */
> > +
> > +#undef TRACE_INCLUDE_PATH
> > +#define TRACE_INCLUDE_PATH .
> > +#undef TRACE_INCLUDE_FILE
> > +#define TRACE_INCLUDE_FILE panthor_trace
> > +
> > +#include <trace/define_trace.h>
> >
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-09 13:01 ` Nicolas Frattaroli
@ 2025-12-09 16:22 ` Karunika Choo
2025-12-09 17:10 ` Marcin Ślusarz
0 siblings, 1 reply; 15+ messages in thread
From: Karunika Choo @ 2025-12-09 16:22 UTC (permalink / raw)
To: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: kernel, linux-kernel, dri-devel
On 09/12/2025 13:01, Nicolas Frattaroli wrote:
> On Monday, 8 December 2025 18:14:53 Central European Standard Time Karunika Choo wrote:
>> On 03/12/2025 13:56, Nicolas Frattaroli wrote:
>>> [... snip ...]
>>> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
>>> new file mode 100644
>>> index 000000000000..01013f81e68a
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
>>> @@ -0,0 +1,38 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
>>> +/* Copyright 2025 Collabora ltd. */
>>> +
>>> +#undef TRACE_SYSTEM
>>> +#define TRACE_SYSTEM panthor
>>> +
>>> +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
>>> +#define __PANTHOR_TRACE_H__
>>> +
>>> +#include <linux/tracepoint.h>
>>> +#include <linux/types.h>
>>> +
>>> +TRACE_EVENT(gpu_power_active,
>>> + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
>>
>> nit: if you want to add tracing can we also add the device name as
>> well? Something like:
>>
>> TP_PROTO(struct device *dev, ...),
>> TP_ARGS(dev, ...),
>> TP_STRUCT__ENTRY(
>> __string(dev_name, dev_name(dev))
>> ...
>> ).
>> ...
>
> This is a great idea, will do. Any specific reason to pass the
> device in the tracepoint rather than a const char*?
>
Nope, totaly fine to do it that way as well.
Kind regards,
Karunika
>>
>> This will help differentiate the device it is originating from in
>> a multi GPU situation.
>
> I'll try not to get too excited at the prospect of systems using
> multiple Mali GPUs because I know the likeliest case this happens
> on is Arm evaluation systems with a hard IP and a soft IP loaded to
> the FPGA core. :)
>
> Kind regards,
> Nicolas Frattaroli
>
>>
>> Kind regards,
>> Karunika
>>
>>> + TP_ARGS(shader_bitmap, tiler_bitmap, l2_bitmap),
>>> + TP_STRUCT__entry(
>>> + __field(u64, shader_bitmap)
>>> + __field(u64, tiler_bitmap)
>>> + __field(u64, l2_bitmap)
>>> + ),
>>> + TP_fast_assign(
>>> + __entry->shader_bitmap = shader_bitmap;
>>> + __entry->tiler_bitmap = tiler_bitmap;
>>> + __entry->l2_bitmap = l2_bitmap;
>>> + ),
>>> + TP_printk("shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
>>> + __entry->shader_bitmap, __entry->tiler_bitmap, __entry->l2_bitmap
>>> + )
>>> +);
>>> +
>>> +#endif /* __PANTHOR_TRACE_H__ */
>>> +
>>> +#undef TRACE_INCLUDE_PATH
>>> +#define TRACE_INCLUDE_PATH .
>>> +#undef TRACE_INCLUDE_FILE
>>> +#define TRACE_INCLUDE_FILE panthor_trace
>>> +
>>> +#include <trace/define_trace.h>
>>>
>>
>>
>>
>
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes
2025-12-09 16:22 ` Karunika Choo
@ 2025-12-09 17:10 ` Marcin Ślusarz
0 siblings, 0 replies; 15+ messages in thread
From: Marcin Ślusarz @ 2025-12-09 17:10 UTC (permalink / raw)
To: Karunika Choo
Cc: Nicolas Frattaroli, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, kernel, linux-kernel, dri-devel, nd
On Tue, Dec 09, 2025 at 04:22:15PM +0000, Karunika Choo wrote:
> On 09/12/2025 13:01, Nicolas Frattaroli wrote:
> > On Monday, 8 December 2025 18:14:53 Central European Standard Time Karunika Choo wrote:
> >> On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> >>> [... snip ...]
> >>> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> >>> new file mode 100644
> >>> index 000000000000..01013f81e68a
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> >>> @@ -0,0 +1,38 @@
> >>> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> >>> +/* Copyright 2025 Collabora ltd. */
> >>> +
> >>> +#undef TRACE_SYSTEM
> >>> +#define TRACE_SYSTEM panthor
> >>> +
> >>> +#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> >>> +#define __PANTHOR_TRACE_H__
> >>> +
> >>> +#include <linux/tracepoint.h>
> >>> +#include <linux/types.h>
> >>> +
> >>> +TRACE_EVENT(gpu_power_active,
> >>> + TP_PROTO(u64 shader_bitmap, u64 tiler_bitmap, u64 l2_bitmap),
> >>
> >> nit: if you want to add tracing can we also add the device name as
> >> well? Something like:
> >>
> >> TP_PROTO(struct device *dev, ...),
> >> TP_ARGS(dev, ...),
> >> TP_STRUCT__ENTRY(
> >> __string(dev_name, dev_name(dev))
> >> ...
> >> ).
> >> ...
> >
> > This is a great idea, will do. Any specific reason to pass the
> > device in the tracepoint rather than a const char*?
> >
>
> Nope, totaly fine to do it that way as well.
If you store any pointer into the trace buffer, then by the time
it will be dereferenced (TP_printk below), the object behind it may
be already freed. __string/__assign_str prevents this use-after-free
by embedding a full copy of the string in the trace buffer.
Cheers,
Marcin
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-12-09 17:11 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-03 13:56 [PATCH 0/2] Add a few tracepoints to panthor Nicolas Frattaroli
2025-12-03 13:56 ` [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2025-12-04 20:21 ` Chia-I Wu
2025-12-05 10:47 ` Nicolas Frattaroli
2025-12-05 21:16 ` Chia-I Wu
2025-12-08 7:48 ` Nicolas Frattaroli
2025-12-08 18:28 ` Chia-I Wu
2025-12-09 10:32 ` Karunika Choo
2025-12-08 17:14 ` Karunika Choo
2025-12-09 13:01 ` Nicolas Frattaroli
2025-12-09 16:22 ` Karunika Choo
2025-12-09 17:10 ` Marcin Ślusarz
2025-12-08 17:21 ` Karunika Choo
2025-12-09 12:55 ` Nicolas Frattaroli
2025-12-03 13:56 ` [PATCH 2/2] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox