public inbox for linux-kernel@vger.kernel.org
* [PATCH v7 0/4] Add a few tracepoints to panthor
@ 2026-01-08 14:19 Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-08 14:19 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli

This series adds two tracepoints to panthor.

The first tracepoint allows for inspecting the power status of the
hardware subdivisions, e.g. how many shader cores are powered on. This
is done by reading three hardware registers when a certain IRQ fires.

The second tracepoint instruments panthor's job IRQ handler. This is
more useful than the generic interrupt tracing functionality, as this
tracepoint includes the events bit mask, which indicates which command
stream group interfaces triggered the interrupt.

To test the tracepoints, the following can be used:

  :~# echo 1 > /sys/kernel/tracing/events/panthor/gpu_power_status/enable
  :~# echo 1 > /sys/kernel/tracing/events/panthor/gpu_job_irq/enable
  :~# echo 1 > /sys/kernel/tracing/tracing_on
  :~# cat /sys/kernel/tracing/trace_pipe

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
Changes in v7:
- Get rid of the old resume IRQ helper by reworking code throughout
  panthor, and make what used to be resume_restore in v6 the new resume.
- Rename mask_enable/mask_disable to enable_events/disable_events.
- Turn panthor_irq::suspended into a multi-state value, and utilise it
  in the IRQ helpers as appropriate.
- Link to v6: https://lore.kernel.org/r/20251223-panthor-tracepoints-v6-0-d3c998ee9efc@collabora.com

Changes in v6:
- Read the mask member into a local while holding the lock in
  irq_threaded_handler.
- Drop the lock before entering the while loop, letting the threaded
  handler function run without holding a spinlock.
- Re-acquire the spinlock at the end of irq_threaded_handler, OR'ing the
  mask register's contents with the mask local ANDed with the member.
  This avoids stomping over any other modified bits, or restoring ones
  that have been disabled in the meantime.
- Link to v5: https://lore.kernel.org/r/20251221-panthor-tracepoints-v5-0-889ef78165d8@collabora.com

Changes in v5:
- Change the panthor IRQ helpers to guard the mask member and register
  with a spinlock. The rationale behind using a spinlock, rather than
  some constellation of atomics, is that we have to guarantee mutual
  exclusion for state beyond just a single value, namely both the register
  write, and writes to/reads from the mask member, including
  reads-from-member-writes-to-register. Making the mask atomic does not do
  anything to avoid concurrency issues in such a case.
- Change the IRQ mask member to not get zeroed when suspended. It's
  possible something outside of the IRQ helpers depends on this
  behaviour, but I'd argue the code should not access the mask outside
  of the IRQ helpers, as it'll do so with no lock taken.
- Drop the mask_set function, but add mask_enable/mask_disable helpers
  to enable/disable individual parts of the IRQ mask.
- Add a resume_restore IRQ helper that does the same thing as resume,
  but does not overwrite the mask member. This avoids me having to
  refactor whatever panthor_mmu.c is doing with that poor mask member.
- Link to v4: https://lore.kernel.org/r/20251217-panthor-tracepoints-v4-0-916186cb8d03@collabora.com

Changes in v4:
- Include "panthor_hw.h" in panthor_trace.h instead of duplicating the
  reg/unreg function prototypes.
- Link to v3: https://lore.kernel.org/r/20251211-panthor-tracepoints-v3-0-924c9d356a5c@collabora.com

Changes in v3:
- Drop PWRFEATURES patch, as this register is no longer needed by this
  series.
- Eliminate the rt_on field from the gpu_power_status register, as per
  Steven Price's feedback.
- Make gpu_power_status tracepoint reg/unreg functions generic across
  hardware generations by wrapping a hw op in panthor_hw.c.
- Reimplement the <= v13 IRQ mask modification functions as the new hw
  ops functions. v14 can add its own ops in due time.
- Link to v2: https://lore.kernel.org/r/20251210-panthor-tracepoints-v2-0-ace2e29bad0f@collabora.com

Changes in v2:
- Only enable the GPU_IRQ_POWER_CHANGED_* IRQ mask bits when the
  tracepoint is enabled. Necessitates the new irq helper patch.
- Only enable the GPU_IRQ_POWER_CHANGED_* IRQ mask bits if the hardware
  architecture is <= v13, as v14 changes things.
- Use _READY instead of _PWRACTIVE registers, and rename the tracepoint
  accordingly.
- Also read the status of the ray tracing unit's power. This is a global
  flag for all shader cores, it seems. Necessitates the new register
  definition patch.
- Move the POWER_CHANGED_* check to earlier in the interrupt handler.
- Also listen to POWER_CHANGED, not just POWER_CHANGED_ALL, as this
  provides useful information with the _READY registers.
- Print the device name in both tracepoints, to disambiguate things on
  systems with multiple Mali GPUs.
- Document the gpu_power_status tracepoint, so the meaning of the fields
  is made clear.
- Link to v1: https://lore.kernel.org/r/20251203-panthor-tracepoints-v1-0-871c8917e084@collabora.com

---
Nicolas Frattaroli (4):
      drm/panthor: Extend IRQ helpers for mask modification/restoration
      drm/panthor: Rework panthor_irq::suspended into panthor_irq::state
      drm/panthor: Add tracepoint for hardware utilisation changes
      drm/panthor: Add gpu_job_irq tracepoint

 drivers/gpu/drm/panthor/panthor_device.h |  94 ++++++++++--
 drivers/gpu/drm/panthor/panthor_fw.c     |  16 +-
 drivers/gpu/drm/panthor/panthor_gpu.c    |  30 +++-
 drivers/gpu/drm/panthor/panthor_gpu.h    |   2 +
 drivers/gpu/drm/panthor/panthor_hw.c     |  62 ++++++++
 drivers/gpu/drm/panthor/panthor_hw.h     |   8 +
 drivers/gpu/drm/panthor/panthor_mmu.c    | 247 ++++++++++++++++---------------
 drivers/gpu/drm/panthor/panthor_pwr.c    |   2 +-
 drivers/gpu/drm/panthor/panthor_trace.h  |  86 +++++++++++
 9 files changed, 407 insertions(+), 140 deletions(-)
---
base-commit: d7d19ebd62e1a312e67f4484df9a4e2b407d93d0
change-id: 20251203-panthor-tracepoints-488af09d46e7

Best regards,
-- 
Nicolas Frattaroli <nicolas.frattaroli@collabora.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration
  2026-01-08 14:19 [PATCH v7 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
@ 2026-01-08 14:19 ` Nicolas Frattaroli
  2026-01-09 15:59   ` Steven Price
  2026-01-08 14:19 ` [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state Nicolas Frattaroli
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-08 14:19 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli

The current IRQ helpers do not guarantee mutual exclusion covering the
entire transaction of reading the mask member and modifying the mask
register.

This makes it hard, if not impossible, to implement mask modification
helpers that may change one of these outside the normal
suspend/resume/isr code paths.

Add a spinlock to struct panthor_irq that protects both the mask member
and register. Acquire it in all code paths that access these, but drop
it before processing the threaded handler function. Then, add the
aforementioned new helpers: enable_events and disable_events. They work
by ORing bits into the mask and ANDing them out with the complement,
respectively.

resume is changed to no longer have a mask passed, as pirq->mask is
supposed to be the user-requested mask now, rather than a mirror of the
INT_MASK register contents. Users of the resume helper are adjusted
accordingly, including a rather painful refactor in panthor_mmu.c.

panthor_irq::suspended remains an atomic, as it's necessarily written to
outside the mask_lock in the suspend path.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h |  60 ++++++--
 drivers/gpu/drm/panthor/panthor_fw.c     |   3 +-
 drivers/gpu/drm/panthor/panthor_gpu.c    |   2 +-
 drivers/gpu/drm/panthor/panthor_mmu.c    | 247 ++++++++++++++++---------------
 drivers/gpu/drm/panthor/panthor_pwr.c    |   2 +-
 5 files changed, 179 insertions(+), 135 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index f35e52b9546a..cf76a8abca76 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -73,11 +73,14 @@ struct panthor_irq {
 	/** @irq: IRQ number. */
 	int irq;
 
-	/** @mask: Current mask being applied to xxx_INT_MASK. */
+	/** @mask: Values to write to xxx_INT_MASK if active. */
 	u32 mask;
 
 	/** @suspended: Set to true when the IRQ is suspended. */
 	atomic_t suspended;
+
+	/** @mask_lock: protects modifications to _INT_MASK and @mask */
+	spinlock_t mask_lock;
 };
 
 /**
@@ -410,6 +413,8 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
 	struct panthor_irq *pirq = data;							\
 	struct panthor_device *ptdev = pirq->ptdev;						\
 												\
+	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+												\
 	if (atomic_read(&pirq->suspended))							\
 		return IRQ_NONE;								\
 	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
@@ -424,9 +429,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 	struct panthor_irq *pirq = data;							\
 	struct panthor_device *ptdev = pirq->ptdev;						\
 	irqreturn_t ret = IRQ_NONE;								\
+	u32 mask;										\
+												\
+	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
+		mask = pirq->mask;								\
+	}											\
 												\
 	while (true) {										\
-		u32 status = gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & pirq->mask;	\
+		u32 status = (gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & mask);		\
 												\
 		if (!status)									\
 			break;									\
@@ -435,26 +445,34 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 		ret = IRQ_HANDLED;								\
 	}											\
 												\
-	if (!atomic_read(&pirq->suspended))							\
-		gpu_write(ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
+	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
+		if (!atomic_read(&pirq->suspended)) {						\
+			/* Only restore the bits that were used and are still enabled */	\
+			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
+				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
+				  (mask & pirq->mask));						\
+		}										\
+	}											\
 												\
 	return ret;										\
 }												\
 												\
 static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)			\
 {												\
-	pirq->mask = 0;										\
-	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);					\
+	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
+		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
+	}											\
 	synchronize_irq(pirq->irq);								\
 	atomic_set(&pirq->suspended, true);							\
 }												\
 												\
-static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq, u32 mask)	\
+static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
 {												\
+	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+												\
 	atomic_set(&pirq->suspended, false);							\
-	pirq->mask = mask;									\
-	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, mask);				\
-	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);				\
+	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
+	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
 }												\
 												\
 static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
@@ -463,13 +481,33 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 {												\
 	pirq->ptdev = ptdev;									\
 	pirq->irq = irq;									\
-	panthor_ ## __name ## _irq_resume(pirq, mask);						\
+	pirq->mask = mask;									\
+	spin_lock_init(&pirq->mask_lock);							\
+	panthor_ ## __name ## _irq_resume(pirq);						\
 												\
 	return devm_request_threaded_irq(ptdev->base.dev, irq,					\
 					 panthor_ ## __name ## _irq_raw_handler,		\
 					 panthor_ ## __name ## _irq_threaded_handler,		\
 					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
 					 pirq);							\
+}												\
+												\
+static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
+{												\
+	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+												\
+	pirq->mask |= mask;									\
+	if (!atomic_read(&pirq->suspended))							\
+		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
+}												\
+												\
+static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
+{												\
+	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+												\
+	pirq->mask &= ~mask;									\
+	if (!atomic_read(&pirq->suspended))							\
+		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
 }
 
 extern struct workqueue_struct *panthor_cleanup_wq;
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index a64ec8756bed..0e46625f7621 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1080,7 +1080,8 @@ static int panthor_fw_start(struct panthor_device *ptdev)
 	bool timedout = false;
 
 	ptdev->fw->booted = false;
-	panthor_job_irq_resume(&ptdev->fw->irq, ~0);
+	panthor_job_irq_enable_events(&ptdev->fw->irq, ~0);
+	panthor_job_irq_resume(&ptdev->fw->irq);
 	gpu_write(ptdev, MCU_CONTROL, MCU_CONTROL_AUTO);
 
 	if (!wait_event_timeout(ptdev->fw->req_waitqueue,
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 057e167468d0..9304469a711a 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -395,7 +395,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
  */
 void panthor_gpu_resume(struct panthor_device *ptdev)
 {
-	panthor_gpu_irq_resume(&ptdev->gpu->irq, GPU_INTERRUPTS_MASK);
+	panthor_gpu_irq_resume(&ptdev->gpu->irq);
 	panthor_hw_l2_power_on(ptdev);
 }
 
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index b888fff05efe..64aa249c8a93 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -655,125 +655,6 @@ static void panthor_vm_release_as_locked(struct panthor_vm *vm)
 	vm->as.id = -1;
 }
 
-/**
- * panthor_vm_active() - Flag a VM as active
- * @vm: VM to flag as active.
- *
- * Assigns an address space to a VM so it can be used by the GPU/MCU.
- *
- * Return: 0 on success, a negative error code otherwise.
- */
-int panthor_vm_active(struct panthor_vm *vm)
-{
-	struct panthor_device *ptdev = vm->ptdev;
-	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
-	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
-	int ret = 0, as, cookie;
-	u64 transtab, transcfg;
-
-	if (!drm_dev_enter(&ptdev->base, &cookie))
-		return -ENODEV;
-
-	if (refcount_inc_not_zero(&vm->as.active_cnt))
-		goto out_dev_exit;
-
-	/* Make sure we don't race with lock/unlock_region() calls
-	 * happening around VM bind operations.
-	 */
-	mutex_lock(&vm->op_lock);
-	mutex_lock(&ptdev->mmu->as.slots_lock);
-
-	if (refcount_inc_not_zero(&vm->as.active_cnt))
-		goto out_unlock;
-
-	as = vm->as.id;
-	if (as >= 0) {
-		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
-		 * re-enable the MMU after clearing+unmasking the AS interrupts.
-		 */
-		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
-			goto out_enable_as;
-
-		goto out_make_active;
-	}
-
-	/* Check for a free AS */
-	if (vm->for_mcu) {
-		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
-		as = 0;
-	} else {
-		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
-	}
-
-	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
-		struct panthor_vm *lru_vm;
-
-		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
-						  struct panthor_vm,
-						  as.lru_node);
-		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
-			ret = -EBUSY;
-			goto out_unlock;
-		}
-
-		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
-		as = lru_vm->as.id;
-
-		ret = panthor_mmu_as_disable(ptdev, as, true);
-		if (ret)
-			goto out_unlock;
-
-		panthor_vm_release_as_locked(lru_vm);
-	}
-
-	/* Assign the free or reclaimed AS to the FD */
-	vm->as.id = as;
-	set_bit(as, &ptdev->mmu->as.alloc_mask);
-	ptdev->mmu->as.slots[as].vm = vm;
-
-out_enable_as:
-	transtab = cfg->arm_lpae_s1_cfg.ttbr;
-	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
-		   AS_TRANSCFG_PTW_RA |
-		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
-		   AS_TRANSCFG_INA_BITS(55 - va_bits);
-	if (ptdev->coherent)
-		transcfg |= AS_TRANSCFG_PTW_SH_OS;
-
-	/* If the VM is re-activated, we clear the fault. */
-	vm->unhandled_fault = false;
-
-	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
-	 * before enabling the AS.
-	 */
-	if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) {
-		gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as));
-		ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);
-		ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as);
-		gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask);
-	}
-
-	/* The VM update is guarded by ::op_lock, which we take at the beginning
-	 * of this function, so we don't expect any locked region here.
-	 */
-	drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0);
-	ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr);
-
-out_make_active:
-	if (!ret) {
-		refcount_set(&vm->as.active_cnt, 1);
-		list_del_init(&vm->as.lru_node);
-	}
-
-out_unlock:
-	mutex_unlock(&ptdev->mmu->as.slots_lock);
-	mutex_unlock(&vm->op_lock);
-
-out_dev_exit:
-	drm_dev_exit(cookie);
-	return ret;
-}
-
 /**
  * panthor_vm_idle() - Flag a VM idle
  * @vm: VM to flag as idle.
@@ -1772,6 +1653,128 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
 }
 PANTHOR_IRQ_HANDLER(mmu, MMU, panthor_mmu_irq_handler);
 
+/**
+ * panthor_vm_active() - Flag a VM as active
+ * @vm: VM to flag as active.
+ *
+ * Assigns an address space to a VM so it can be used by the GPU/MCU.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+int panthor_vm_active(struct panthor_vm *vm)
+{
+	struct panthor_device *ptdev = vm->ptdev;
+	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
+	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
+	int ret = 0, as, cookie;
+	u64 transtab, transcfg;
+	u32 fault_mask;
+
+	if (!drm_dev_enter(&ptdev->base, &cookie))
+		return -ENODEV;
+
+	if (refcount_inc_not_zero(&vm->as.active_cnt))
+		goto out_dev_exit;
+
+	/* Make sure we don't race with lock/unlock_region() calls
+	 * happening around VM bind operations.
+	 */
+	mutex_lock(&vm->op_lock);
+	mutex_lock(&ptdev->mmu->as.slots_lock);
+
+	if (refcount_inc_not_zero(&vm->as.active_cnt))
+		goto out_unlock;
+
+	as = vm->as.id;
+	if (as >= 0) {
+		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
+		 * re-enable the MMU after clearing+unmasking the AS interrupts.
+		 */
+		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
+			goto out_enable_as;
+
+		goto out_make_active;
+	}
+
+	/* Check for a free AS */
+	if (vm->for_mcu) {
+		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
+		as = 0;
+	} else {
+		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
+	}
+
+	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
+		struct panthor_vm *lru_vm;
+
+		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
+						  struct panthor_vm,
+						  as.lru_node);
+		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
+			ret = -EBUSY;
+			goto out_unlock;
+		}
+
+		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
+		as = lru_vm->as.id;
+
+		ret = panthor_mmu_as_disable(ptdev, as, true);
+		if (ret)
+			goto out_unlock;
+
+		panthor_vm_release_as_locked(lru_vm);
+	}
+
+	/* Assign the free or reclaimed AS to the FD */
+	vm->as.id = as;
+	set_bit(as, &ptdev->mmu->as.alloc_mask);
+	ptdev->mmu->as.slots[as].vm = vm;
+
+out_enable_as:
+	transtab = cfg->arm_lpae_s1_cfg.ttbr;
+	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
+		   AS_TRANSCFG_PTW_RA |
+		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
+		   AS_TRANSCFG_INA_BITS(55 - va_bits);
+	if (ptdev->coherent)
+		transcfg |= AS_TRANSCFG_PTW_SH_OS;
+
+	/* If the VM is re-activated, we clear the fault. */
+	vm->unhandled_fault = false;
+
+	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
+	 * before enabling the AS.
+	 */
+	fault_mask = panthor_mmu_as_fault_mask(ptdev, as);
+	if (ptdev->mmu->as.faulty_mask & fault_mask) {
+		gpu_write(ptdev, MMU_INT_CLEAR, fault_mask);
+		ptdev->mmu->as.faulty_mask &= ~fault_mask;
+		panthor_mmu_irq_enable_events(&ptdev->mmu->irq, fault_mask);
+		panthor_mmu_irq_disable_events(&ptdev->mmu->irq, ptdev->mmu->as.faulty_mask);
+	}
+
+	/* The VM update is guarded by ::op_lock, which we take at the beginning
+	 * of this function, so we don't expect any locked region here.
+	 */
+	drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0);
+	ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr);
+
+out_make_active:
+	if (!ret) {
+		refcount_set(&vm->as.active_cnt, 1);
+		list_del_init(&vm->as.lru_node);
+	}
+
+out_unlock:
+	mutex_unlock(&ptdev->mmu->as.slots_lock);
+	mutex_unlock(&vm->op_lock);
+
+out_dev_exit:
+	drm_dev_exit(cookie);
+	return ret;
+}
+
+
 /**
  * panthor_mmu_suspend() - Suspend the MMU logic
  * @ptdev: Device.
@@ -1815,7 +1818,8 @@ void panthor_mmu_resume(struct panthor_device *ptdev)
 	ptdev->mmu->as.faulty_mask = 0;
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_resume(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_mmu_irq_enable_events(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_mmu_irq_resume(&ptdev->mmu->irq);
 }
 
 /**
@@ -1869,7 +1873,8 @@ void panthor_mmu_post_reset(struct panthor_device *ptdev)
 
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_resume(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_mmu_irq_enable_events(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_mmu_irq_resume(&ptdev->mmu->irq);
 
 	/* Restart the VM_BIND queues. */
 	mutex_lock(&ptdev->mmu->vm.lock);
diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index 57cfc7ce715b..ed3b2b4479ca 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -545,5 +545,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev)
 	if (!ptdev->pwr)
 		return;
 
-	panthor_pwr_irq_resume(&ptdev->pwr->irq, PWR_INTERRUPTS_MASK);
+	panthor_pwr_irq_resume(&ptdev->pwr->irq);
 }

-- 
2.52.0



* [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state
  2026-01-08 14:19 [PATCH v7 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
@ 2026-01-08 14:19 ` Nicolas Frattaroli
  2026-01-09 16:05   ` Steven Price
  2026-01-08 14:19 ` [PATCH v7 3/4] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
  3 siblings, 1 reply; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-08 14:19 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli

To deal with the threaded interrupt handler and a suspend action
overlapping, the boolean panthor_irq::suspended is not sufficient.

Rework it to take several different values depending on the current
state, and check and set it within the IRQ helper functions.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 40 +++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index cf76a8abca76..a8c21a7eea05 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -61,6 +61,17 @@ enum panthor_device_pm_state {
 	PANTHOR_DEVICE_PM_STATE_SUSPENDING,
 };
 
+enum panthor_irq_state {
+	/** @PANTHOR_IRQ_STATE_ACTIVE: IRQ is active and ready to process events. */
+	PANTHOR_IRQ_STATE_ACTIVE = 0,
+	/** @PANTHOR_IRQ_STATE_PROCESSING: IRQ is currently processing events. */
+	PANTHOR_IRQ_STATE_PROCESSING,
+	/** @PANTHOR_IRQ_STATE_SUSPENDED: IRQ is suspended. */
+	PANTHOR_IRQ_STATE_SUSPENDED,
+	/** @PANTHOR_IRQ_STATE_SUSPENDING: IRQ is being suspended. */
+	PANTHOR_IRQ_STATE_SUSPENDING,
+};
+
 /**
  * struct panthor_irq - IRQ data
  *
@@ -76,8 +87,8 @@ struct panthor_irq {
 	/** @mask: Values to write to xxx_INT_MASK if active. */
 	u32 mask;
 
-	/** @suspended: Set to true when the IRQ is suspended. */
-	atomic_t suspended;
+	/** @state: one of &enum panthor_irq_state reflecting the current state. */
+	atomic_t state;
 
 	/** @mask_lock: protects modifications to _INT_MASK and @mask */
 	spinlock_t mask_lock;
@@ -415,7 +426,7 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
 												\
 	guard(spinlock_irqsave)(&pirq->mask_lock);						\
 												\
-	if (atomic_read(&pirq->suspended))							\
+	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_SUSPENDED)				\
 		return IRQ_NONE;								\
 	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
 		return IRQ_NONE;								\
@@ -428,11 +439,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 {												\
 	struct panthor_irq *pirq = data;							\
 	struct panthor_device *ptdev = pirq->ptdev;						\
+	enum panthor_irq_state state;								\
 	irqreturn_t ret = IRQ_NONE;								\
 	u32 mask;										\
 												\
 	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
 		mask = pirq->mask;								\
+		atomic_cmpxchg(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE,				\
+			       PANTHOR_IRQ_STATE_PROCESSING);					\
 	}											\
 												\
 	while (true) {										\
@@ -446,11 +460,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 	}											\
 												\
 	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
-		if (!atomic_read(&pirq->suspended)) {						\
+		state = atomic_read(&pirq->state);						\
+		if (state != PANTHOR_IRQ_STATE_SUSPENDED &&					\
+		    state != PANTHOR_IRQ_STATE_SUSPENDING) {					\
 			/* Only restore the bits that were used and are still enabled */	\
 			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
 				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
 				  (mask & pirq->mask));						\
+			atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);			\
 		}										\
 	}											\
 												\
@@ -461,16 +478,17 @@ static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)
 {												\
 	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
 		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
+		atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING);				\
 	}											\
 	synchronize_irq(pirq->irq);								\
-	atomic_set(&pirq->suspended, true);							\
+	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
 }												\
 												\
 static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
 {												\
 	guard(spinlock_irqsave)(&pirq->mask_lock);						\
 												\
-	atomic_set(&pirq->suspended, false);							\
+	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);					\
 	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
 	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
 }												\
@@ -494,19 +512,25 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 												\
 static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
 {												\
+	enum panthor_irq_state state;								\
+												\
 	guard(spinlock_irqsave)(&pirq->mask_lock);						\
 												\
 	pirq->mask |= mask;									\
-	if (!atomic_read(&pirq->suspended))							\
+	state = atomic_read(&pirq->state);							\
+	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
 		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
 }												\
 												\
 static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
 {												\
+	enum panthor_irq_state state;								\
+												\
 	guard(spinlock_irqsave)(&pirq->mask_lock);						\
 												\
 	pirq->mask &= ~mask;									\
-	if (!atomic_read(&pirq->suspended))							\
+	state = atomic_read(&pirq->state);							\
+	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
 		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
 }
 

-- 
2.52.0



* [PATCH v7 3/4] drm/panthor: Add tracepoint for hardware utilisation changes
  2026-01-08 14:19 [PATCH v7 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state Nicolas Frattaroli
@ 2026-01-08 14:19 ` Nicolas Frattaroli
  2026-01-08 14:19 ` [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
  3 siblings, 0 replies; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-08 14:19 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli

Mali GPUs have three registers that indicate which parts of the hardware
are powered at any moment. These take the form of bitmaps. In the case
of SHADER_READY for example, a high bit indicates that the shader core
corresponding to that bit index is powered on. These bitmaps aren't
necessarily contiguous, as it's common to have holes in the sequence of
shader core indices; the actual set of present cores is defined by the
"shader present" register.

When the GPU finishes a power state transition, it fires a
GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
received, the _READY registers will contain new interesting data. During
power transitions, the GPU_IRQ_POWER_CHANGED interrupt will fire, and
the registers will likewise contain potentially changed data.

This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
which relates to the power control logic introduced with Mali v14. The
_READY registers and corresponding interrupts are available from v9
onwards.

Expose the data as a tracepoint to userspace. This allows users to debug
various scenarios and gather interesting information, such as: knowing
how much hardware is lit up at any given time, correlating graphics
corruption with a specific powered shader core, measuring when hardware
is allowed to go to a powered off state again, and so on.

The registration/unregistration functions for the tracepoint go through
a wrapper in panthor_hw.c, so that v14+ can implement the same
tracepoint by adding its hardware-specific IRQ on/off callbacks to the
panthor_hw.ops member.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_gpu.c   | 28 +++++++++++++++
 drivers/gpu/drm/panthor/panthor_gpu.h   |  2 ++
 drivers/gpu/drm/panthor/panthor_hw.c    | 62 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_hw.h    |  8 +++++
 drivers/gpu/drm/panthor/panthor_trace.h | 58 ++++++++++++++++++++++++++++++
 5 files changed, 158 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 9304469a711a..2ab444ee8c71 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -22,6 +22,9 @@
 #include "panthor_hw.h"
 #include "panthor_regs.h"
 
+#define CREATE_TRACE_POINTS
+#include "panthor_trace.h"
+
 /**
  * struct panthor_gpu - GPU block management data.
  */
@@ -48,6 +51,9 @@ struct panthor_gpu {
 	 GPU_IRQ_RESET_COMPLETED | \
 	 GPU_IRQ_CLEAN_CACHES_COMPLETED)
 
+#define GPU_POWER_INTERRUPTS_MASK	\
+	(GPU_IRQ_POWER_CHANGED | GPU_IRQ_POWER_CHANGED_ALL)
+
 static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
 {
 	gpu_write(ptdev, GPU_COHERENCY_PROTOCOL,
@@ -80,6 +86,12 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
 {
 	gpu_write(ptdev, GPU_INT_CLEAR, status);
 
+	if (tracepoint_enabled(gpu_power_status) && (status & GPU_POWER_INTERRUPTS_MASK))
+		trace_gpu_power_status(ptdev->base.dev,
+				       gpu_read64(ptdev, SHADER_READY),
+				       gpu_read64(ptdev, TILER_READY),
+				       gpu_read64(ptdev, L2_READY));
+
 	if (status & GPU_IRQ_FAULT) {
 		u32 fault_status = gpu_read(ptdev, GPU_FAULT_STATUS);
 		u64 address = gpu_read64(ptdev, GPU_FAULT_ADDR);
@@ -157,6 +169,22 @@ int panthor_gpu_init(struct panthor_device *ptdev)
 	return 0;
 }
 
+int panthor_gpu_power_changed_on(struct panthor_device *ptdev)
+{
+	guard(pm_runtime_active)(ptdev->base.dev);
+
+	panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+
+	return 0;
+}
+
+void panthor_gpu_power_changed_off(struct panthor_device *ptdev)
+{
+	guard(pm_runtime_active)(ptdev->base.dev);
+
+	panthor_gpu_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+}
+
 /**
  * panthor_gpu_block_power_off() - Power-off a specific block of the GPU
  * @ptdev: Device.
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.h b/drivers/gpu/drm/panthor/panthor_gpu.h
index 12e66f48ced1..12c263a39928 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.h
+++ b/drivers/gpu/drm/panthor/panthor_gpu.h
@@ -51,5 +51,7 @@ int panthor_gpu_l2_power_on(struct panthor_device *ptdev);
 int panthor_gpu_flush_caches(struct panthor_device *ptdev,
 			     u32 l2, u32 lsc, u32 other);
 int panthor_gpu_soft_reset(struct panthor_device *ptdev);
+void panthor_gpu_power_changed_off(struct panthor_device *ptdev);
+int panthor_gpu_power_changed_on(struct panthor_device *ptdev);
 
 #endif
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index 87ebb7ae42c4..ae3320d0e251 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0 or MIT
 /* Copyright 2025 ARM Limited. All rights reserved. */
 
+#include <linux/platform_device.h>
+
 #include <drm/drm_print.h>
 
 #include "panthor_device.h"
@@ -29,6 +31,8 @@ static struct panthor_hw panthor_hw_arch_v10 = {
 		.soft_reset = panthor_gpu_soft_reset,
 		.l2_power_off = panthor_gpu_l2_power_off,
 		.l2_power_on = panthor_gpu_l2_power_on,
+		.power_changed_off = panthor_gpu_power_changed_off,
+		.power_changed_on = panthor_gpu_power_changed_on,
 	},
 };
 
@@ -53,6 +57,64 @@ static struct panthor_hw_entry panthor_hw_match[] = {
 	},
 };
 
+static int panthor_hw_set_power_tracing(struct device *dev, void *data)
+{
+	struct panthor_device *ptdev = dev_get_drvdata(dev);
+
+	if (!ptdev)
+		return -ENODEV;
+
+	if (!ptdev->hw)
+		return 0;
+
+	if (data) {
+		if (ptdev->hw->ops.power_changed_on)
+			return ptdev->hw->ops.power_changed_on(ptdev);
+	} else {
+		if (ptdev->hw->ops.power_changed_off)
+			ptdev->hw->ops.power_changed_off(ptdev);
+	}
+
+	return 0;
+}
+
+int panthor_hw_power_status_register(void)
+{
+	struct device_driver *drv;
+	int ret;
+
+	drv = driver_find("panthor", &platform_bus_type);
+	if (!drv)
+		return -ENODEV;
+
+	ret = driver_for_each_device(drv, NULL, (void *)true,
+				     panthor_hw_set_power_tracing);
+
+	return ret;
+}
+
+void panthor_hw_power_status_unregister(void)
+{
+	struct device_driver *drv;
+	int ret;
+
+	drv = driver_find("panthor", &platform_bus_type);
+	if (!drv)
+		return;
+
+	ret = driver_for_each_device(drv, NULL, NULL, panthor_hw_set_power_tracing);
+
+	/*
+	 * Ideally, it'd be possible to ask driver_for_each_device to hand us
+	 * another "start" to keep going after the failing device, but it
+	 * doesn't do that. Minor inconvenience in what is probably a bad day
+	 * on the computer already though.
+	 */
+	if (ret)
+		pr_warn("Couldn't mask power IRQ for at least one device: %pe\n",
+			ERR_PTR(ret));
+}
+
 static char *get_gpu_model_name(struct panthor_device *ptdev)
 {
 	const u32 gpu_id = ptdev->gpu_info.gpu_id;
diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
index 56c68c1e9c26..2c28aea82841 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.h
+++ b/drivers/gpu/drm/panthor/panthor_hw.h
@@ -19,6 +19,12 @@ struct panthor_hw_ops {
 
 	/** @l2_power_on: L2 power on function pointer */
 	int (*l2_power_on)(struct panthor_device *ptdev);
+
+	/** @power_changed_on: Start listening to power change IRQs */
+	int (*power_changed_on)(struct panthor_device *ptdev);
+
+	/** @power_changed_off: Stop listening to power change IRQs */
+	void (*power_changed_off)(struct panthor_device *ptdev);
 };
 
 /**
@@ -32,6 +38,8 @@ struct panthor_hw {
 };
 
 int panthor_hw_init(struct panthor_device *ptdev);
+int panthor_hw_power_status_register(void);
+void panthor_hw_power_status_unregister(void);
 
 static inline int panthor_hw_soft_reset(struct panthor_device *ptdev)
 {
diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
new file mode 100644
index 000000000000..5bd420894745
--- /dev/null
+++ b/drivers/gpu/drm/panthor/panthor_trace.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0 or MIT */
+/* Copyright 2025 Collabora ltd. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM panthor
+
+#if !defined(__PANTHOR_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __PANTHOR_TRACE_H__
+
+#include <linux/tracepoint.h>
+#include <linux/types.h>
+
+#include "panthor_hw.h"
+
+/**
+ * gpu_power_status - called whenever parts of GPU hardware are turned on or off
+ * @dev: pointer to the &struct device, for printing the device name
+ * @shader_bitmap: bitmap where a high bit indicates the shader core at a given
+ *                 bit index is on, and a low bit indicates a shader core is
+ *                 either powered off or absent
+ * @tiler_bitmap: bitmap where a high bit indicates the tiler unit at a given
+ *                bit index is on, and a low bit indicates a tiler unit is
+ *                either powered off or absent
+ * @l2_bitmap: bitmap where a high bit indicates the L2 cache at a given bit
+ *             index is on, and a low bit indicates the L2 cache is either
+ *             powered off or absent
+ */
+TRACE_EVENT_FN(gpu_power_status,
+	TP_PROTO(const struct device *dev, u64 shader_bitmap, u64 tiler_bitmap,
+		 u64 l2_bitmap),
+	TP_ARGS(dev, shader_bitmap, tiler_bitmap, l2_bitmap),
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(u64, shader_bitmap)
+		__field(u64, tiler_bitmap)
+		__field(u64, l2_bitmap)
+	),
+	TP_fast_assign(
+		__assign_str(dev_name);
+		__entry->shader_bitmap	= shader_bitmap;
+		__entry->tiler_bitmap	= tiler_bitmap;
+		__entry->l2_bitmap	= l2_bitmap;
+	),
+	TP_printk("%s: shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
+		  __get_str(dev_name), __entry->shader_bitmap, __entry->tiler_bitmap,
+		  __entry->l2_bitmap
+	),
+	panthor_hw_power_status_register, panthor_hw_power_status_unregister
+);
+
+#endif /* __PANTHOR_TRACE_H__ */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE panthor_trace
+
+#include <trace/define_trace.h>

-- 
2.52.0



* [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint
  2026-01-08 14:19 [PATCH v7 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
                   ` (2 preceding siblings ...)
  2026-01-08 14:19 ` [PATCH v7 3/4] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
@ 2026-01-08 14:19 ` Nicolas Frattaroli
  2026-01-09 16:23   ` Steven Price
  3 siblings, 1 reply; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-08 14:19 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel, Nicolas Frattaroli

Mali's CSF firmware triggers the job IRQ whenever there are new firmware
events to process. While this can be a global event (BIT(31) of the
status register), it's usually an event relating to a command stream
group (the other bit indices).

Panthor throws these events onto a workqueue for processing outside the
IRQ handler. It's therefore useful to have a dedicated tracepoint that
goes beyond the generic IRQ tracepoint for this specific case, as it can
be augmented with additional data, namely the events bit mask.

This can then be used to debug problems relating to GPU job events not
being processed quickly enough. The duration_ns field can be used to
work backwards from when the tracepoint fires (at the end of the IRQ
handler) to figure out when the interrupt itself arrived, providing
information both on how long the work queueing took and on when the
interrupt landed.

With this information in hand, slowness in the IRQ handler itself can
be ruled out as a source of problems, and attention can be directed to
the workqueue processing instead.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c    | 13 +++++++++++++
 drivers/gpu/drm/panthor/panthor_trace.h | 28 ++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 0e46625f7621..b3b48c1b049c 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -26,6 +26,7 @@
 #include "panthor_mmu.h"
 #include "panthor_regs.h"
 #include "panthor_sched.h"
+#include "panthor_trace.h"
 
 #define CSF_FW_NAME "mali_csffw.bin"
 
@@ -1060,6 +1061,12 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
 
 static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
 {
+	u32 duration;
+	u64 start;
+
+	if (tracepoint_enabled(gpu_job_irq))
+		start = ktime_get_ns();
+
 	gpu_write(ptdev, JOB_INT_CLEAR, status);
 
 	if (!ptdev->fw->booted && (status & JOB_INT_GLOBAL_IF))
@@ -1072,6 +1079,12 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
 		return;
 
 	panthor_sched_report_fw_events(ptdev, status);
+
+	if (tracepoint_enabled(gpu_job_irq)) {
+		if (check_sub_overflow(ktime_get_ns(), start, &duration))
+			duration = U32_MAX;
+		trace_gpu_job_irq(ptdev->base.dev, status, duration);
+	}
 }
 PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
 
diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
index 5bd420894745..6ffeb4fe6599 100644
--- a/drivers/gpu/drm/panthor/panthor_trace.h
+++ b/drivers/gpu/drm/panthor/panthor_trace.h
@@ -48,6 +48,34 @@ TRACE_EVENT_FN(gpu_power_status,
 	panthor_hw_power_status_register, panthor_hw_power_status_unregister
 );
 
+/**
+ * gpu_job_irq - called after a job interrupt from firmware completes
+ * @dev: pointer to the &struct device, for printing the device name
+ * @events: bitmask of BIT(CSG id) | BIT(31) for a global event
+ * @duration_ns: Nanoseconds between job IRQ handler entry and exit
+ *
+ * The panthor_job_irq_handler() function instrumented by this tracepoint exits
+ * once it has queued the firmware interrupts for processing, not when the
+ * firmware interrupts are fully processed. This tracepoint allows for debugging
+ * issues with delays in the workqueue's processing of events.
+ */
+TRACE_EVENT(gpu_job_irq,
+	TP_PROTO(const struct device *dev, u32 events, u32 duration_ns),
+	TP_ARGS(dev, events, duration_ns),
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(u32, events)
+		__field(u32, duration_ns)
+	),
+	TP_fast_assign(
+		__assign_str(dev_name);
+		__entry->events		= events;
+		__entry->duration_ns	= duration_ns;
+	),
+	TP_printk("%s: events=0x%x duration_ns=%u", __get_str(dev_name),
+		  __entry->events, __entry->duration_ns)
+);
+
 #endif /* __PANTHOR_TRACE_H__ */
 
 #undef TRACE_INCLUDE_PATH

-- 
2.52.0



* Re: [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration
  2026-01-08 14:19 ` [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
@ 2026-01-09 15:59   ` Steven Price
  2026-01-11 11:39     ` Nicolas Frattaroli
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Price @ 2026-01-09 15:59 UTC (permalink / raw)
  To: Nicolas Frattaroli, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel

On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> The current IRQ helpers do not guarantee mutual exclusion that covers
> the entire transaction of accessing the mask member and modifying the
> mask register.
> 
> This makes it hard, if not impossible, to implement mask modification
> helpers that may change one of these outside the normal
> suspend/resume/isr code paths.
> 
> Add a spinlock to struct panthor_irq that protects both the mask member
> and register. Acquire it in all code paths that access these, but drop
> it before processing the threaded handler function. Then, add the
> aforementioned new helpers: enable_events and disable_events. They work
> by ORing in and clearing the given mask bits, respectively.
> 
> resume is changed to no longer have a mask passed, as pirq->mask is
> supposed to be the user-requested mask now, rather than a mirror of the
> INT_MASK register contents. Users of the resume helper are adjusted
> accordingly, including a rather painful refactor in panthor_mmu.c.
> 
> panthor_irq::suspended remains an atomic, as it's necessarily written to
> outside the mask_lock in the suspend path.
> 
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_device.h |  60 ++++++--
>  drivers/gpu/drm/panthor/panthor_fw.c     |   3 +-
>  drivers/gpu/drm/panthor/panthor_gpu.c    |   2 +-
>  drivers/gpu/drm/panthor/panthor_mmu.c    | 247 ++++++++++++++++---------------
>  drivers/gpu/drm/panthor/panthor_pwr.c    |   2 +-
>  5 files changed, 179 insertions(+), 135 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index f35e52b9546a..cf76a8abca76 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -73,11 +73,14 @@ struct panthor_irq {
>  	/** @irq: IRQ number. */
>  	int irq;
>  
> -	/** @mask: Current mask being applied to xxx_INT_MASK. */
> +	/** @mask: Values to write to xxx_INT_MASK if active. */
>  	u32 mask;
>  
>  	/** @suspended: Set to true when the IRQ is suspended. */
>  	atomic_t suspended;
> +
> +	/** @mask_lock: protects modifications to _INT_MASK and @mask */
> +	spinlock_t mask_lock;
>  };
>  
>  /**
> @@ -410,6 +413,8 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
>  	struct panthor_irq *pirq = data;							\
>  	struct panthor_device *ptdev = pirq->ptdev;						\
>  												\
> +	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> +												\
>  	if (atomic_read(&pirq->suspended))							\
>  		return IRQ_NONE;								\
>  	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
> @@ -424,9 +429,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>  	struct panthor_irq *pirq = data;							\
>  	struct panthor_device *ptdev = pirq->ptdev;						\
>  	irqreturn_t ret = IRQ_NONE;								\
> +	u32 mask;										\
> +												\
> +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> +		mask = pirq->mask;								\
> +	}											\
>  												\
>  	while (true) {										\
> -		u32 status = gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & pirq->mask;	\
> +		u32 status = (gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & mask);		\
>  												\
>  		if (!status)									\
>  			break;									\
> @@ -435,26 +445,34 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>  		ret = IRQ_HANDLED;								\
>  	}											\
>  												\
> -	if (!atomic_read(&pirq->suspended))							\
> -		gpu_write(ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
> +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> +		if (!atomic_read(&pirq->suspended)) {						\
> +			/* Only restore the bits that were used and are still enabled */	\
> +			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
> +				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
> +				  (mask & pirq->mask));						\
> +		}										\
> +	}											\
>  												\
>  	return ret;										\
>  }												\
>  												\
>  static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)			\
>  {												\
> -	pirq->mask = 0;										\
> -	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);					\
> +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
> +	}											\
>  	synchronize_irq(pirq->irq);								\

This isn't quite safe with the threaded handler. The following can occur:

CPU 0				| CPU 1
--------------------------------+----------------------------
Running _irq_threaded_handler() |
Enters __handler() callback     |
				| Enters _irq_suspend
				| Writes 0 to _INT_MASK
				| Drops scoped_guard()
				| Waits for the threaded handler
Enters the final scoped_guard   |
pirq->suspended is non-zero	|
Reads pirq->mask/mask		|
Writes non-zero to _INT_MASK	|
				| Sets suspended, but it's too late

Leading to the suspend occurring with interrupts not masked.

In the next patches you introduce the SUSPENDING flag which I think
might fix this, but with just this patch it's broken so we could have
bisection issues.

Admittedly the old code was a little dodgy with the usage of irq->mask
(I think really we should have atomic accesses to ensure that the write
of pirq->mask is observed before the gpu_write).

Can you reorder the patches - introduce the panthor_irq_state enum first
(with just SUSPENDED and ACTIVE states) and then do the conversion in
one step?

Thanks,
Steve

>  	atomic_set(&pirq->suspended, true);							\
>  }												\
>  												\
> -static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq, u32 mask)	\
> +static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
>  {												\
> +	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> +												\
>  	atomic_set(&pirq->suspended, false);							\
> -	pirq->mask = mask;									\
> -	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, mask);				\
> -	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);				\
> +	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
> +	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
>  }												\
>  												\
>  static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
> @@ -463,13 +481,33 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
>  {												\
>  	pirq->ptdev = ptdev;									\
>  	pirq->irq = irq;									\
> -	panthor_ ## __name ## _irq_resume(pirq, mask);						\
> +	pirq->mask = mask;									\
> +	spin_lock_init(&pirq->mask_lock);							\
> +	panthor_ ## __name ## _irq_resume(pirq);						\
>  												\
>  	return devm_request_threaded_irq(ptdev->base.dev, irq,					\
>  					 panthor_ ## __name ## _irq_raw_handler,		\
>  					 panthor_ ## __name ## _irq_threaded_handler,		\
>  					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
>  					 pirq);							\
> +}												\
> +												\
> +static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
> +{												\
> +	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> +												\
> +	pirq->mask |= mask;									\
> +	if (!atomic_read(&pirq->suspended))							\
> +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
> +}												\
> +												\
> +static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
> +{												\
> +	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> +												\
> +	pirq->mask &= ~mask;									\
> +	if (!atomic_read(&pirq->suspended))							\
> +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
>  }
>  
>  extern struct workqueue_struct *panthor_cleanup_wq;
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index a64ec8756bed..0e46625f7621 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1080,7 +1080,8 @@ static int panthor_fw_start(struct panthor_device *ptdev)
>  	bool timedout = false;
>  
>  	ptdev->fw->booted = false;
> -	panthor_job_irq_resume(&ptdev->fw->irq, ~0);
> +	panthor_job_irq_enable_events(&ptdev->fw->irq, ~0);
> +	panthor_job_irq_resume(&ptdev->fw->irq);
>  	gpu_write(ptdev, MCU_CONTROL, MCU_CONTROL_AUTO);
>  
>  	if (!wait_event_timeout(ptdev->fw->req_waitqueue,
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 057e167468d0..9304469a711a 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -395,7 +395,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
>   */
>  void panthor_gpu_resume(struct panthor_device *ptdev)
>  {
> -	panthor_gpu_irq_resume(&ptdev->gpu->irq, GPU_INTERRUPTS_MASK);
> +	panthor_gpu_irq_resume(&ptdev->gpu->irq);
>  	panthor_hw_l2_power_on(ptdev);
>  }
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index b888fff05efe..64aa249c8a93 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -655,125 +655,6 @@ static void panthor_vm_release_as_locked(struct panthor_vm *vm)
>  	vm->as.id = -1;
>  }
>  
> -/**
> - * panthor_vm_active() - Flag a VM as active
> - * @vm: VM to flag as active.
> - *
> - * Assigns an address space to a VM so it can be used by the GPU/MCU.
> - *
> - * Return: 0 on success, a negative error code otherwise.
> - */
> -int panthor_vm_active(struct panthor_vm *vm)
> -{
> -	struct panthor_device *ptdev = vm->ptdev;
> -	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
> -	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
> -	int ret = 0, as, cookie;
> -	u64 transtab, transcfg;
> -
> -	if (!drm_dev_enter(&ptdev->base, &cookie))
> -		return -ENODEV;
> -
> -	if (refcount_inc_not_zero(&vm->as.active_cnt))
> -		goto out_dev_exit;
> -
> -	/* Make sure we don't race with lock/unlock_region() calls
> -	 * happening around VM bind operations.
> -	 */
> -	mutex_lock(&vm->op_lock);
> -	mutex_lock(&ptdev->mmu->as.slots_lock);
> -
> -	if (refcount_inc_not_zero(&vm->as.active_cnt))
> -		goto out_unlock;
> -
> -	as = vm->as.id;
> -	if (as >= 0) {
> -		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
> -		 * re-enable the MMU after clearing+unmasking the AS interrupts.
> -		 */
> -		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
> -			goto out_enable_as;
> -
> -		goto out_make_active;
> -	}
> -
> -	/* Check for a free AS */
> -	if (vm->for_mcu) {
> -		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
> -		as = 0;
> -	} else {
> -		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
> -	}
> -
> -	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
> -		struct panthor_vm *lru_vm;
> -
> -		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
> -						  struct panthor_vm,
> -						  as.lru_node);
> -		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
> -			ret = -EBUSY;
> -			goto out_unlock;
> -		}
> -
> -		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
> -		as = lru_vm->as.id;
> -
> -		ret = panthor_mmu_as_disable(ptdev, as, true);
> -		if (ret)
> -			goto out_unlock;
> -
> -		panthor_vm_release_as_locked(lru_vm);
> -	}
> -
> -	/* Assign the free or reclaimed AS to the FD */
> -	vm->as.id = as;
> -	set_bit(as, &ptdev->mmu->as.alloc_mask);
> -	ptdev->mmu->as.slots[as].vm = vm;
> -
> -out_enable_as:
> -	transtab = cfg->arm_lpae_s1_cfg.ttbr;
> -	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> -		   AS_TRANSCFG_PTW_RA |
> -		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
> -		   AS_TRANSCFG_INA_BITS(55 - va_bits);
> -	if (ptdev->coherent)
> -		transcfg |= AS_TRANSCFG_PTW_SH_OS;
> -
> -	/* If the VM is re-activated, we clear the fault. */
> -	vm->unhandled_fault = false;
> -
> -	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
> -	 * before enabling the AS.
> -	 */
> -	if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) {
> -		gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as));
> -		ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);
> -		ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as);
> -		gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask);
> -	}
> -
> -	/* The VM update is guarded by ::op_lock, which we take at the beginning
> -	 * of this function, so we don't expect any locked region here.
> -	 */
> -	drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0);
> -	ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr);
> -
> -out_make_active:
> -	if (!ret) {
> -		refcount_set(&vm->as.active_cnt, 1);
> -		list_del_init(&vm->as.lru_node);
> -	}
> -
> -out_unlock:
> -	mutex_unlock(&ptdev->mmu->as.slots_lock);
> -	mutex_unlock(&vm->op_lock);
> -
> -out_dev_exit:
> -	drm_dev_exit(cookie);
> -	return ret;
> -}
> -
>  /**
>   * panthor_vm_idle() - Flag a VM idle
>   * @vm: VM to flag as idle.
> @@ -1772,6 +1653,128 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
>  }
>  PANTHOR_IRQ_HANDLER(mmu, MMU, panthor_mmu_irq_handler);
>  
> +/**
> + * panthor_vm_active() - Flag a VM as active
> + * @vm: VM to flag as active.
> + *
> + * Assigns an address space to a VM so it can be used by the GPU/MCU.
> + *
> + * Return: 0 on success, a negative error code otherwise.
> + */
> +int panthor_vm_active(struct panthor_vm *vm)
> +{
> +	struct panthor_device *ptdev = vm->ptdev;
> +	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
> +	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
> +	int ret = 0, as, cookie;
> +	u64 transtab, transcfg;
> +	u32 fault_mask;
> +
> +	if (!drm_dev_enter(&ptdev->base, &cookie))
> +		return -ENODEV;
> +
> +	if (refcount_inc_not_zero(&vm->as.active_cnt))
> +		goto out_dev_exit;
> +
> +	/* Make sure we don't race with lock/unlock_region() calls
> +	 * happening around VM bind operations.
> +	 */
> +	mutex_lock(&vm->op_lock);
> +	mutex_lock(&ptdev->mmu->as.slots_lock);
> +
> +	if (refcount_inc_not_zero(&vm->as.active_cnt))
> +		goto out_unlock;
> +
> +	as = vm->as.id;
> +	if (as >= 0) {
> +		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
> +		 * re-enable the MMU after clearing+unmasking the AS interrupts.
> +		 */
> +		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
> +			goto out_enable_as;
> +
> +		goto out_make_active;
> +	}
> +
> +	/* Check for a free AS */
> +	if (vm->for_mcu) {
> +		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
> +		as = 0;
> +	} else {
> +		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
> +	}
> +
> +	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
> +		struct panthor_vm *lru_vm;
> +
> +		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
> +						  struct panthor_vm,
> +						  as.lru_node);
> +		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
> +			ret = -EBUSY;
> +			goto out_unlock;
> +		}
> +
> +		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
> +		as = lru_vm->as.id;
> +
> +		ret = panthor_mmu_as_disable(ptdev, as, true);
> +		if (ret)
> +			goto out_unlock;
> +
> +		panthor_vm_release_as_locked(lru_vm);
> +	}
> +
> +	/* Assign the free or reclaimed AS to the FD */
> +	vm->as.id = as;
> +	set_bit(as, &ptdev->mmu->as.alloc_mask);
> +	ptdev->mmu->as.slots[as].vm = vm;
> +
> +out_enable_as:
> +	transtab = cfg->arm_lpae_s1_cfg.ttbr;
> +	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> +		   AS_TRANSCFG_PTW_RA |
> +		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
> +		   AS_TRANSCFG_INA_BITS(55 - va_bits);
> +	if (ptdev->coherent)
> +		transcfg |= AS_TRANSCFG_PTW_SH_OS;
> +
> +	/* If the VM is re-activated, we clear the fault. */
> +	vm->unhandled_fault = false;
> +
> +	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
> +	 * before enabling the AS.
> +	 */
> +	fault_mask = panthor_mmu_as_fault_mask(ptdev, as);
> +	if (ptdev->mmu->as.faulty_mask & fault_mask) {
> +		gpu_write(ptdev, MMU_INT_CLEAR, fault_mask);
> +		ptdev->mmu->as.faulty_mask &= ~fault_mask;
> +		panthor_mmu_irq_enable_events(&ptdev->mmu->irq, fault_mask);
> +		panthor_mmu_irq_disable_events(&ptdev->mmu->irq, ptdev->mmu->as.faulty_mask);
> +	}
> +
> +	/* The VM update is guarded by ::op_lock, which we take at the beginning
> +	 * of this function, so we don't expect any locked region here.
> +	 */
> +	drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0);
> +	ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr);
> +
> +out_make_active:
> +	if (!ret) {
> +		refcount_set(&vm->as.active_cnt, 1);
> +		list_del_init(&vm->as.lru_node);
> +	}
> +
> +out_unlock:
> +	mutex_unlock(&ptdev->mmu->as.slots_lock);
> +	mutex_unlock(&vm->op_lock);
> +
> +out_dev_exit:
> +	drm_dev_exit(cookie);
> +	return ret;
> +}
> +
> +
>  /**
>   * panthor_mmu_suspend() - Suspend the MMU logic
>   * @ptdev: Device.
> @@ -1815,7 +1818,8 @@ void panthor_mmu_resume(struct panthor_device *ptdev)
>  	ptdev->mmu->as.faulty_mask = 0;
>  	mutex_unlock(&ptdev->mmu->as.slots_lock);
>  
> -	panthor_mmu_irq_resume(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
> +	panthor_mmu_irq_enable_events(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
> +	panthor_mmu_irq_resume(&ptdev->mmu->irq);
>  }
>  
>  /**
> @@ -1869,7 +1873,8 @@ void panthor_mmu_post_reset(struct panthor_device *ptdev)
>  
>  	mutex_unlock(&ptdev->mmu->as.slots_lock);
>  
> -	panthor_mmu_irq_resume(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
> +	panthor_mmu_irq_enable_events(&ptdev->mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
> +	panthor_mmu_irq_resume(&ptdev->mmu->irq);
>  
>  	/* Restart the VM_BIND queues. */
>  	mutex_lock(&ptdev->mmu->vm.lock);
> diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
> index 57cfc7ce715b..ed3b2b4479ca 100644
> --- a/drivers/gpu/drm/panthor/panthor_pwr.c
> +++ b/drivers/gpu/drm/panthor/panthor_pwr.c
> @@ -545,5 +545,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev)
>  	if (!ptdev->pwr)
>  		return;
>  
> -	panthor_pwr_irq_resume(&ptdev->pwr->irq, PWR_INTERRUPTS_MASK);
> +	panthor_pwr_irq_resume(&ptdev->pwr->irq);
>  }
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state
  2026-01-08 14:19 ` [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state Nicolas Frattaroli
@ 2026-01-09 16:05   ` Steven Price
  2026-01-12 13:39     ` Nicolas Frattaroli
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Price @ 2026-01-09 16:05 UTC (permalink / raw)
  To: Nicolas Frattaroli, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel

On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> To deal with the threaded interrupt handler and a suspend action
> overlapping, the boolean panthor_irq::suspended is not sufficient.
> 
> Rework it into taking several different values depending on the current
> state, and check it and set it within the IRQ helper functions.
> 
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_device.h | 40 +++++++++++++++++++++++++-------
>  1 file changed, 32 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index cf76a8abca76..a8c21a7eea05 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -61,6 +61,17 @@ enum panthor_device_pm_state {
>  	PANTHOR_DEVICE_PM_STATE_SUSPENDING,
>  };
>  
> +enum panthor_irq_state {
> +	/** @PANTHOR_IRQ_STATE_ACTIVE: IRQ is active and ready to process events. */
> +	PANTHOR_IRQ_STATE_ACTIVE = 0,
> +	/** @PANTHOR_IRQ_STATE_PROCESSING: IRQ is currently processing events. */
> +	PANTHOR_IRQ_STATE_PROCESSING,
> +	/** @PANTHOR_IRQ_STATE_SUSPENDED: IRQ is suspended. */
> +	PANTHOR_IRQ_STATE_SUSPENDED,
> +	/** @PANTHOR_IRQ_STATE_SUSPENDING: IRQ is being suspended. */
> +	PANTHOR_IRQ_STATE_SUSPENDING,
> +};
> +
>  /**
>   * struct panthor_irq - IRQ data
>   *
> @@ -76,8 +87,8 @@ struct panthor_irq {
>  	/** @mask: Values to write to xxx_INT_MASK if active. */
>  	u32 mask;
>  
> -	/** @suspended: Set to true when the IRQ is suspended. */
> -	atomic_t suspended;
> +	/** @state: one of &enum panthor_irq_state reflecting the current state. */
> +	atomic_t state;
>  
>  	/** @mask_lock: protects modifications to _INT_MASK and @mask */
>  	spinlock_t mask_lock;
> @@ -415,7 +426,7 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
>  												\
>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>  												\
> -	if (atomic_read(&pirq->suspended))							\
> +	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_SUSPENDED)				\

Do we want to also catch the STATE_SUSPENDING case here? AFAICT in
SUSPENDING we should be confident that _INT_MASK==0, so this shouldn't be a
problem (_INT_STAT should read as 0 below). But we don't want interrupts
during STATE_SUSPENDING so we might as well handle it.

Thanks,
Steve

>  		return IRQ_NONE;								\
>  	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
>  		return IRQ_NONE;								\
> @@ -428,11 +439,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>  {												\
>  	struct panthor_irq *pirq = data;							\
>  	struct panthor_device *ptdev = pirq->ptdev;						\
> +	enum panthor_irq_state state;								\
>  	irqreturn_t ret = IRQ_NONE;								\
>  	u32 mask;										\
>  												\
>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
>  		mask = pirq->mask;								\
> +		atomic_cmpxchg(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE,				\
> +			       PANTHOR_IRQ_STATE_PROCESSING);					\
>  	}											\
>  												\
>  	while (true) {										\
> @@ -446,11 +460,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>  	}											\
>  												\
>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> -		if (!atomic_read(&pirq->suspended)) {						\
> +		state = atomic_read(&pirq->state);						\
> +		if (state != PANTHOR_IRQ_STATE_SUSPENDED &&					\
> +		    state != PANTHOR_IRQ_STATE_SUSPENDING) {					\
>  			/* Only restore the bits that were used and are still enabled */	\
>  			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
>  				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
>  				  (mask & pirq->mask));						\
> +			atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);			\
>  		}										\
>  	}											\
>  												\
> @@ -461,16 +478,17 @@ static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)
>  {												\
>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
> +		atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING);				\
>  	}											\
>  	synchronize_irq(pirq->irq);								\
> -	atomic_set(&pirq->suspended, true);							\
> +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
>  }												\
>  												\
>  static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
>  {												\
>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>  												\
> -	atomic_set(&pirq->suspended, false);							\
> +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);					\
>  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
>  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
>  }												\
> @@ -494,19 +512,25 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
>  												\
>  static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
>  {												\
> +	enum panthor_irq_state state;								\
> +												\
>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>  												\
>  	pirq->mask |= mask;									\
> -	if (!atomic_read(&pirq->suspended))							\
> +	state = atomic_read(&pirq->state);							\
> +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
>  }												\
>  												\
>  static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
>  {												\
> +	enum panthor_irq_state state;								\
> +												\
>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>  												\
>  	pirq->mask &= ~mask;									\
> -	if (!atomic_read(&pirq->suspended))							\
> +	state = atomic_read(&pirq->state);							\
> +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
>  }
>  
> 
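
For orientation, the four-state machine the patch introduces can be modelled compactly. This is an illustrative Python sketch, not the driver code: the suspend path is compressed into one step (the real sequence clears the mask register, marks SUSPENDING, and only settles into SUSPENDED after synchronize_irq() has drained the handler).

```python
from enum import Enum, auto

class IrqState(Enum):
    ACTIVE = auto()
    PROCESSING = auto()
    SUSPENDING = auto()
    SUSPENDED = auto()

class ModelIrq:
    def __init__(self):
        self.state = IrqState.ACTIVE

    def threaded_entry(self):
        # atomic_cmpxchg(ACTIVE -> PROCESSING): flips only if still ACTIVE
        if self.state is IrqState.ACTIVE:
            self.state = IrqState.PROCESSING

    def threaded_exit(self):
        # restore the mask / return to ACTIVE only if no suspend raced in
        if self.state not in (IrqState.SUSPENDED, IrqState.SUSPENDING):
            self.state = IrqState.ACTIVE

    def suspend(self):
        # mask register cleared, then SUSPENDING; after the handler has
        # drained (synchronize_irq() in the real code), SUSPENDED
        self.state = IrqState.SUSPENDING
        self.state = IrqState.SUSPENDED

    def resume(self):
        self.state = IrqState.ACTIVE

# A suspend racing with the threaded handler no longer re-arms the IRQ:
irq = ModelIrq()
irq.threaded_entry()      # ACTIVE -> PROCESSING
irq.suspend()             # PROCESSING -> SUSPENDING -> SUSPENDED
irq.threaded_exit()       # must NOT flip back to ACTIVE
assert irq.state is IrqState.SUSPENDED

irq.resume()
assert irq.state is IrqState.ACTIVE
```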



* Re: [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint
  2026-01-08 14:19 ` [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
@ 2026-01-09 16:23   ` Steven Price
  2026-01-11 11:49     ` Nicolas Frattaroli
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Price @ 2026-01-09 16:23 UTC (permalink / raw)
  To: Nicolas Frattaroli, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel

On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> Mali's CSF firmware triggers the job IRQ whenever there are new firmware
> events for processing. While this can be a global event (BIT(31) of the
> status register), it's usually an event relating to a command stream
> group (the other bit indices).
> 
> Panthor throws these events onto a workqueue for processing outside the
> IRQ handler. It's therefore useful to have an instrumented tracepoint
> that goes beyond the generic IRQ tracepoint for this specific case, as
> it can be augmented with additional data, namely the events bit mask.
> 
> This can then be used to debug problems relating to GPU jobs events not
> being processed quickly enough. The duration_ns field can be used to
> work backwards from when the tracepoint fires (at the end of the IRQ
> handler) to figure out when the interrupt itself landed, providing not
> just information on how long the work queueing took, but also when the
> actual interrupt itself arrived.
> 
> With this information in hand, the IRQ handler itself being slow can be
> excluded as a possible source of problems, and attention can be directed
> to the workqueue processing instead.
> 
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_fw.c    | 13 +++++++++++++
>  drivers/gpu/drm/panthor/panthor_trace.h | 28 ++++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 0e46625f7621..b3b48c1b049c 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -26,6 +26,7 @@
>  #include "panthor_mmu.h"
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
> +#include "panthor_trace.h"
>  
>  #define CSF_FW_NAME "mali_csffw.bin"
>  
> @@ -1060,6 +1061,12 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
>  
>  static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
>  {
> +	u32 duration;
> +	u64 start;
> +
> +	if (tracepoint_enabled(gpu_job_irq))
> +		start = ktime_get_ns();
> +
>  	gpu_write(ptdev, JOB_INT_CLEAR, status);
>  
>  	if (!ptdev->fw->booted && (status & JOB_INT_GLOBAL_IF))
> @@ -1072,6 +1079,12 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
>  		return;
>  
>  	panthor_sched_report_fw_events(ptdev, status);
> +
> +	if (tracepoint_enabled(gpu_job_irq)) {
> +		if (check_sub_overflow(ktime_get_ns(), start, &duration))

It's minor but if the tracepoint was enabled during the handler, the
duration will use start uninitialised. It's probably best to initialise
start just to avoid a potential stack leak.
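
For reference, the clamp being computed here can be modelled like this. This is a Python paraphrase of the intent (store a 64-bit delta into a u32, saturating at U32_MAX when it does not fit), not the kernel's check_sub_overflow() helper itself; it also shows why a garbage `start` produces a clamped value rather than a plausible-looking one:

```python
# Paraphrase of the clamped-duration computation from the patch above.
U32_MAX = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF

def duration_u32(end_ns, start_ns):
    # u64 subtraction wraps, so a garbage/later 'start' yields a huge
    # delta, which then saturates instead of silently truncating
    delta = (end_ns - start_ns) & U64_MASK
    return delta if delta <= U32_MAX else U32_MAX

assert duration_u32(1_000_500, 1_000_000) == 500
assert duration_u32(2**33, 0) == U32_MAX   # delta too big for a u32
assert duration_u32(100, 200) == U32_MAX   # start > end wraps, then clamps
```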

Thanks,
Steve

> +			duration = U32_MAX;
> +		trace_gpu_job_irq(ptdev->base.dev, status, duration);
> +	}
>  }
>  PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> index 5bd420894745..6ffeb4fe6599 100644
> --- a/drivers/gpu/drm/panthor/panthor_trace.h
> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> @@ -48,6 +48,34 @@ TRACE_EVENT_FN(gpu_power_status,
>  	panthor_hw_power_status_register, panthor_hw_power_status_unregister
>  );
>  
> +/**
> + * gpu_job_irq - called after a job interrupt from firmware completes
> + * @dev: pointer to the &struct device, for printing the device name
> + * @events: bitmask of BIT(CSG id) | BIT(31) for a global event
> + * @duration_ns: Nanoseconds between job IRQ handler entry and exit
> + *
> + * The panthor_job_irq_handler() function instrumented by this tracepoint exits
> + * once it has queued the firmware interrupts for processing, not when the
> + * firmware interrupts are fully processed. This tracepoint allows for debugging
> + * issues with delays in the workqueue's processing of events.
> + */
> +TRACE_EVENT(gpu_job_irq,
> +	TP_PROTO(const struct device *dev, u32 events, u32 duration_ns),
> +	TP_ARGS(dev, events, duration_ns),
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(u32, events)
> +		__field(u32, duration_ns)
> +	),
> +	TP_fast_assign(
> +		__assign_str(dev_name);
> +		__entry->events		= events;
> +		__entry->duration_ns	= duration_ns;
> +	),
> +	TP_printk("%s: events=0x%x duration_ns=%u", __get_str(dev_name),
> +		  __entry->events, __entry->duration_ns)
> +);
> +
>  #endif /* __PANTHOR_TRACE_H__ */
>  
>  #undef TRACE_INCLUDE_PATH
> 



* Re: [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration
  2026-01-09 15:59   ` Steven Price
@ 2026-01-11 11:39     ` Nicolas Frattaroli
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-11 11:39 UTC (permalink / raw)
  To: Boris Brezillon, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Chia-I Wu,
	Karunika Choo, Steven Price
  Cc: kernel, linux-kernel, dri-devel

On Friday, 9 January 2026 16:59:46 Central European Standard Time Steven Price wrote:
> On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> > The current IRQ helpers do not guarantee mutual exclusion that covers
> > the entire transaction from accessing the mask member and modifying the
> > mask register.
> > 
> > This makes it hard, if not impossible, to implement mask modification
> > helpers that may change one of these outside the normal
> > suspend/resume/isr code paths.
> > 
> > Add a spinlock to struct panthor_irq that protects both the mask member
> > and register. Acquire it in all code paths that access these, but drop
> > it before processing the threaded handler function. Then, add the
> > aforementioned new helpers: enable_events, and disable_events. They work
> > by ORing and NANDing the mask bits.
> > 
> > resume is changed to no longer have a mask passed, as pirq->mask is
> > supposed to be the user-requested mask now, rather than a mirror of the
> > INT_MASK register contents. Users of the resume helper are adjusted
> > accordingly, including a rather painful refactor in panthor_mmu.c.
> > 
> > panthor_irq::suspended remains an atomic, as it's necessarily written to
> > outside the mask_lock in the suspend path.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.h |  60 ++++++--
> >  drivers/gpu/drm/panthor/panthor_fw.c     |   3 +-
> >  drivers/gpu/drm/panthor/panthor_gpu.c    |   2 +-
> >  drivers/gpu/drm/panthor/panthor_mmu.c    | 247 ++++++++++++++++---------------
> >  drivers/gpu/drm/panthor/panthor_pwr.c    |   2 +-
> >  5 files changed, 179 insertions(+), 135 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > index f35e52b9546a..cf76a8abca76 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > @@ -73,11 +73,14 @@ struct panthor_irq {
> >  	/** @irq: IRQ number. */
> >  	int irq;
> >  
> > -	/** @mask: Current mask being applied to xxx_INT_MASK. */
> > +	/** @mask: Values to write to xxx_INT_MASK if active. */
> >  	u32 mask;
> >  
> >  	/** @suspended: Set to true when the IRQ is suspended. */
> >  	atomic_t suspended;
> > +
> > +	/** @mask_lock: protects modifications to _INT_MASK and @mask */
> > +	spinlock_t mask_lock;
> >  };
> >  
> >  /**
> > @@ -410,6 +413,8 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
> >  	struct panthor_irq *pirq = data;							\
> >  	struct panthor_device *ptdev = pirq->ptdev;						\
> >  												\
> > +	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> > +												\
> >  	if (atomic_read(&pirq->suspended))							\
> >  		return IRQ_NONE;								\
> >  	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
> > @@ -424,9 +429,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> >  	struct panthor_irq *pirq = data;							\
> >  	struct panthor_device *ptdev = pirq->ptdev;						\
> >  	irqreturn_t ret = IRQ_NONE;								\
> > +	u32 mask;										\
> > +												\
> > +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> > +		mask = pirq->mask;								\
> > +	}											\
> >  												\
> >  	while (true) {										\
> > -		u32 status = gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & pirq->mask;	\
> > +		u32 status = (gpu_read(ptdev, __reg_prefix ## _INT_RAWSTAT) & mask);		\
> >  												\
> >  		if (!status)									\
> >  			break;									\
> > @@ -435,26 +445,34 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> >  		ret = IRQ_HANDLED;								\
> >  	}											\
> >  												\
> > -	if (!atomic_read(&pirq->suspended))							\
> > -		gpu_write(ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
> > +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> > +		if (!atomic_read(&pirq->suspended)) {						\
> > +			/* Only restore the bits that were used and are still enabled */	\
> > +			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
> > +				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
> > +				  (mask & pirq->mask));						\
> > +		}										\
> > +	}											\
> >  												\
> >  	return ret;										\
> >  }												\
> >  												\
> >  static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)			\
> >  {												\
> > -	pirq->mask = 0;										\
> > -	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);					\
> > +	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> > +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
> > +	}											\
> >  	synchronize_irq(pirq->irq);								\
> 
> This isn't quite safe with the threaded handler. The following can occur:
> 
> CPU 0				| CPU 1
> --------------------------------+----------------------------
> Running _irq_threaded_handler() |
> Enters __handler() callback     |
> 				| Enters _irq_suspend
> 				| Writes 0 to _INT_MASK
> 				| Drops scoped_guard()
> 				| Waits for the threaded handler
> Enters the final scoped_guard   |
> pirq->suspended is non-zero	|
> Reads pirq->mask/mask		|
> Writes non-zero to _INT_MASK	|
> 				| Sets suspended, but it's too late
> 
> Leading to the suspend occurring with interrupts not masked.
> 
> In the next patches you introduce the SUSPENDING flag which I think
> might fix this, but with just this patch it's broken so we could have
> bisection issues.
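
To make the race concrete, the interleaving in the table can be replayed deterministically. This is a sketch with Python stand-ins for the hardware register and pirq state; every name is illustrative, not the driver's:

```python
# Deterministic replay of the CPU 0 / CPU 1 interleaving above.
class FakeIrq:
    def __init__(self):
        self.int_mask = 0xff   # models the _INT_MASK register
        self.mask = 0xff       # models pirq->mask
        self.suspended = False # models the old atomic_t 'suspended'

pirq = FakeIrq()

# CPU 0: threaded handler snapshots pirq->mask, enters __handler()
snapshot = pirq.mask

# CPU 1: _irq_suspend writes 0 to _INT_MASK, drops the guard,
# and starts waiting in synchronize_irq()
pirq.int_mask = 0

# CPU 0: the final scoped_guard runs before 'suspended' is set,
# so it happily restores the still-enabled bits
if not pirq.suspended:
    pirq.int_mask |= (snapshot & pirq.mask)

# CPU 1: only now sets suspended -- too late
pirq.suspended = True

# Suspend has "completed" with interrupts unmasked:
assert pirq.int_mask == 0xff
```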

Yeah, when I sent it out I was aware it could have a problem because
I did the conversion in the follow-up patch. I figured at the time
that this was worth not having a giant "do everything" patch, but now
you've pointed out to me that I could just reorder the follow-up to be
before this one and things will work out.

> 
> Admittedly the old code was a little dodgy with the usage of irq->mask
> (I think really we should have atomic accesses to ensure that the write
> of pirq->mask is observed before the gpu_write).
> 
> Can you reorder the patches - introduce the panthor_irq_state enum first
> (with just SUSPENDED and ACTIVE states) and then do the conversion in
> one step?

Will do, thanks for pointing this out as a possibility. Shouldn't be too
painful either, hopefully.

> 
> Thanks,
> Steve
> 
> [ ... snip ... ]





* Re: [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint
  2026-01-09 16:23   ` Steven Price
@ 2026-01-11 11:49     ` Nicolas Frattaroli
  2026-01-12  9:15       ` Steven Price
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-11 11:49 UTC (permalink / raw)
  To: Boris Brezillon, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Chia-I Wu,
	Karunika Choo, Steven Price
  Cc: kernel, linux-kernel, dri-devel

On Friday, 9 January 2026 17:23:32 Central European Standard Time Steven Price wrote:
> On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> > Mali's CSF firmware triggers the job IRQ whenever there are new firmware
> > events for processing. While this can be a global event (BIT(31) of the
> > status register), it's usually an event relating to a command stream
> > group (the other bit indices).
> > 
> > Panthor throws these events onto a workqueue for processing outside the
> > IRQ handler. It's therefore useful to have an instrumented tracepoint
> > that goes beyond the generic IRQ tracepoint for this specific case, as
> > it can be augmented with additional data, namely the events bit mask.
> > 
> > This can then be used to debug problems relating to GPU jobs events not
> > being processed quickly enough. The duration_ns field can be used to
> > work backwards from when the tracepoint fires (at the end of the IRQ
> > handler) to figure out when the interrupt itself landed, providing not
> > just information on how long the work queueing took, but also when the
> > actual interrupt itself arrived.
> > 
> > With this information in hand, the IRQ handler itself being slow can be
> > excluded as a possible source of problems, and attention can be directed
> > to the workqueue processing instead.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_fw.c    | 13 +++++++++++++
> >  drivers/gpu/drm/panthor/panthor_trace.h | 28 ++++++++++++++++++++++++++++
> >  2 files changed, 41 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> > index 0e46625f7621..b3b48c1b049c 100644
> > --- a/drivers/gpu/drm/panthor/panthor_fw.c
> > +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> > @@ -26,6 +26,7 @@
> >  #include "panthor_mmu.h"
> >  #include "panthor_regs.h"
> >  #include "panthor_sched.h"
> > +#include "panthor_trace.h"
> >  
> >  #define CSF_FW_NAME "mali_csffw.bin"
> >  
> > @@ -1060,6 +1061,12 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
> >  
> >  static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
> >  {
> > +	u32 duration;
> > +	u64 start;
> > +
> > +	if (tracepoint_enabled(gpu_job_irq))
> > +		start = ktime_get_ns();
> > +
> >  	gpu_write(ptdev, JOB_INT_CLEAR, status);
> >  
> >  	if (!ptdev->fw->booted && (status & JOB_INT_GLOBAL_IF))
> > @@ -1072,6 +1079,12 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
> >  		return;
> >  
> >  	panthor_sched_report_fw_events(ptdev, status);
> > +
> > +	if (tracepoint_enabled(gpu_job_irq)) {
> > +		if (check_sub_overflow(ktime_get_ns(), start, &duration))
> 
> It's minor but if the tracepoint was enabled during the handler, the
> duration will use start uninitialised. It's probably best to initialise
> start just to avoid a potential stack leak.

Good catch.

Should I unconditionally initialize it to ktime_get_ns(), or do we want
to avoid a call into that and initialize it to something that will result
in a nonsense duration? Alternatively we initialize it to 0 and skip the
tracepoint if !start.

My gut tells me reading the monotonic clock shouldn't be considered
expensive, though having the tracepoint overhead with an inactive
tracepoint be within a Planck time of "free" would be preferable,
so I'm leaning towards

    u64 start = 0;

    if (tracepoint_enabled(gpu_job_irq))
            start = ktime_get_ns();

    ...

    if (start && tracepoint_enabled(gpu_job_irq)) {
            ...

Kind regards,
Nicolas Frattaroli

> 
> Thanks,
> Steve
> 
> > +			duration = U32_MAX;
> > +		trace_gpu_job_irq(ptdev->base.dev, status, duration);
> > +	}
> >  }
> >  PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
> >  
> > diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
> > index 5bd420894745..6ffeb4fe6599 100644
> > --- a/drivers/gpu/drm/panthor/panthor_trace.h
> > +++ b/drivers/gpu/drm/panthor/panthor_trace.h
> > @@ -48,6 +48,34 @@ TRACE_EVENT_FN(gpu_power_status,
> >  	panthor_hw_power_status_register, panthor_hw_power_status_unregister
> >  );
> >  
> > +/**
> > + * gpu_job_irq - called after a job interrupt from firmware completes
> > + * @dev: pointer to the &struct device, for printing the device name
> > + * @events: bitmask of BIT(CSG id) | BIT(31) for a global event
> > + * @duration_ns: Nanoseconds between job IRQ handler entry and exit
> > + *
> > + * The panthor_job_irq_handler() function instrumented by this tracepoint exits
> > + * once it has queued the firmware interrupts for processing, not when the
> > + * firmware interrupts are fully processed. This tracepoint allows for debugging
> > + * issues with delays in the workqueue's processing of events.
> > + */
> > +TRACE_EVENT(gpu_job_irq,
> > +	TP_PROTO(const struct device *dev, u32 events, u32 duration_ns),
> > +	TP_ARGS(dev, events, duration_ns),
> > +	TP_STRUCT__entry(
> > +		__string(dev_name, dev_name(dev))
> > +		__field(u32, events)
> > +		__field(u32, duration_ns)
> > +	),
> > +	TP_fast_assign(
> > +		__assign_str(dev_name);
> > +		__entry->events		= events;
> > +		__entry->duration_ns	= duration_ns;
> > +	),
> > +	TP_printk("%s: events=0x%x duration_ns=%u", __get_str(dev_name),
> > +		  __entry->events, __entry->duration_ns)
> > +);
> > +
> >  #endif /* __PANTHOR_TRACE_H__ */
> >  
> >  #undef TRACE_INCLUDE_PATH
> > 
> 
> 






* Re: [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint
  2026-01-11 11:49     ` Nicolas Frattaroli
@ 2026-01-12  9:15       ` Steven Price
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Price @ 2026-01-12  9:15 UTC (permalink / raw)
  To: Nicolas Frattaroli, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel

On 11/01/2026 11:49, Nicolas Frattaroli wrote:
> On Friday, 9 January 2026 17:23:32 Central European Standard Time Steven Price wrote:
>> On 08/01/2026 14:19, Nicolas Frattaroli wrote:
>>> Mali's CSF firmware triggers the job IRQ whenever there are new firmware
>>> events for processing. While this can be a global event (BIT(31) of the
>>> status register), it's usually an event relating to a command stream
>>> group (the other bit indices).
>>>
>>> Panthor throws these events onto a workqueue for processing outside the
>>> IRQ handler. It's therefore useful to have an instrumented tracepoint
>>> that goes beyond the generic IRQ tracepoint for this specific case, as
>>> it can be augmented with additional data, namely the events bit mask.
>>>
>>> This can then be used to debug problems relating to GPU jobs events not
>>> being processed quickly enough. The duration_ns field can be used to
>>> work backwards from when the tracepoint fires (at the end of the IRQ
>>> handler) to figure out when the interrupt itself landed, providing not
>>> just information on how long the work queueing took, but also when the
>>> actual interrupt itself arrived.
>>>
>>> With this information in hand, the IRQ handler itself being slow can be
>>> excluded as a possible source of problems, and attention can be directed
>>> to the workqueue processing instead.
>>>
>>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>>> ---
>>>  drivers/gpu/drm/panthor/panthor_fw.c    | 13 +++++++++++++
>>>  drivers/gpu/drm/panthor/panthor_trace.h | 28 ++++++++++++++++++++++++++++
>>>  2 files changed, 41 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
>>> index 0e46625f7621..b3b48c1b049c 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_fw.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
>>> @@ -26,6 +26,7 @@
>>>  #include "panthor_mmu.h"
>>>  #include "panthor_regs.h"
>>>  #include "panthor_sched.h"
>>> +#include "panthor_trace.h"
>>>  
>>>  #define CSF_FW_NAME "mali_csffw.bin"
>>>  
>>> @@ -1060,6 +1061,12 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
>>>  
>>>  static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
>>>  {
>>> +	u32 duration;
>>> +	u64 start;
>>> +
>>> +	if (tracepoint_enabled(gpu_job_irq))
>>> +		start = ktime_get_ns();
>>> +
>>>  	gpu_write(ptdev, JOB_INT_CLEAR, status);
>>>  
>>>  	if (!ptdev->fw->booted && (status & JOB_INT_GLOBAL_IF))
>>> @@ -1072,6 +1079,12 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
>>>  		return;
>>>  
>>>  	panthor_sched_report_fw_events(ptdev, status);
>>> +
>>> +	if (tracepoint_enabled(gpu_job_irq)) {
>>> +		if (check_sub_overflow(ktime_get_ns(), start, &duration))
>>
>> It's minor but if the tracepoint was enabled during the handler, the
>> duration will use start uninitialised. It's probably best to initialise
>> start just to avoid a potential stack leak.
> 
> Good catch.
> 
> Should I unconditionally initialize it to ktime_get_ns(), or do we want
> to avoid a call into that and initialize it to something that will result
> in a nonsense duration? Alternatively we initialize it to 0 and skip the
> tracepoint if !start.
> 
> My gut tells me reading the monotonic clock shouldn't be considered
> expensive, though having the tracepoint overhead with an inactive
> tracepoint be within a Planck time of "free" would be preferable,
> so I'm leaning towards
> 
>     u64 start = 0;
> 
>     if (tracepoint_enabled(gpu_job_irq))
>             start = ktime_get_ns();
> 
>     ...
> 
>     if (start && tracepoint_enabled(gpu_job_irq)) {
>             ...

Yeah I'd go with this option. There is quite a bit of effort to keep
reading the clock cheap, but it's still going to be much more expensive
than not reading it. And we really don't care if we drop the first
tracepoint when it's enabled.

Note that reordering the condition to

     if (tracepoint_enabled(gpu_job_irq) && start) {

would be slightly preferable, so that the static key avoids the check of
'start' when the tracepoint is disabled. Although with compiler
optimisations and CPU out-of-order execution, I'm not sure whether the
difference is actually measurable ;)

Thanks,
Steve

> Kind regards,
> Nicolas Frattaroli
> 
>>
>> Thanks,
>> Steve
>>
>>> +			duration = U32_MAX;
>>> +		trace_gpu_job_irq(ptdev->base.dev, status, duration);
>>> +	}
>>>  }
>>>  PANTHOR_IRQ_HANDLER(job, JOB, panthor_job_irq_handler);
>>>  
>>> diff --git a/drivers/gpu/drm/panthor/panthor_trace.h b/drivers/gpu/drm/panthor/panthor_trace.h
>>> index 5bd420894745..6ffeb4fe6599 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_trace.h
>>> +++ b/drivers/gpu/drm/panthor/panthor_trace.h
>>> @@ -48,6 +48,34 @@ TRACE_EVENT_FN(gpu_power_status,
>>>  	panthor_hw_power_status_register, panthor_hw_power_status_unregister
>>>  );
>>>  
>>> +/**
>>> + * gpu_job_irq - called after a job interrupt from firmware completes
>>> + * @dev: pointer to the &struct device, for printing the device name
>>> + * @events: bitmask of BIT(CSG id) | BIT(31) for a global event
>>> + * @duration_ns: Nanoseconds between job IRQ handler entry and exit
>>> + *
>>> + * The panthor_job_irq_handler() function instrumented by this tracepoint exits
>>> + * once it has queued the firmware interrupts for processing, not when the
>>> + * firmware interrupts are fully processed. This tracepoint allows for debugging
>>> + * issues with delays in the workqueue's processing of events.
>>> + */
>>> +TRACE_EVENT(gpu_job_irq,
>>> +	TP_PROTO(const struct device *dev, u32 events, u32 duration_ns),
>>> +	TP_ARGS(dev, events, duration_ns),
>>> +	TP_STRUCT__entry(
>>> +		__string(dev_name, dev_name(dev))
>>> +		__field(u32, events)
>>> +		__field(u32, duration_ns)
>>> +	),
>>> +	TP_fast_assign(
>>> +		__assign_str(dev_name);
>>> +		__entry->events		= events;
>>> +		__entry->duration_ns	= duration_ns;
>>> +	),
>>> +	TP_printk("%s: events=0x%x duration_ns=%d", __get_str(dev_name),
>>> +		  __entry->events, __entry->duration_ns)
>>> +);
>>> +
>>>  #endif /* __PANTHOR_TRACE_H__ */
>>>  
>>>  #undef TRACE_INCLUDE_PATH
>>>
>>
>>
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state
  2026-01-09 16:05   ` Steven Price
@ 2026-01-12 13:39     ` Nicolas Frattaroli
  2026-01-12 14:09       ` Steven Price
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Frattaroli @ 2026-01-12 13:39 UTC (permalink / raw)
  To: Boris Brezillon, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Chia-I Wu,
	Karunika Choo, Steven Price
  Cc: kernel, linux-kernel, dri-devel

On Friday, 9 January 2026 17:05:05 Central European Standard Time Steven Price wrote:
> On 08/01/2026 14:19, Nicolas Frattaroli wrote:
> > To deal with the threaded interrupt handler and a suspend action
> > overlapping, the boolean panthor_irq::suspended is not sufficient.
> > 
> > Rework it into taking several different values depending on the current
> > state, and check it and set it within the IRQ helper functions.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.h | 40 +++++++++++++++++++++++++-------
> >  1 file changed, 32 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > index cf76a8abca76..a8c21a7eea05 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > @@ -61,6 +61,17 @@ enum panthor_device_pm_state {
> >  	PANTHOR_DEVICE_PM_STATE_SUSPENDING,
> >  };
> >  
> > +enum panthor_irq_state {
> > +	/** @PANTHOR_IRQ_STATE_ACTIVE: IRQ is active and ready to process events. */
> > +	PANTHOR_IRQ_STATE_ACTIVE = 0,
> > +	/** @PANTHOR_IRQ_STATE_PROCESSING: IRQ is currently processing events. */
> > +	PANTHOR_IRQ_STATE_PROCESSING,
> > +	/** @PANTHOR_IRQ_STATE_SUSPENDED: IRQ is suspended. */
> > +	PANTHOR_IRQ_STATE_SUSPENDED,
> > +	/** @PANTHOR_IRQ_STATE_SUSPENDING: IRQ is being suspended. */
> > +	PANTHOR_IRQ_STATE_SUSPENDING,
> > +};
> > +
> >  /**
> >   * struct panthor_irq - IRQ data
> >   *
> > @@ -76,8 +87,8 @@ struct panthor_irq {
> >  	/** @mask: Values to write to xxx_INT_MASK if active. */
> >  	u32 mask;
> >  
> > -	/** @suspended: Set to true when the IRQ is suspended. */
> > -	atomic_t suspended;
> > +	/** @state: one of &enum panthor_irq_state reflecting the current state. */
> > +	atomic_t state;
> >  
> >  	/** @mask_lock: protects modifications to _INT_MASK and @mask */
> >  	spinlock_t mask_lock;
> > @@ -415,7 +426,7 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
> >  												\
> >  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> >  												\
> > -	if (atomic_read(&pirq->suspended))							\
> > +	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_SUSPENDED)				\
> 
> Do we want to also catch the STATE_SUSPENDING case here? AFAICT in
> SUSPENDING we should be confident that _INT_MASK==0 so this shouldn't be a
> problem (_INT_STAT should read as 0 below). But we don't want interrupts
> during STATE_SUSPENDING so we might as well handle it.

Depends on what we want to happen here, I think. If the suspend handler
writing 0 to _INT_MASK does not also happen to clear _INT_STAT, then as
far as I can see, we may enter _irq_raw_handler, block for the lock as
suspend drops it, and then enter the function with STATE_SUSPENDING in
a context where we'd probably want to process the remaining interrupts
(i.e. suspend is at synchronize_irq() or about to be there, and we're
handling the last IRQ that got raised before mask was written to 0).

Let me know if my understanding here is correct, because I think in such
a case, it's reasonable to process that last IRQ instead of discarding it,
unless that's something frowned upon that I'm not aware of.

> 
> Thanks,
> Steve
> 
> >  		return IRQ_NONE;								\
> >  	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
> >  		return IRQ_NONE;								\
> > @@ -428,11 +439,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> >  {												\
> >  	struct panthor_irq *pirq = data;							\
> >  	struct panthor_device *ptdev = pirq->ptdev;						\
> > +	enum panthor_irq_state state;								\
> >  	irqreturn_t ret = IRQ_NONE;								\
> >  	u32 mask;										\
> >  												\
> >  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> >  		mask = pirq->mask;								\
> > +		atomic_cmpxchg(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE,				\
> > +			       PANTHOR_IRQ_STATE_PROCESSING);					\
> >  	}											\
> >  												\
> >  	while (true) {										\
> > @@ -446,11 +460,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> >  	}											\
> >  												\
> >  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> > -		if (!atomic_read(&pirq->suspended)) {						\
> > +		state = atomic_read(&pirq->state);						\
> > +		if (state != PANTHOR_IRQ_STATE_SUSPENDED &&					\
> > +		    state != PANTHOR_IRQ_STATE_SUSPENDING) {					\
> >  			/* Only restore the bits that were used and are still enabled */	\
> >  			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
> >  				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
> >  				  (mask & pirq->mask));						\
> > +			atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);			\
> >  		}										\
> >  	}											\
> >  												\
> > @@ -461,16 +478,17 @@ static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)
> >  {												\
> >  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
> >  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
> > +		atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING);				\
> >  	}											\
> >  	synchronize_irq(pirq->irq);								\
> > -	atomic_set(&pirq->suspended, true);							\
> > +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
> >  }												\
> >  												\
> >  static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
> >  {												\
> >  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> >  												\
> > -	atomic_set(&pirq->suspended, false);							\
> > +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);					\
> >  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
> >  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
> >  }												\
> > @@ -494,19 +512,25 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
> >  												\
> >  static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
> >  {												\
> > +	enum panthor_irq_state state;								\
> > +												\
> >  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> >  												\
> >  	pirq->mask |= mask;									\
> > -	if (!atomic_read(&pirq->suspended))							\
> > +	state = atomic_read(&pirq->state);							\
> > +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
> >  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
> >  }												\
> >  												\
> >  static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
> >  {												\
> > +	enum panthor_irq_state state;								\
> > +												\
> >  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
> >  												\
> >  	pirq->mask &= ~mask;									\
> > -	if (!atomic_read(&pirq->suspended))							\
> > +	state = atomic_read(&pirq->state);							\
> > +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
> >  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
> >  }
> >  
> > 
> 
> 






* Re: [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state
  2026-01-12 13:39     ` Nicolas Frattaroli
@ 2026-01-12 14:09       ` Steven Price
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Price @ 2026-01-12 14:09 UTC (permalink / raw)
  To: Nicolas Frattaroli, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Chia-I Wu, Karunika Choo
  Cc: kernel, linux-kernel, dri-devel

On 12/01/2026 13:39, Nicolas Frattaroli wrote:
> On Friday, 9 January 2026 17:05:05 Central European Standard Time Steven Price wrote:
>> On 08/01/2026 14:19, Nicolas Frattaroli wrote:
>>> To deal with the threaded interrupt handler and a suspend action
>>> overlapping, the boolean panthor_irq::suspended is not sufficient.
>>>
>>> Rework it into taking several different values depending on the current
>>> state, and check it and set it within the IRQ helper functions.
>>>
>>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
>>> ---
>>>  drivers/gpu/drm/panthor/panthor_device.h | 40 +++++++++++++++++++++++++-------
>>>  1 file changed, 32 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
>>> index cf76a8abca76..a8c21a7eea05 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_device.h
>>> +++ b/drivers/gpu/drm/panthor/panthor_device.h
>>> @@ -61,6 +61,17 @@ enum panthor_device_pm_state {
>>>  	PANTHOR_DEVICE_PM_STATE_SUSPENDING,
>>>  };
>>>  
>>> +enum panthor_irq_state {
>>> +	/** @PANTHOR_IRQ_STATE_ACTIVE: IRQ is active and ready to process events. */
>>> +	PANTHOR_IRQ_STATE_ACTIVE = 0,
>>> +	/** @PANTHOR_IRQ_STATE_PROCESSING: IRQ is currently processing events. */
>>> +	PANTHOR_IRQ_STATE_PROCESSING,
>>> +	/** @PANTHOR_IRQ_STATE_SUSPENDED: IRQ is suspended. */
>>> +	PANTHOR_IRQ_STATE_SUSPENDED,
>>> +	/** @PANTHOR_IRQ_STATE_SUSPENDING: IRQ is being suspended. */
>>> +	PANTHOR_IRQ_STATE_SUSPENDING,
>>> +};
>>> +
>>>  /**
>>>   * struct panthor_irq - IRQ data
>>>   *
>>> @@ -76,8 +87,8 @@ struct panthor_irq {
>>>  	/** @mask: Values to write to xxx_INT_MASK if active. */
>>>  	u32 mask;
>>>  
>>> -	/** @suspended: Set to true when the IRQ is suspended. */
>>> -	atomic_t suspended;
>>> +	/** @state: one of &enum panthor_irq_state reflecting the current state. */
>>> +	atomic_t state;
>>>  
>>>  	/** @mask_lock: protects modifications to _INT_MASK and @mask */
>>>  	spinlock_t mask_lock;
>>> @@ -415,7 +426,7 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
>>>  												\
>>>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>>>  												\
>>> -	if (atomic_read(&pirq->suspended))							\
>>> +	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_SUSPENDED)				\
>>
>> Do we want to also catch the STATE_SUSPENDING case here? AFAICT in
>> SUSPENDING we should be confident that _INT_MASK==0 so this shouldn't be a
>> problem (_INT_STAT should read as 0 below). But we don't want interrupts
>> during STATE_SUSPENDING so we might as well handle it.
> 
> Depends on what we want to happen here, I think. If the suspend handler
> writing 0 to _INT_MASK does not also happen to clear _INT_STAT, then as

_STAT is defined by the hardware to be the bitwise AND of _RAWSTAT and
_MASK. So we shouldn't see _INT_STAT being anything other than 0.

> far as I can see, we may enter _irq_raw_handler, block for the lock as
> suspend drops it, and then enter the function with STATE_SUSPENDING in
> a context where we'd probably want to process the remaining interrupts
> (i.e. suspend is at synchronize_irq() or about to be there, and we're
> handling the last IRQ that got raised before mask was written to 0).
> 
> Let me know if my understanding here is correct, because I think in such
> a case, it's reasonable to process that last IRQ instead of discarding it,
> unless that's something frowned upon that I'm not aware of.

My understanding is that once we have set the status to _SUSPENDING we
don't expect any further interrupts to be fired, but there may be
outstanding interrupts we're still handling. I believe this has to be
the case because AFAIK synchronize_irq() only waits for existing IRQs to
be handled - if another IRQ is triggered during the synchronize_irq()
call then it won't (necessarily) be waited for.

I don't believe the code you have is actually wrong (because the
hardware will mask out any interrupts so _STAT will be zero), but I
personally thought it would be clearer to explicitly handle the
_SUSPENDING state.

Thanks,
Steve

>>
>> Thanks,
>> Steve
>>
>>>  		return IRQ_NONE;								\
>>>  	if (!gpu_read(ptdev, __reg_prefix ## _INT_STAT))					\
>>>  		return IRQ_NONE;								\
>>> @@ -428,11 +439,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>>>  {												\
>>>  	struct panthor_irq *pirq = data;							\
>>>  	struct panthor_device *ptdev = pirq->ptdev;						\
>>> +	enum panthor_irq_state state;								\
>>>  	irqreturn_t ret = IRQ_NONE;								\
>>>  	u32 mask;										\
>>>  												\
>>>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
>>>  		mask = pirq->mask;								\
>>> +		atomic_cmpxchg(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE,				\
>>> +			       PANTHOR_IRQ_STATE_PROCESSING);					\
>>>  	}											\
>>>  												\
>>>  	while (true) {										\
>>> @@ -446,11 +460,14 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
>>>  	}											\
>>>  												\
>>>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
>>> -		if (!atomic_read(&pirq->suspended)) {						\
>>> +		state = atomic_read(&pirq->state);						\
>>> +		if (state != PANTHOR_IRQ_STATE_SUSPENDED &&					\
>>> +		    state != PANTHOR_IRQ_STATE_SUSPENDING) {					\
>>>  			/* Only restore the bits that were used and are still enabled */	\
>>>  			gpu_write(ptdev, __reg_prefix ## _INT_MASK,				\
>>>  				  gpu_read(ptdev, __reg_prefix ## _INT_MASK) |			\
>>>  				  (mask & pirq->mask));						\
>>> +			atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);			\
>>>  		}										\
>>>  	}											\
>>>  												\
>>> @@ -461,16 +478,17 @@ static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)
>>>  {												\
>>>  	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
>>>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, 0);				\
>>> +		atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING);				\
>>>  	}											\
>>>  	synchronize_irq(pirq->irq);								\
>>> -	atomic_set(&pirq->suspended, true);							\
>>> +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
>>>  }												\
>>>  												\
>>>  static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
>>>  {												\
>>>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>>>  												\
>>> -	atomic_set(&pirq->suspended, false);							\
>>> +	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);					\
>>>  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_CLEAR, pirq->mask);				\
>>>  	gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);				\
>>>  }												\
>>> @@ -494,19 +512,25 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
>>>  												\
>>>  static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
>>>  {												\
>>> +	enum panthor_irq_state state;								\
>>> +												\
>>>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>>>  												\
>>>  	pirq->mask |= mask;									\
>>> -	if (!atomic_read(&pirq->suspended))							\
>>> +	state = atomic_read(&pirq->state);							\
>>> +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
>>>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
>>>  }												\
>>>  												\
>>>  static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
>>>  {												\
>>> +	enum panthor_irq_state state;								\
>>> +												\
>>>  	guard(spinlock_irqsave)(&pirq->mask_lock);						\
>>>  												\
>>>  	pirq->mask &= ~mask;									\
>>> -	if (!atomic_read(&pirq->suspended))							\
>>> +	state = atomic_read(&pirq->state);							\
>>> +	if (state != PANTHOR_IRQ_STATE_SUSPENDED && state != PANTHOR_IRQ_STATE_SUSPENDING)	\
>>>  		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, pirq->mask);			\
>>>  }
>>>  
>>>
>>
>>
> 
> 
> 
> 



end of thread, other threads:[~2026-01-12 14:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-08 14:19 [PATCH v7 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
2026-01-08 14:19 ` [PATCH v7 1/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
2026-01-09 15:59   ` Steven Price
2026-01-11 11:39     ` Nicolas Frattaroli
2026-01-08 14:19 ` [PATCH v7 2/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state Nicolas Frattaroli
2026-01-09 16:05   ` Steven Price
2026-01-12 13:39     ` Nicolas Frattaroli
2026-01-12 14:09       ` Steven Price
2026-01-08 14:19 ` [PATCH v7 3/4] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2026-01-08 14:19 ` [PATCH v7 4/4] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli
2026-01-09 16:23   ` Steven Price
2026-01-11 11:49     ` Nicolas Frattaroli
2026-01-12  9:15       ` Steven Price
