* [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency
@ 2026-06-25  9:36 Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 01/16] drm/panthor: Fix theoretical IOMEM access in suspended state Boris Brezillon
                   ` (16 more replies)
  0 siblings, 17 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon,
	sashiko-bot

Right now, panthor is one of the rare drivers to signal fences
from work items (not even from the threaded IRQ handler). We
tried moving the job_completion check to hardirq handlers like
other drivers do, but the handler then takes 20+ usecs, which is
slightly over the few usecs we usually expect from hardirq
handlers, and we're not sure we want to hold off the processing
of other interrupts for that long. So this series
just gets rid of the threaded-handler -> work_item indirection
and checks for job completion (and thus, fence signalling)
directly in the threaded handler.

Sorry for the high submission rate (v4 was sent this morning),
but I'd like to get the remaining blockers out of the way, and
sashiko keeps finding new legitimate issues :-).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Changes in v5:
- Add a fix for a theoretical IOMEM access in suspended state (patch 1)
- Make sure we don't delay a pending immediate tick in
  sched_resume_tick() (patch 2)
- Make sure we initialize panthor_irq::state properly in the irq_request
  helper
- Link to v4: https://lore.kernel.org/r/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@collabora.com

Changes in v4:
- Add a bunch of fixes for bugs reported by sashiko
- Link to v3: https://lore.kernel.org/r/20260623-panthor-signal-from-irq-v3-0-2ece396f8ee0@collabora.com

Changes in v3:
- Save/restore the irq state in the raw handler.
- Rename panthor_irq::mask_lock into panthor_irq::lock
- Use the __always_inline specifier on
  panthor_irq_default_threaded_handler()
- Use devm_request_threaded_irq() even when the threaded handler is
  NULL
- Drop the patch that dynamically enables request-related interrupts
  (FW-side race) after the polling period has expired
- Don't process FW events from the hardirq handler (too heavy for a
  hardirq handler according to our testing)
- Link to v2: https://lore.kernel.org/r/20260512-panthor-signal-from-irq-v2-0-95c614a739cb@collabora.com

Changes in v2:
- Fix commit message in patch 4
- Move devm_kasprintf() before panthor_irq_resume() in patch 3
- Fix erroneous lockdep_assert_held() in patch 6
- Make sure events_lock is held when calling
  csg_slot_sync_update_locked() in patch 6
- Restore a csg_slot_sync_update_locked() call in patch 7
- Fix a potential deadlock in patch 9
- Drop the IRQ coalescing patch (formerly patch 10)
- Change panthor_irq_request() so we don't have to define a dummy
  threaded handler, and we can let RT kernels move the hard handler
  to a thread
- Add patches to transition GPU event processing to the hard IRQ handler
- Link to v1: https://lore.kernel.org/r/20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com

---
Boris Brezillon (16):
      drm/panthor: Fix theoretical IOMEM access in suspended state
      drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()
      drm/panthor: Fix panthor_pwr_unplug()
      drm/panthor: Drop a needless check in panthor_fw_unplug()
      drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced
      drm/panthor: Interrupt group start/resumption if group_bind_locked() fails
      drm/panthor: Keep interrupts masked until they are needed
      drm/panthor: Make panthor_irq::state a non-atomic field
      drm/panthor: Move the register accessors before the IRQ helpers
      drm/panthor: Replace the panthor_irq macro machinery by inline helpers
      drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked()
      drm/panthor: Get rid of panthor_group::fatal_lock
      drm/panthor: Protect events processing with a separate spinlock
      drm/panthor: Don't defer job completion checks
      drm/panthor: Don't defer FW event processing
      drm/panthor: Automate CSG IRQ processing at group unbind time

 drivers/gpu/drm/panthor/panthor_device.h | 281 ++++++++--------
 drivers/gpu/drm/panthor/panthor_fw.c     |  24 +-
 drivers/gpu/drm/panthor/panthor_gpu.c    |  27 +-
 drivers/gpu/drm/panthor/panthor_mmu.c    |  44 +--
 drivers/gpu/drm/panthor/panthor_pwr.c    |  24 +-
 drivers/gpu/drm/panthor/panthor_sched.c  | 533 +++++++++++++++----------------
 6 files changed, 472 insertions(+), 461 deletions(-)
---
base-commit: ac5ac0acf11df04295eb1811066097b7022d6c7f
change-id: 20260429-panthor-signal-from-irq-d33684f4d292

Best regards,
-- 
Boris Brezillon <boris.brezillon@collabora.com>



* [PATCH v5 01/16] drm/panthor: Fix theoretical IOMEM access in suspended state
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick() Boris Brezillon
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon,
	sashiko-bot

In theory, our hardirq handler can be called while the device (and
thus the panthor_irq) is suspended, because the IRQ line is shared.
In practice though, in all the designs we've seen, the line is only
shared within the GPU, and because sub-component suspend state is
consistent (all-suspended or all-resumed), we shouldn't end up with
an interrupt triggered while we're suspended.

Fix the problem anyway, if nothing else, for our sanity.
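
To illustrate the ordering this patch enforces, here is a minimal userspace
sketch, not driver code. The model_* names, the register stand-ins and the
read counter are hypothetical, and the spinlock the real handler takes is
omitted. The point is only that the state check comes before any simulated
register read, and that the PROCESSING transition is rolled back when
INT_STAT turns out to be empty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; names loosely mirror the patch. */
enum model_irq_state {
	MODEL_IRQ_ACTIVE,
	MODEL_IRQ_PROCESSING,
	MODEL_IRQ_SUSPENDED,
};

enum model_irq_ret { MODEL_IRQ_NONE, MODEL_IRQ_WAKE_THREAD };

struct model_irq {
	enum model_irq_state state;
	uint32_t int_stat;      /* stands in for the INT_STAT register */
	uint32_t int_mask;      /* stands in for the INT_MASK register */
	unsigned int reg_reads; /* number of simulated IOMEM reads */
};

static uint32_t model_read_stat(struct model_irq *pirq)
{
	pirq->reg_reads++;
	return pirq->int_stat;
}

static enum model_irq_ret model_raw_handler(struct model_irq *pirq)
{
	assert(pirq);

	/* State check first: registers are only touched when ACTIVE. */
	if (pirq->state != MODEL_IRQ_ACTIVE)
		return MODEL_IRQ_NONE;

	pirq->state = MODEL_IRQ_PROCESSING;

	if (!model_read_stat(pirq)) {
		/* Nothing pending for us: undo the transition. */
		pirq->state = MODEL_IRQ_ACTIVE;
		return MODEL_IRQ_NONE;
	}

	/* Mask everything until the threaded handler is done. */
	pirq->int_mask = 0;
	return MODEL_IRQ_WAKE_THREAD;
}
```

With a suspended model_irq, model_raw_handler() returns without incrementing
reg_reads, which is the property the fix is after.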

Fixes: 0b2d86670a84 ("drm/panthor: Rework panthor_irq::suspended into panthor_irq::state")
Reported-by: sashiko-bot@kernel.org
Closes: https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@collabora.com?part=1
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 35679bfa1f3a..a39386bd6382 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -512,9 +512,6 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
 	struct panthor_irq *pirq = data;							\
 	enum panthor_irq_state old_state;							\
 												\
-	if (!gpu_read(pirq->iomem, INT_STAT))							\
-		return IRQ_NONE;								\
-												\
 	guard(spinlock_irqsave)(&pirq->mask_lock);						\
 	old_state = atomic_cmpxchg(&pirq->state,						\
 				   PANTHOR_IRQ_STATE_ACTIVE,					\
@@ -522,6 +519,13 @@ static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)
 	if (old_state != PANTHOR_IRQ_STATE_ACTIVE)						\
 		return IRQ_NONE;								\
 												\
+	if (!gpu_read(pirq->iomem, INT_STAT)) {							\
+		atomic_cmpxchg(&pirq->state,							\
+			       PANTHOR_IRQ_STATE_PROCESSING,					\
+			       PANTHOR_IRQ_STATE_ACTIVE);					\
+		return IRQ_NONE;								\
+	}											\
+												\
 	gpu_write(pirq->iomem, INT_MASK, 0);							\
 	return IRQ_WAKE_THREAD;									\
 }												\

-- 
2.54.0



* [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 01/16] drm/panthor: Fix theoretical IOMEM access in suspended state Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25 10:04   ` sashiko-bot
  2026-06-25  9:36 ` [PATCH v5 03/16] drm/panthor: Fix panthor_pwr_unplug() Boris Brezillon
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon,
	sashiko-bot

We schedule immediate ticks when we need to process events on CSGs,
but those immediate ticks don't change the resched_target because we
want the other groups to stay scheduled for the remainder of the GPU
timeslot they were given. Make sure these immediate ticks don't get
overruled by a sched_queue_delayed_work() that would delay the tick
execution.
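
The effect can be sketched with a small userspace model. It assumes the
queueing helper re-arms an already-pending work the way mod_delayed_work()
does; the model_* types and functions below are hypothetical stand-ins, not
the workqueue API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item. */
struct model_delayed_work {
	bool pending;
	unsigned long delay; /* jiffies until execution, 0 means immediate */
};

/* Assumed behaviour: a pending work gets its delay overwritten. */
static void model_mod_delayed_work(struct model_delayed_work *work,
				   unsigned long delay)
{
	work->pending = true;
	work->delay = delay;
}

/* Guarded variant: leave an already-pending tick alone. */
static void model_resume_tick(struct model_delayed_work *tick,
			      unsigned long delay)
{
	assert(tick);

	if (!tick->pending)
		model_mod_delayed_work(tick, delay);
}
```

In this model, a tick queued with a delay of 0 keeps that delay when
model_resume_tick() is later called with a non-zero delay, whereas the
unguarded model_mod_delayed_work() would push it back.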

Fixes: 99820b4b7e50 ("drm/panthor: Make sure we resume the tick when new jobs are submitted")
Reported-by: sashiko-bot@kernel.org
Closes: https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@collabora.com?part=9
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 5b34032deff8..1913bc8a6297 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -2668,7 +2668,14 @@ static void sched_resume_tick(struct panthor_device *ptdev)
 	else
 		delay_jiffies = 0;
 
-	sched_queue_delayed_work(sched, tick, delay_jiffies);
+	/* We schedule immediate ticks when we need to process events on CSGs,
+	 * but those don't change the resched_target because we want the other
+	 * groups to stay scheduled for the remainder of the GPU timeslot they
+	 * were given. Make sure those immediate ticks don't get overruled by
+	 * a sched_queue_delayed_work() that would delay the tick execution.
+	 */
+	if (!delayed_work_pending(&sched->tick_work))
+		sched_queue_delayed_work(sched, tick, delay_jiffies);
 }
 
 static void group_schedule_locked(struct panthor_group *group, u32 queue_mask)

-- 
2.54.0



* [PATCH v5 03/16] drm/panthor: Fix panthor_pwr_unplug()
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 01/16] drm/panthor: Fix theoretical IOMEM access in suspended state Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick() Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug() Boris Brezillon
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

We can't call panthor_pwr_irq_suspend() if the device is suspended,
or this leads to a hang when the IOMEM region is accessed while the
clks are disabled. Do what other sub-components do and conditionally
call panthor_pwr_irq_suspend() if we know the PWR regbank block is
accessible.

Fixes: c27787f2b77f ("drm/panthor: Introduce panthor_pwr API and power control framework")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_pwr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index 7c7f424a1436..090362bd700b 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -453,7 +453,8 @@ void panthor_pwr_unplug(struct panthor_device *ptdev)
 		return;
 
 	/* Make sure the IRQ handler is not running after that point. */
-	panthor_pwr_irq_suspend(&ptdev->pwr->irq);
+	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
+		panthor_pwr_irq_suspend(&ptdev->pwr->irq);
 
 	/* Wake-up all waiters. */
 	spin_lock_irqsave(&ptdev->pwr->reqs_lock, flags);

-- 
2.54.0



* [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug()
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (2 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 03/16] drm/panthor: Fix panthor_pwr_unplug() Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25 10:00   ` sashiko-bot
  2026-06-25  9:36 ` [PATCH v5 05/16] drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced Boris Brezillon
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

panthor_fw_unplug() is only called if we at least managed to initialize
the IRQ, so it's safe to drop the "is IRQ initialized" check.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 986151681b24..4fbddb9e18c8 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1279,9 +1279,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
 
 	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) {
 		/* Make sure the IRQ handler cannot be called after that point. */
-		if (ptdev->fw->irq.irq)
-			panthor_job_irq_suspend(&ptdev->fw->irq);
-
+		panthor_job_irq_suspend(&ptdev->fw->irq);
 		panthor_fw_stop(ptdev);
 	}
 

-- 
2.54.0



* [PATCH v5 05/16] drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (3 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug() Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 06/16] drm/panthor: Interrupt group start/resumption if group_bind_locked() fails Boris Brezillon
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon,
	sashiko-bot

A group ref is tied to the pending tiler_oom_work, so we need to release
it if the cancel was effective.
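
The reference accounting can be sketched in userspace as follows. This is a
simplified model under stated assumptions: queueing the work takes one group
reference, and the work function (not shown) drops it when it runs. All
model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a refcounted group with one work item. */
struct model_group {
	int refcount;
	bool oom_work_pending;
};

static void model_group_get(struct model_group *group)
{
	group->refcount++;
}

static void model_group_put(struct model_group *group)
{
	assert(group->refcount > 0);
	group->refcount--;
}

/* Queueing takes a reference that the work function would drop. */
static void model_queue_oom_work(struct model_group *group)
{
	if (group->oom_work_pending)
		return;

	model_group_get(group);
	group->oom_work_pending = true;
}

/* Returns true only if a pending work was actually cancelled. */
static bool model_cancel_work(struct model_group *group)
{
	bool was_pending = group->oom_work_pending;

	group->oom_work_pending = false;
	return was_pending;
}

/* The cancelled work will never run, so drop its reference here. */
static void model_group_unbind(struct model_group *group)
{
	if (model_cancel_work(group))
		model_group_put(group);
}
```

Without the put in model_group_unbind(), the reference taken at queue time
would never be released once the work is cancelled.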

Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
Reported-by: sashiko-bot@kernel.org
Closes: https://sashiko.dev/#/patchset/20260623-panthor-signal-from-irq-v3-0-2ece396f8ee0@collabora.com?part=7
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 1913bc8a6297..a9119aaddabc 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -1057,7 +1057,8 @@ group_unbind_locked(struct panthor_group *group)
 
 	/* Tiler OOM events will be re-issued next time the group is scheduled. */
 	atomic_set(&group->tiler_oom, 0);
-	cancel_work(&group->tiler_oom_work);
+	if (cancel_work(&group->tiler_oom_work))
+		group_put(group);
 
 	for (u32 i = 0; i < group->queue_count; i++)
 		group->queues[i]->doorbell_id = -1;

-- 
2.54.0



* [PATCH v5 06/16] drm/panthor: Interrupt group start/resumption if group_bind_locked() fails
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (4 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 05/16] drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 07/16] drm/panthor: Keep interrupts masked until they are needed Boris Brezillon
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon,
	sashiko-bot

group_bind_locked() can fail if the MMU block is stuck. This is normally
a reset situation, but by the time we reset the GPU, we might have
tried to resume a group that's not resident, which will probably trip
out the FW. So let's avoid that by bailing out when group_bind_locked()
returns an error. We don't even try to start more groups because the
GPU will be reset anyway.
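
A minimal userspace sketch of the bail-out, assuming one bind attempt per
slot. The bind_ret array stands in for the return codes of
group_bind_locked(), and the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-tick context. */
struct model_tick_ctx {
	uint32_t csg_upd_failed_mask;
	bool reset_scheduled;
	unsigned int started; /* groups programmed before any failure */
};

/* bind_ret[i] is the assumed result of binding a group to slot i. */
static void model_tick_apply(struct model_tick_ctx *ctx,
			     const int *bind_ret, size_t slot_count)
{
	assert(ctx && bind_ret && slot_count <= 32);

	for (size_t i = 0; i < slot_count; i++) {
		if (bind_ret[i]) {
			/* Flag the slot, request a reset and stop here. */
			ctx->reset_scheduled = true;
			ctx->csg_upd_failed_mask |= UINT32_C(1) << i;
			return;
		}

		/* Only reached when the group is resident on the slot. */
		ctx->started++;
	}
}
```

Slots after the failing one are left untouched in this model, matching the
commit message: there is no point starting more groups when a reset is
coming.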

Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
Reported-by: sashiko-bot@kernel.org
Closes: https://sashiko.dev/#/patchset/20260623-panthor-signal-from-irq-v3-0-2ece396f8ee0@collabora.com?part=7
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index a9119aaddabc..237f6a75e624 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -2369,7 +2369,13 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 
 			csg_iface = panthor_fw_get_csg_iface(ptdev, csg_id);
 			csg_slot = &sched->csg_slots[csg_id];
-			group_bind_locked(group, csg_id);
+			ret = group_bind_locked(group, csg_id);
+			if (ret) {
+				panthor_device_schedule_reset(ptdev);
+				ctx->csg_upd_failed_mask |= BIT(csg_id);
+				return;
+			}
+
 			csg_slot_prog_locked(ptdev, csg_id, new_csg_prio--);
 			csgs_upd_ctx_queue_reqs(ptdev, &upd_ctx, csg_id,
 						group->state == PANTHOR_CS_GROUP_SUSPENDED ?

-- 
2.54.0



* [PATCH v5 07/16] drm/panthor: Keep interrupts masked until they are needed
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (5 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 06/16] drm/panthor: Interrupt group start/resumption if group_bind_locked() fails Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 08/16] drm/panthor: Make panthor_irq::state a non-atomic field Boris Brezillon
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon, Shashiko

The autogenerated panthor_request_xx_irq() helpers unmask Mali
interrupts before we're sure we'll have a handler registered. For
non-shared IRQ lines, that's fine, but for shared ones, it might cause
an interrupt flood if the HW block raises an interrupt for any reason.

We could rework the calls in panthor_request_xx_irq(), but it's just
simpler to let the caller decide when they are ready to handle interrupts
and call panthor_pwr_irq_resume() themselves. While at it, rework the
prototype to let users call panthor_pwr_irq_enable_events() explicitly
instead of passing an initial mask to panthor_request_pwr_irq().
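
The resulting call order (request, then enable events, then resume) can be
sketched with a userspace model. The model_* names are hypothetical and the
int_mask_reg field merely stands in for the INT_MASK register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an IRQ block with a software mask copy. */
struct model_irq {
	uint32_t mask;         /* events the driver wants unmasked */
	uint32_t int_mask_reg; /* stands in for the INT_MASK register */
	bool active;
	bool handler_registered;
};

/* Request: everything stays masked, state starts out suspended. */
static int model_request_irq(struct model_irq *pirq)
{
	assert(pirq);
	pirq->mask = 0;
	pirq->int_mask_reg = 0;
	pirq->active = false;
	pirq->handler_registered = true;
	return 0;
}

/* Only touches the register when the IRQ is active. */
static void model_irq_enable_events(struct model_irq *pirq, uint32_t mask)
{
	pirq->mask |= mask;
	if (pirq->active)
		pirq->int_mask_reg = pirq->mask;
}

/* Resume: the caller decides when interrupts get unmasked. */
static void model_irq_resume(struct model_irq *pirq)
{
	assert(pirq->handler_registered);
	pirq->active = true;
	pirq->int_mask_reg = pirq->mask;
}
```

In this model the register only becomes non-zero in model_irq_resume(),
which asserts that a handler has been registered first.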

Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block")
Reported-by: Shashiko <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260623-panthor-signal-from-irq-v3-0-2ece396f8ee0@collabora.com?part=3
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 7 ++++---
 drivers/gpu/drm/panthor/panthor_fw.c     | 2 +-
 drivers/gpu/drm/panthor/panthor_gpu.c    | 3 ++-
 drivers/gpu/drm/panthor/panthor_mmu.c    | 9 +++++++--
 drivers/gpu/drm/panthor/panthor_pwr.c    | 7 ++++---
 5 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index a39386bd6382..0fda64fbe5f2 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -588,14 +588,15 @@ static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)
 												\
 static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 					      struct panthor_irq *pirq,				\
-					      int irq, u32 mask, void __iomem *iomem)		\
+					      int irq, void __iomem *iomem)			\
 {												\
 	pirq->ptdev = ptdev;									\
 	pirq->irq = irq;									\
-	pirq->mask = mask;									\
+	pirq->mask = 0;										\
 	pirq->iomem = iomem;									\
 	spin_lock_init(&pirq->mask_lock);							\
-	panthor_ ## __name ## _irq_resume(pirq);						\
+	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
+	gpu_write(pirq->iomem, INT_MASK, 0);							\
 												\
 	return devm_request_threaded_irq(ptdev->base.dev, irq,					\
 					 panthor_ ## __name ## _irq_raw_handler,		\
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 4fbddb9e18c8..de8e6689a869 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1474,7 +1474,7 @@ int panthor_fw_init(struct panthor_device *ptdev)
 	if (irq <= 0)
 		return -ENODEV;
 
-	ret = panthor_request_job_irq(ptdev, &fw->irq, irq, 0,
+	ret = panthor_request_job_irq(ptdev, &fw->irq, irq,
 				      ptdev->iomem + JOB_INT_BASE);
 	if (ret) {
 		drm_err(&ptdev->base, "failed to request job irq");
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index e52c5675981f..c013d6bf9a59 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -170,11 +170,12 @@ int panthor_gpu_init(struct panthor_device *ptdev)
 		return irq;
 
 	ret = panthor_request_gpu_irq(ptdev, &ptdev->gpu->irq, irq,
-				      GPU_INTERRUPTS_MASK,
 				      ptdev->iomem + GPU_INT_BASE);
 	if (ret)
 		return ret;
 
+	panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_INTERRUPTS_MASK);
+	panthor_gpu_irq_resume(&ptdev->gpu->irq);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 31cc57029c12..1fef3c5c1b50 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -3406,7 +3406,6 @@ int panthor_mmu_init(struct panthor_device *ptdev)
 		return -ENODEV;
 
 	ret = panthor_request_mmu_irq(ptdev, &mmu->irq, irq,
-				      panthor_mmu_fault_mask(ptdev, ~0),
 				      ptdev->iomem + MMU_INT_BASE);
 	if (ret)
 		return ret;
@@ -3424,7 +3423,13 @@ int panthor_mmu_init(struct panthor_device *ptdev)
 		ptdev->gpu_info.mmu_features |= BITS_PER_LONG;
 	}
 
-	return drmm_add_action_or_reset(&ptdev->base, panthor_mmu_release_wq, mmu->vm.wq);
+	ret = drmm_add_action_or_reset(&ptdev->base, panthor_mmu_release_wq, mmu->vm.wq);
+	if (ret)
+		return ret;
+
+	panthor_mmu_irq_enable_events(&mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_mmu_irq_resume(&mmu->irq);
+	return 0;
 }
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index 090362bd700b..f2c2c3000590 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -484,12 +484,13 @@ int panthor_pwr_init(struct panthor_device *ptdev)
 	if (irq < 0)
 		return irq;
 
-	err = panthor_request_pwr_irq(
-		ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK,
-		pwr->iomem + PWR_INT_BASE);
+	err = panthor_request_pwr_irq(ptdev, &pwr->irq, irq,
+				      pwr->iomem + PWR_INT_BASE);
 	if (err)
 		return err;
 
+	panthor_pwr_irq_enable_events(&pwr->irq, PWR_INTERRUPTS_MASK);
+	panthor_pwr_irq_resume(&pwr->irq);
 	return 0;
 }
 

-- 
2.54.0



* [PATCH v5 08/16] drm/panthor: Make panthor_irq::state a non-atomic field
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (6 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 07/16] drm/panthor: Keep interrupts masked until they are needed Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 09/16] drm/panthor: Move the register accessors before the IRQ helpers Boris Brezillon
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

The only place where panthor_irq::state is accessed without
panthor_irq::mask_lock held is in the prologue of _irq_suspend(),
which is not really a fast-path. So let's simplify things by assuming
panthor_irq::state must always be accessed with the mask_lock held,
and add a scoped_guard() in _irq_suspend().

While at it, rename the lock so it's clear it doesn't just protect
access to the panthor_irq::mask or the INT_MASK register.
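
As a rough illustration of the "state is only accessed with the lock held"
rule, here is a single-threaded userspace sketch. The model_lock type is a
hypothetical stand-in that only tracks whether the lock is held, in the
spirit of lockdep_assert_held(); it provides no real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

enum model_irq_state {
	MODEL_IRQ_ACTIVE,
	MODEL_IRQ_PROCESSING,
	MODEL_IRQ_SUSPENDING,
	MODEL_IRQ_SUSPENDED,
};

/* Hypothetical lock that only records whether it is held. */
struct model_lock {
	bool held;
};

struct model_irq {
	struct model_lock lock;
	enum model_irq_state state; /* must be accessed with lock held */
};

static void model_lock_acquire(struct model_lock *lock)
{
	assert(!lock->held);
	lock->held = true;
}

static void model_lock_release(struct model_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/* Every state update goes through here and checks the lock. */
static void model_set_state(struct model_irq *pirq, enum model_irq_state s)
{
	assert(pirq->lock.held);
	pirq->state = s;
}

static void model_irq_suspend(struct model_irq *pirq)
{
	model_lock_acquire(&pirq->lock);
	model_set_state(pirq, MODEL_IRQ_SUSPENDING);
	model_lock_release(&pirq->lock);

	/* The real code waits for in-flight handlers here, unlocked. */

	model_lock_acquire(&pirq->lock);
	model_set_state(pirq, MODEL_IRQ_SUSPENDED);
	model_lock_release(&pirq->lock);
}
```

The second locked section mirrors the scoped_guard() this patch adds around
the final SUSPENDED transition.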

Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 61 +++++++++++++++-----------------
 1 file changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 0fda64fbe5f2..4e5f7b0fb53f 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -92,17 +92,21 @@ struct panthor_irq {
 	u32 mask;
 
 	/**
-	 * @mask_lock: protects modifications to _INT_MASK and @mask.
+	 * @lock: protects modifications to _INT_MASK, @mask and @state.
 	 *
 	 * In paths where _INT_MASK is updated based on a state
 	 * transition/check, it's crucial for the state update/check to be
 	 * inside the locked section, otherwise it introduces a race window
 	 * leading to potential _INT_MASK inconsistencies.
 	 */
-	spinlock_t mask_lock;
+	spinlock_t lock;
 
-	/** @state: one of &enum panthor_irq_state reflecting the current state. */
-	atomic_t state;
+	/**
+	 * @state: one of &enum panthor_irq_state reflecting the current state.
+	 *
+	 * Must be accessed with lock held.
+	 */
+	enum panthor_irq_state state;
 };
 
 /**
@@ -510,22 +514,15 @@ const char *panthor_exception_name(struct panthor_device *ptdev,
 static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)			\
 {												\
 	struct panthor_irq *pirq = data;							\
-	enum panthor_irq_state old_state;							\
 												\
-	guard(spinlock_irqsave)(&pirq->mask_lock);						\
-	old_state = atomic_cmpxchg(&pirq->state,						\
-				   PANTHOR_IRQ_STATE_ACTIVE,					\
-				   PANTHOR_IRQ_STATE_PROCESSING);				\
-	if (old_state != PANTHOR_IRQ_STATE_ACTIVE)						\
+	guard(spinlock_irqsave)(&pirq->lock);							\
+	if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)						\
 		return IRQ_NONE;								\
 												\
-	if (!gpu_read(pirq->iomem, INT_STAT)) {							\
-		atomic_cmpxchg(&pirq->state,							\
-			       PANTHOR_IRQ_STATE_PROCESSING,					\
-			       PANTHOR_IRQ_STATE_ACTIVE);					\
+	if (!gpu_read(pirq->iomem, INT_STAT))							\
 		return IRQ_NONE;								\
-	}											\
 												\
+	pirq->state = PANTHOR_IRQ_STATE_PROCESSING;						\
 	gpu_write(pirq->iomem, INT_MASK, 0);							\
 	return IRQ_WAKE_THREAD;									\
 }												\
@@ -554,14 +551,11 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 		ret = IRQ_HANDLED;								\
 	}											\
 												\
-	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
-		enum panthor_irq_state old_state;						\
-												\
-		old_state = atomic_cmpxchg(&pirq->state,					\
-					   PANTHOR_IRQ_STATE_PROCESSING,			\
-					   PANTHOR_IRQ_STATE_ACTIVE);				\
-		if (old_state == PANTHOR_IRQ_STATE_PROCESSING)					\
+	scoped_guard(spinlock_irqsave, &pirq->lock) {						\
+		if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) {				\
+			pirq->state = PANTHOR_IRQ_STATE_ACTIVE;					\
 			gpu_write(pirq->iomem, INT_MASK, pirq->mask);				\
+		}										\
 	}											\
 												\
 	return ret;										\
@@ -569,19 +563,20 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
 												\
 static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)			\
 {												\
-	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {					\
-		atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING);				\
+	scoped_guard(spinlock_irqsave, &pirq->lock) {						\
+		pirq->state = PANTHOR_IRQ_STATE_SUSPENDING;					\
 		gpu_write(pirq->iomem, INT_MASK, 0);						\
 	}											\
 	synchronize_irq(pirq->irq);								\
-	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
+	scoped_guard(spinlock_irqsave, &pirq->lock)						\
+		pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;					\
 }												\
 												\
 static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
 {												\
-	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+	guard(spinlock_irqsave)(&pirq->lock);							\
 												\
-	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE);					\
+	pirq->state = PANTHOR_IRQ_STATE_ACTIVE;							\
 	gpu_write(pirq->iomem, INT_CLEAR, pirq->mask);						\
 	gpu_write(pirq->iomem, INT_MASK, pirq->mask);						\
 }												\
@@ -594,8 +589,8 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 	pirq->irq = irq;									\
 	pirq->mask = 0;										\
 	pirq->iomem = iomem;									\
-	spin_lock_init(&pirq->mask_lock);							\
-	atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED);					\
+	spin_lock_init(&pirq->lock);								\
+	pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;						\
 	gpu_write(pirq->iomem, INT_MASK, 0);							\
 												\
 	return devm_request_threaded_irq(ptdev->base.dev, irq,					\
@@ -607,7 +602,7 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 												\
 static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
 {												\
-	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+	guard(spinlock_irqsave)(&pirq->lock);							\
 	pirq->mask |= mask;									\
 												\
 	/* The only situation where we need to write the new mask is if the IRQ is active.	\
@@ -615,13 +610,13 @@ static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *
 	 * on the PROCESSING -> ACTIVE transition.						\
 	 * If the IRQ is suspended/suspending, the mask is restored at resume time.		\
 	 */											\
-	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE)				\
+	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)						\
 		gpu_write(pirq->iomem, INT_MASK, pirq->mask);					\
 }												\
 												\
 static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
 {												\
-	guard(spinlock_irqsave)(&pirq->mask_lock);						\
+	guard(spinlock_irqsave)(&pirq->lock);							\
 	pirq->mask &= ~mask;									\
 												\
 	/* The only situation where we need to write the new mask is if the IRQ is active.	\
@@ -629,7 +624,7 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
 	 * on the PROCESSING -> ACTIVE transition.						\
 	 * If the IRQ is suspended/suspending, the mask is restored at resume time.		\
 	 */											\
-	if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE)				\
+	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)						\
 		gpu_write(pirq->iomem, INT_MASK, pirq->mask);					\
 }
 

-- 
2.54.0



* [PATCH v5 09/16] drm/panthor: Move the register accessors before the IRQ helpers
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (7 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 08/16] drm/panthor: Make panthor_irq::state a non-atomic field Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 10/16] drm/panthor: Replace the panthor_irq macro machinery by inline helpers Boris Brezillon
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

We're about to add an IRQ inline helper using gpu_read(). Move things
around to avoid forward declarations.

No functional changes.

Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 142 +++++++++++++++----------------
 1 file changed, 71 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 4e5f7b0fb53f..b102ea77fd1a 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -495,6 +495,77 @@ panthor_exception_is_fault(u32 exception_code)
 const char *panthor_exception_name(struct panthor_device *ptdev,
 				   u32 exception_code);
 
+static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
+{
+	writel(data, iomem + reg);
+}
+
+static inline u32 gpu_read(void __iomem *iomem, u32 reg)
+{
+	return readl(iomem + reg);
+}
+
+static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
+{
+	return readl_relaxed(iomem + reg);
+}
+
+static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
+{
+	gpu_write(iomem, reg, lower_32_bits(data));
+	gpu_write(iomem, reg + 4, upper_32_bits(data));
+}
+
+static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
+{
+	return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
+}
+
+static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
+{
+	return (gpu_read_relaxed(iomem, reg) |
+		((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
+}
+
+static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
+{
+	u32 lo, hi1, hi2;
+	do {
+		hi1 = gpu_read(iomem, reg + 4);
+		lo = gpu_read(iomem, reg);
+		hi2 = gpu_read(iomem, reg + 4);
+	} while (hi1 != hi2);
+	return lo | ((u64)hi2 << 32);
+}
+
+#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)	\
+	read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false,	\
+			  iomem, reg)
+
+#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us,		\
+				     timeout_us)				\
+	read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us,	\
+				 false, iomem, reg)
+
+#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)	\
+	read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false,	\
+			  iomem, reg)
+
+#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us,		\
+				       timeout_us)				\
+	read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us,	\
+				 false, iomem, reg)
+
+#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us,	\
+					     timeout_us)			\
+	read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us,		\
+				 timeout_us, false, iomem, reg)
+
+#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us,	\
+					timeout_us)				\
+	read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us,	\
+			  false, iomem, reg)
+
 #define INT_RAWSTAT 0x0
 #define INT_CLEAR   0x4
 #define INT_MASK    0x8
@@ -630,75 +701,4 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
 
 extern struct workqueue_struct *panthor_cleanup_wq;
 
-static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
-{
-	writel(data, iomem + reg);
-}
-
-static inline u32 gpu_read(void __iomem *iomem, u32 reg)
-{
-	return readl(iomem + reg);
-}
-
-static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
-{
-	return readl_relaxed(iomem + reg);
-}
-
-static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
-{
-	gpu_write(iomem, reg, lower_32_bits(data));
-	gpu_write(iomem, reg + 4, upper_32_bits(data));
-}
-
-static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
-{
-	return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
-}
-
-static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
-{
-	return (gpu_read_relaxed(iomem, reg) |
-		((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
-}
-
-static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
-{
-	u32 lo, hi1, hi2;
-	do {
-		hi1 = gpu_read(iomem, reg + 4);
-		lo = gpu_read(iomem, reg);
-		hi2 = gpu_read(iomem, reg + 4);
-	} while (hi1 != hi2);
-	return lo | ((u64)hi2 << 32);
-}
-
-#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)	\
-	read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false,	\
-			  iomem, reg)
-
-#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us,		\
-				     timeout_us)				\
-	read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us,	\
-				 false, iomem, reg)
-
-#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)	\
-	read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false,	\
-			  iomem, reg)
-
-#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us,		\
-				       timeout_us)				\
-	read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us,	\
-				 false, iomem, reg)
-
-#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us,	\
-					     timeout_us)			\
-	read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us,		\
-				 timeout_us, false, iomem, reg)
-
-#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us,	\
-					timeout_us)				\
-	read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us,	\
-			  false, iomem, reg)
-
 #endif

-- 
2.54.0



* [PATCH v5 10/16] drm/panthor: Replace the panthor_irq macro machinery by inline helpers
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (8 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 09/16] drm/panthor: Move the register accessors before the IRQ helpers Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked() Boris Brezillon
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

Now that panthor_irq contains the iomem region, there's no real need
for the macro-based panthor_irq helper generation logic. We can just
provide inline helpers that do the same and let the compiler optimize
away the indirect function calls. The only extra annoyance is that we
have to open-code each panthor_xxx_irq_threaded_handler(), but those
are single-line functions, so it's acceptable.

While at it, change the prototype of the IRQ handlers to take a
panthor_irq instead of a panthor_device, since that's the object
passed around by the panthor_irq helpers, and the panthor_device can
be retrieved from pirq->ptdev.

Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
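Note for reviewers (not part of the commit): below is a rough userspace
sketch of the pattern, in case it helps to see it outside the diff. It is
not driver code. The fake_irq struct, the simulated registers and the
job_* names are all made up for illustration, locking is left out, and
INT_STAT is modelled as RAWSTAT & MASK, which is an assumption of this
model. It shows a generic always-inline threaded handler taking the slow
handler as a function pointer, wrapped by a one-line per-block function,
plus the ACTIVE -> PROCESSING -> ACTIVE transition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types, not the driver's definitions. */
enum fake_irq_state { FAKE_IRQ_ACTIVE, FAKE_IRQ_PROCESSING, FAKE_IRQ_SUSPENDED };

struct fake_irq {
	enum fake_irq_state state;
	uint32_t mask;    /* events the driver wants to see */
	uint32_t rawstat; /* simulated INT_RAWSTAT register */
	uint32_t hw_mask; /* simulated INT_MASK register */
	uint32_t handled; /* bits seen by the slow handler so far */
};

typedef void (*fake_slow_handler_t)(struct fake_irq *, uint32_t);

/* Raw half: claim the interrupt and mask it at the source. */
static inline bool fake_irq_raw_handler(struct fake_irq *pirq)
{
	if (pirq->state != FAKE_IRQ_ACTIVE)
		return false;

	/* Assumed model: INT_STAT == INT_RAWSTAT & INT_MASK. */
	if (!(pirq->rawstat & pirq->hw_mask))
		return false;

	pirq->state = FAKE_IRQ_PROCESSING;
	pirq->hw_mask = 0;
	return true;
}

/*
 * Generic threaded half. Because it is forced inline and every caller
 * passes a compile-time constant function, the compiler can turn the
 * indirect call into a direct (or inlined) one.
 */
static inline __attribute__((always_inline)) bool
fake_irq_threaded_handler(struct fake_irq *pirq, fake_slow_handler_t slow_handler)
{
	bool handled = false;

	for (;;) {
		uint32_t status = pirq->rawstat & pirq->mask;

		if (!status)
			break;

		slow_handler(pirq, status);
		handled = true;
	}

	/* Only restore the mask if nobody suspended the IRQ in the meantime. */
	if (pirq->state == FAKE_IRQ_PROCESSING) {
		pirq->state = FAKE_IRQ_ACTIVE;
		pirq->hw_mask = pirq->mask;
	}

	return handled;
}

/* Per-block slow handler: acknowledge the events and record them. */
static void job_slow_handler(struct fake_irq *pirq, uint32_t status)
{
	pirq->rawstat &= ~status; /* plays the role of an INT_CLEAR write */
	pirq->handled |= status;
}

/* The open-coded single-line wrapper each block has to provide. */
static bool job_threaded_handler(struct fake_irq *pirq)
{
	return fake_irq_threaded_handler(pirq, job_slow_handler);
}
```

With mask = 0x3 and rawstat = 0x5, the raw handler claims the interrupt
and zeroes the simulated INT_MASK. The wrapper then handles bit 0 only,
leaves the unmasked bit 2 pending in rawstat, and restores the mask on
the way back to the active state.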
 drivers/gpu/drm/panthor/panthor_device.h | 247 +++++++++++++++----------------
 drivers/gpu/drm/panthor/panthor_fw.c     |  22 ++-
 drivers/gpu/drm/panthor/panthor_gpu.c    |  28 ++--
 drivers/gpu/drm/panthor/panthor_mmu.c    |  39 ++---
 drivers/gpu/drm/panthor/panthor_pwr.c    |  24 +--
 5 files changed, 188 insertions(+), 172 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index b102ea77fd1a..b55a3f9edd41 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -571,132 +571,127 @@ static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
 #define INT_MASK    0x8
 #define INT_STAT    0xc
 
-/**
- * PANTHOR_IRQ_HANDLER() - Define interrupt handlers and the interrupt
- * registration function.
- *
- * The boiler-plate to gracefully deal with shared interrupts is
- * auto-generated. All you have to do is call PANTHOR_IRQ_HANDLER()
- * just after the actual handler. The handler prototype is:
- *
- * void (*handler)(struct panthor_device *, u32 status);
- */
-#define PANTHOR_IRQ_HANDLER(__name, __handler)							\
-static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data)			\
-{												\
-	struct panthor_irq *pirq = data;							\
-												\
-	guard(spinlock_irqsave)(&pirq->lock);							\
-	if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)						\
-		return IRQ_NONE;								\
-												\
-	if (!gpu_read(pirq->iomem, INT_STAT))							\
-		return IRQ_NONE;								\
-												\
-	pirq->state = PANTHOR_IRQ_STATE_PROCESSING;						\
-	gpu_write(pirq->iomem, INT_MASK, 0);							\
-	return IRQ_WAKE_THREAD;									\
-}												\
-												\
-static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *data)		\
-{												\
-	struct panthor_irq *pirq = data;							\
-	struct panthor_device *ptdev = pirq->ptdev;						\
-	irqreturn_t ret = IRQ_NONE;								\
-												\
-	while (true) {										\
-		/* It's safe to access pirq->mask without the lock held here. If a new		\
-		 * event gets added to the mask and the corresponding IRQ is pending,		\
-		 * we'll process it right away instead of adding an extra raw -> threaded	\
-		 * round trip. If an event is removed and the status bit is set, it will	\
-		 * be ignored, just like it would have been if the mask had been adjusted	\
-		 * right before the HW event kicks in. TLDR; it's all expected races we're	\
-		 * covered for.									\
-		 */										\
-		u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;			\
-												\
-		if (!status)									\
-			break;									\
-												\
-		__handler(ptdev, status);							\
-		ret = IRQ_HANDLED;								\
-	}											\
-												\
-	scoped_guard(spinlock_irqsave, &pirq->lock) {						\
-		if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) {				\
-			pirq->state = PANTHOR_IRQ_STATE_ACTIVE;					\
-			gpu_write(pirq->iomem, INT_MASK, pirq->mask);				\
-		}										\
-	}											\
-												\
-	return ret;										\
-}												\
-												\
-static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq)			\
-{												\
-	scoped_guard(spinlock_irqsave, &pirq->lock) {						\
-		pirq->state = PANTHOR_IRQ_STATE_SUSPENDING;					\
-		gpu_write(pirq->iomem, INT_MASK, 0);						\
-	}											\
-	synchronize_irq(pirq->irq);								\
-	scoped_guard(spinlock_irqsave, &pirq->lock)						\
-		pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;					\
-}												\
-												\
-static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq)			\
-{												\
-	guard(spinlock_irqsave)(&pirq->lock);							\
-												\
-	pirq->state = PANTHOR_IRQ_STATE_ACTIVE;							\
-	gpu_write(pirq->iomem, INT_CLEAR, pirq->mask);						\
-	gpu_write(pirq->iomem, INT_MASK, pirq->mask);						\
-}												\
-												\
-static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
-					      struct panthor_irq *pirq,				\
-					      int irq, void __iomem *iomem)			\
-{												\
-	pirq->ptdev = ptdev;									\
-	pirq->irq = irq;									\
-	pirq->mask = 0;										\
-	pirq->iomem = iomem;									\
-	spin_lock_init(&pirq->lock);								\
-	pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;						\
-	gpu_write(pirq->iomem, INT_MASK, 0);							\
-												\
-	return devm_request_threaded_irq(ptdev->base.dev, irq,					\
-					 panthor_ ## __name ## _irq_raw_handler,		\
-					 panthor_ ## __name ## _irq_threaded_handler,		\
-					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
-					 pirq);							\
-}												\
-												\
-static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask)	\
-{												\
-	guard(spinlock_irqsave)(&pirq->lock);							\
-	pirq->mask |= mask;									\
-												\
-	/* The only situation where we need to write the new mask is if the IRQ is active.	\
-	 * If it's being processed, the mask will be restored for us in _irq_threaded_handler()	\
-	 * on the PROCESSING -> ACTIVE transition.						\
-	 * If the IRQ is suspended/suspending, the mask is restored at resume time.		\
-	 */											\
-	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)						\
-		gpu_write(pirq->iomem, INT_MASK, pirq->mask);					\
-}												\
-												\
-static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
-{												\
-	guard(spinlock_irqsave)(&pirq->lock);							\
-	pirq->mask &= ~mask;									\
-												\
-	/* The only situation where we need to write the new mask is if the IRQ is active.	\
-	 * If it's being processed, the mask will be restored for us in _irq_threaded_handler()	\
-	 * on the PROCESSING -> ACTIVE transition.						\
-	 * If the IRQ is suspended/suspending, the mask is restored at resume time.		\
-	 */											\
-	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)						\
-		gpu_write(pirq->iomem, INT_MASK, pirq->mask);					\
+static inline irqreturn_t panthor_irq_default_raw_handler(int irq, void *data)
+{
+	struct panthor_irq *pirq = data;
+
+	guard(spinlock_irqsave)(&pirq->lock);
+	if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
+		return IRQ_NONE;
+
+	if (!gpu_read(pirq->iomem, INT_STAT))
+		return IRQ_NONE;
+
+	pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
+	gpu_write(pirq->iomem, INT_MASK, 0);
+	return IRQ_WAKE_THREAD;
+}
+
+static __always_inline irqreturn_t
+panthor_irq_default_threaded_handler(void *data,
+				     void (*slow_handler)(struct panthor_irq *, u32))
+{
+	struct panthor_irq *pirq = data;
+	irqreturn_t ret = IRQ_NONE;
+
+	while (true) {
+		/* It's safe to access pirq->mask without the lock held here. If a new
+		 * event gets added to the mask and the corresponding IRQ is pending,
+		 * we'll process it right away instead of adding an extra raw -> threaded
+		 * round trip. If an event is removed and the status bit is set, it will
+		 * be ignored, just like it would have been if the mask had been adjusted
+		 * right before the HW event kicks in. TLDR; it's all expected races we're
+		 * covered for.
+		 */
+		u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;
+
+		if (!status)
+			break;
+
+		slow_handler(pirq, status);
+		ret = IRQ_HANDLED;
+	}
+
+	scoped_guard(spinlock_irqsave, &pirq->lock) {
+		if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) {
+			pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
+			gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+		}
+	}
+
+	return ret;
+}
+
+static inline void panthor_irq_suspend(struct panthor_irq *pirq)
+{
+	scoped_guard(spinlock_irqsave, &pirq->lock) {
+		pirq->state = PANTHOR_IRQ_STATE_SUSPENDING;
+		gpu_write(pirq->iomem, INT_MASK, 0);
+	}
+	synchronize_irq(pirq->irq);
+	scoped_guard(spinlock_irqsave, &pirq->lock)
+		pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;
+}
+
+static inline void panthor_irq_resume(struct panthor_irq *pirq)
+{
+	guard(spinlock_irqsave)(&pirq->lock);
+	pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
+	gpu_write(pirq->iomem, INT_CLEAR, pirq->mask);
+	gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline void panthor_irq_enable_events(struct panthor_irq *pirq, u32 mask)
+{
+	guard(spinlock_irqsave)(&pirq->lock);
+	pirq->mask |= mask;
+
+	/* The only situation where we need to write the new mask is if the IRQ is active.
+	 * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
+	 * on the PROCESSING -> ACTIVE transition.
+	 * If the IRQ is suspended/suspending, the mask is restored at resume time.
+	 */
+	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
+		gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u32 mask)
+{
+	guard(spinlock_irqsave)(&pirq->lock);
+	pirq->mask &= ~mask;
+
+	/* The only situation where we need to write the new mask is if the IRQ is active.
+	 * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
+	 * on the PROCESSING -> ACTIVE transition.
+	 * If the IRQ is suspended/suspending, the mask is restored at resume time.
+	 */
+	if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
+		gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline int
+panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
+		    int irq, void __iomem *iomem, const char *name,
+		    irqreturn_t (*threaded_handler)(int, void *data))
+{
+	const char *full_name;
+
+	pirq->ptdev = ptdev;
+	pirq->irq = irq;
+	pirq->mask = 0;
+	pirq->iomem = iomem;
+	spin_lock_init(&pirq->lock);
+	pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;
+
+	full_name = devm_kasprintf(ptdev->base.dev, GFP_KERNEL, KBUILD_MODNAME "-%s", name);
+	if (!full_name)
+		return -ENOMEM;
+
+	gpu_write(pirq->iomem, INT_MASK, 0);
+	return devm_request_threaded_irq(ptdev->base.dev, irq,
+					 panthor_irq_default_raw_handler,
+					 threaded_handler,
+					 IRQF_SHARED, full_name, pirq);
 }
 
 extern struct workqueue_struct *panthor_cleanup_wq;
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index de8e6689a869..e358ca296eec 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1064,8 +1064,9 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
 			 msecs_to_jiffies(PING_INTERVAL_MS));
 }
 
-static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_job_irq_handler(struct panthor_irq *pirq, u32 status)
 {
+	struct panthor_device *ptdev = pirq->ptdev;
 	u32 duration;
 	u64 start = 0;
 
@@ -1091,7 +1092,11 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
 		trace_gpu_job_irq(ptdev->base.dev, status, duration);
 	}
 }
-PANTHOR_IRQ_HANDLER(job, panthor_job_irq_handler);
+
+static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data)
+{
+	return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler);
+}
 
 static int panthor_fw_start(struct panthor_device *ptdev)
 {
@@ -1099,8 +1104,8 @@ static int panthor_fw_start(struct panthor_device *ptdev)
 	bool timedout = false;
 
 	ptdev->fw->booted = false;
-	panthor_job_irq_enable_events(&ptdev->fw->irq, ~0);
-	panthor_job_irq_resume(&ptdev->fw->irq);
+	panthor_irq_enable_events(&ptdev->fw->irq, ~0);
+	panthor_irq_resume(&ptdev->fw->irq);
 	gpu_write(fw->iomem, MCU_CONTROL, MCU_CONTROL_AUTO);
 
 	if (!wait_event_timeout(ptdev->fw->req_waitqueue,
@@ -1210,7 +1215,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
 			ptdev->reset.fast = true;
 	}
 
-	panthor_job_irq_suspend(&ptdev->fw->irq);
+	panthor_irq_suspend(&ptdev->fw->irq);
 	panthor_fw_stop(ptdev);
 }
 
@@ -1279,7 +1284,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
 
 	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) {
 		/* Make sure the IRQ handler cannot be called after that point. */
-		panthor_job_irq_suspend(&ptdev->fw->irq);
+		panthor_irq_suspend(&ptdev->fw->irq);
 		panthor_fw_stop(ptdev);
 	}
 
@@ -1474,8 +1479,9 @@ int panthor_fw_init(struct panthor_device *ptdev)
 	if (irq <= 0)
 		return -ENODEV;
 
-	ret = panthor_request_job_irq(ptdev, &fw->irq, irq,
-				      ptdev->iomem + JOB_INT_BASE);
+	ret = panthor_irq_request(ptdev, &fw->irq, irq,
+				  ptdev->iomem + JOB_INT_BASE, "job",
+				  panthor_job_irq_threaded_handler);
 	if (ret) {
 		drm_err(&ptdev->base, "failed to request job irq");
 		return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index c013d6bf9a59..7f287242285a 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -86,8 +86,9 @@ static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
 	gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config);
 }
 
-static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
 {
+	struct panthor_device *ptdev = pirq->ptdev;
 	struct panthor_gpu *gpu = ptdev->gpu;
 
 	gpu_write(gpu->irq.iomem, INT_CLEAR, status);
@@ -116,7 +117,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
 	}
 	spin_unlock(&ptdev->gpu->reqs_lock);
 }
-PANTHOR_IRQ_HANDLER(gpu, panthor_gpu_irq_handler);
+
+static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
+{
+	return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler);
+}
 
 /**
  * panthor_gpu_unplug() - Called when the GPU is unplugged.
@@ -128,7 +133,7 @@ void panthor_gpu_unplug(struct panthor_device *ptdev)
 
 	/* Make sure the IRQ handler is not running after that point. */
 	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
-		panthor_gpu_irq_suspend(&ptdev->gpu->irq);
+		panthor_irq_suspend(&ptdev->gpu->irq);
 
 	/* Wake-up all waiters. */
 	spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags);
@@ -169,13 +174,14 @@ int panthor_gpu_init(struct panthor_device *ptdev)
 	if (irq < 0)
 		return irq;
 
-	ret = panthor_request_gpu_irq(ptdev, &ptdev->gpu->irq, irq,
-				      ptdev->iomem + GPU_INT_BASE);
+	ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
+				  ptdev->iomem + GPU_INT_BASE, "gpu",
+				  panthor_gpu_irq_threaded_handler);
 	if (ret)
 		return ret;
 
-	panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_INTERRUPTS_MASK);
-	panthor_gpu_irq_resume(&ptdev->gpu->irq);
+	panthor_irq_enable_events(&ptdev->gpu->irq, GPU_INTERRUPTS_MASK);
+	panthor_irq_resume(&ptdev->gpu->irq);
 	return 0;
 }
 
@@ -183,7 +189,7 @@ int panthor_gpu_power_changed_on(struct panthor_device *ptdev)
 {
 	guard(pm_runtime_active)(ptdev->base.dev);
 
-	panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+	panthor_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
 
 	return 0;
 }
@@ -192,7 +198,7 @@ void panthor_gpu_power_changed_off(struct panthor_device *ptdev)
 {
 	guard(pm_runtime_active)(ptdev->base.dev);
 
-	panthor_gpu_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+	panthor_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
 }
 
 /**
@@ -425,7 +431,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
 	else
 		panthor_hw_l2_power_off(ptdev);
 
-	panthor_gpu_irq_suspend(&ptdev->gpu->irq);
+	panthor_irq_suspend(&ptdev->gpu->irq);
 }
 
 /**
@@ -437,7 +443,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
  */
 void panthor_gpu_resume(struct panthor_device *ptdev)
 {
-	panthor_gpu_irq_resume(&ptdev->gpu->irq);
+	panthor_irq_resume(&ptdev->gpu->irq);
 	panthor_hw_l2_power_on(ptdev);
 }
 
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 1fef3c5c1b50..4023ffdf0996 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -598,17 +598,13 @@ static u32 panthor_mmu_as_fault_mask(struct panthor_device *ptdev, u32 as)
 	return BIT(as);
 }
 
-/* Forward declaration to call helpers within as_enable/disable */
-static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status);
-PANTHOR_IRQ_HANDLER(mmu, panthor_mmu_irq_handler);
-
 static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
 				 u64 transtab, u64 transcfg, u64 memattr)
 {
 	struct panthor_mmu *mmu = ptdev->mmu;
 
-	panthor_mmu_irq_enable_events(&ptdev->mmu->irq,
-				      panthor_mmu_as_fault_mask(ptdev, as_nr));
+	panthor_irq_enable_events(&ptdev->mmu->irq,
+				  panthor_mmu_as_fault_mask(ptdev, as_nr));
 
 	gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), transtab);
 	gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), memattr);
@@ -626,8 +622,8 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr,
 
 	lockdep_assert_held(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_disable_events(&ptdev->mmu->irq,
-				       panthor_mmu_as_fault_mask(ptdev, as_nr));
+	panthor_irq_disable_events(&ptdev->mmu->irq,
+				   panthor_mmu_as_fault_mask(ptdev, as_nr));
 
 	/* Flush+invalidate RW caches, invalidate RO ones. */
 	ret = panthor_gpu_flush_caches(ptdev, CACHE_CLEAN | CACHE_INV,
@@ -1857,8 +1853,9 @@ static void panthor_vm_unlock_region(struct panthor_vm *vm)
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 }
 
-static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_mmu_irq_handler(struct panthor_irq *pirq, u32 status)
 {
+	struct panthor_device *ptdev = pirq->ptdev;
 	struct panthor_mmu *mmu = ptdev->mmu;
 	bool has_unhandled_faults = false;
 
@@ -1921,6 +1918,11 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
 		panthor_sched_report_mmu_fault(ptdev);
 }
 
+static irqreturn_t panthor_mmu_irq_threaded_handler(int irq, void *data)
+{
+	return panthor_irq_default_threaded_handler(data, panthor_mmu_irq_handler);
+}
+
 /**
  * panthor_mmu_suspend() - Suspend the MMU logic
  * @ptdev: Device.
@@ -1945,7 +1947,7 @@ void panthor_mmu_suspend(struct panthor_device *ptdev)
 	}
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+	panthor_irq_suspend(&ptdev->mmu->irq);
 }
 
 /**
@@ -1964,7 +1966,7 @@ void panthor_mmu_resume(struct panthor_device *ptdev)
 	ptdev->mmu->as.faulty_mask = 0;
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_resume(&ptdev->mmu->irq);
+	panthor_irq_resume(&ptdev->mmu->irq);
 }
 
 /**
@@ -1981,7 +1983,7 @@ void panthor_mmu_pre_reset(struct panthor_device *ptdev)
 {
 	struct panthor_vm *vm;
 
-	panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+	panthor_irq_suspend(&ptdev->mmu->irq);
 
 	mutex_lock(&ptdev->mmu->vm.lock);
 	ptdev->mmu->vm.reset_in_progress = true;
@@ -2018,7 +2020,7 @@ void panthor_mmu_post_reset(struct panthor_device *ptdev)
 
 	mutex_unlock(&ptdev->mmu->as.slots_lock);
 
-	panthor_mmu_irq_resume(&ptdev->mmu->irq);
+	panthor_irq_resume(&ptdev->mmu->irq);
 
 	/* Restart the VM_BIND queues. */
 	mutex_lock(&ptdev->mmu->vm.lock);
@@ -3344,7 +3346,7 @@ panthor_mmu_reclaim_priv_bos(struct panthor_device *ptdev,
 void panthor_mmu_unplug(struct panthor_device *ptdev)
 {
 	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
-		panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+		panthor_irq_suspend(&ptdev->mmu->irq);
 
 	mutex_lock(&ptdev->mmu->as.slots_lock);
 	for (u32 i = 0; i < ARRAY_SIZE(ptdev->mmu->as.slots); i++) {
@@ -3405,8 +3407,9 @@ int panthor_mmu_init(struct panthor_device *ptdev)
 	if (irq <= 0)
 		return -ENODEV;
 
-	ret = panthor_request_mmu_irq(ptdev, &mmu->irq, irq,
-				      ptdev->iomem + MMU_INT_BASE);
+	ret = panthor_irq_request(ptdev, &mmu->irq, irq,
+				  ptdev->iomem + MMU_INT_BASE, "mmu",
+				  panthor_mmu_irq_threaded_handler);
 	if (ret)
 		return ret;
 
@@ -3427,8 +3430,8 @@ int panthor_mmu_init(struct panthor_device *ptdev)
 	if (ret)
 		return ret;
 
-	panthor_mmu_irq_enable_events(&mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
-	panthor_mmu_irq_resume(&mmu->irq);
+	panthor_irq_enable_events(&mmu->irq, panthor_mmu_fault_mask(ptdev, ~0));
+	panthor_irq_resume(&mmu->irq);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index f2c2c3000590..dd7b6ef8ea20 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -56,8 +56,9 @@ struct panthor_pwr {
 	wait_queue_head_t reqs_acked;
 };
 
-static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_pwr_irq_handler(struct panthor_irq *pirq, u32 status)
 {
+	struct panthor_device *ptdev = pirq->ptdev;
 	struct panthor_pwr *pwr = ptdev->pwr;
 
 	spin_lock(&ptdev->pwr->reqs_lock);
@@ -75,7 +76,11 @@ static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
 	}
 	spin_unlock(&ptdev->pwr->reqs_lock);
 }
-PANTHOR_IRQ_HANDLER(pwr, panthor_pwr_irq_handler);
+
+static irqreturn_t panthor_pwr_irq_threaded_handler(int irq, void *data)
+{
+	return panthor_irq_default_threaded_handler(data, panthor_pwr_irq_handler);
+}
 
 static void panthor_pwr_write_command(struct panthor_device *ptdev, u32 command, u64 args)
 {
@@ -454,7 +459,7 @@ void panthor_pwr_unplug(struct panthor_device *ptdev)
 
 	/* Make sure the IRQ handler is not running after that point. */
 	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
-		panthor_pwr_irq_suspend(&ptdev->pwr->irq);
+		panthor_irq_suspend(&ptdev->pwr->irq);
 
 	/* Wake-up all waiters. */
 	spin_lock_irqsave(&ptdev->pwr->reqs_lock, flags);
@@ -484,13 +489,14 @@ int panthor_pwr_init(struct panthor_device *ptdev)
 	if (irq < 0)
 		return irq;
 
-	err = panthor_request_pwr_irq(ptdev, &pwr->irq, irq,
-				      pwr->iomem + PWR_INT_BASE);
+	err = panthor_irq_request(ptdev, &pwr->irq, irq,
+				  pwr->iomem + PWR_INT_BASE, "pwr",
+				  panthor_pwr_irq_threaded_handler);
 	if (err)
 		return err;
 
-	panthor_pwr_irq_enable_events(&pwr->irq, PWR_INTERRUPTS_MASK);
-	panthor_pwr_irq_resume(&pwr->irq);
+	panthor_irq_enable_events(&pwr->irq, PWR_INTERRUPTS_MASK);
+	panthor_irq_resume(&pwr->irq);
 	return 0;
 }
 
@@ -566,7 +572,7 @@ void panthor_pwr_suspend(struct panthor_device *ptdev)
 	if (!ptdev->pwr)
 		return;
 
-	panthor_pwr_irq_suspend(&ptdev->pwr->irq);
+	panthor_irq_suspend(&ptdev->pwr->irq);
 }
 
 void panthor_pwr_resume(struct panthor_device *ptdev)
@@ -574,5 +580,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev)
 	if (!ptdev->pwr)
 		return;
 
-	panthor_pwr_irq_resume(&ptdev->pwr->irq);
+	panthor_irq_resume(&ptdev->pwr->irq);
 }

-- 
2.54.0



* [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked()
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (9 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 10/16] drm/panthor: Replace the panthor_irq macro machinery by inline helpers Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25 10:06   ` sashiko-bot
  2026-06-25  9:36 ` [PATCH v5 12/16] drm/panthor: Get rid of panthor_group::fatal_lock Boris Brezillon
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

By scheduling an immediate tick, we already force idleness re-evaluation,
which gives the scheduler the opportunity to evict idle groups
and schedule ones that have jobs pending, so setting
might_have_idle_groups here is redundant.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 237f6a75e624..a5dfb1beafff 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -1734,8 +1734,6 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
 
 	lockdep_assert_held(&sched->lock);
 
-	sched->might_have_idle_groups = true;
-
 	/* Schedule a tick so we can evict idle groups and schedule non-idle
 	 * ones. This will also update runtime PM and devfreq busy/idle states,
 	 * so the device can lower its frequency or get suspended.

-- 
2.54.0



* [PATCH v5 12/16] drm/panthor: Get rid of panthor_group::fatal_lock
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (10 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked() Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 13/16] drm/panthor: Protect events processing with a separate spinlock Boris Brezillon
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

This lock is never used, and we're about to make fatal_queues an
atomic to cope with concurrent updates.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index a5dfb1beafff..b3ee891d05aa 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -565,9 +565,6 @@ struct panthor_group {
 	/** @idle_queues: Bitmask reflecting the idle queues. */
 	u32 idle_queues;
 
-	/** @fatal_lock: Lock used to protect access to fatal fields. */
-	spinlock_t fatal_lock;
-
 	/** @fatal_queues: Bitmask reflecting the queues that hit a fatal exception. */
 	u32 fatal_queues;
 
@@ -3678,7 +3675,6 @@ int panthor_group_create(struct panthor_file *pfile,
 	if (!group)
 		return -ENOMEM;
 
-	spin_lock_init(&group->fatal_lock);
 	kref_init(&group->refcount);
 	group->state = PANTHOR_CS_GROUP_CREATED;
 	group->csg_id = -1;

-- 
2.54.0



* [PATCH v5 13/16] drm/panthor: Protect events processing with a separate spinlock
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (11 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 12/16] drm/panthor: Get rid of panthor_group::fatal_lock Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 14/16] drm/panthor: Don't defer job completion checks Boris Brezillon
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

Add a dedicated spinlock for event processing so we can selectively
move some event processing to the threaded IRQ handler. Processing an
event requires access to the group attached to the CSG slot, so
csg_slots[] updates must be protected by this lock too.

Note that fatal_queues/timedout are turned into atomics to avoid having
to take the events_lock every time those are checked or updated.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
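Note (not part of the commit): a minimal userspace sketch of the locking
rule described above, assuming pthread mutexes stand in for both
panthor_scheduler::lock and the new events spinlock, and C11 atomics stand
in for the kernel atomic_t helpers. All names here (sched_model,
group_model, slot_bind, report_fatal, ...) are hypothetical and only
illustrate the idea: writers of the slot table hold both locks, the event
path reads it with events_lock alone, and fatal_queues is updated without
any lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CSGS 4

struct group_model {
	int csg_id;               /* -1 when the group is not bound to a slot */
	unsigned int queue_count; /* assumed to be smaller than 32 */
	atomic_uint fatal_queues; /* bitmask, read and updated locklessly */
};

struct sched_model {
	pthread_mutex_t lock;        /* models panthor_scheduler::lock */
	pthread_mutex_t events_lock; /* models the events spinlock */
	struct group_model *slots[MAX_CSGS];
};

static void sched_model_init(struct sched_model *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_mutex_init(&s->events_lock, NULL);
	for (size_t i = 0; i < MAX_CSGS; i++)
		s->slots[i] = NULL;
}

/* Writer side: the caller already holds s->lock, events_lock is added here. */
static void slot_bind(struct sched_model *s, struct group_model *g,
		      unsigned int csg_id)
{
	assert(csg_id < MAX_CSGS);
	pthread_mutex_lock(&s->events_lock);
	s->slots[csg_id] = g;
	g->csg_id = (int)csg_id;
	pthread_mutex_unlock(&s->events_lock);
}

/* Writer side: same rule, the slot and csg_id change under both locks. */
static void slot_unbind(struct sched_model *s, struct group_model *g)
{
	assert(g->csg_id >= 0 && g->csg_id < MAX_CSGS);
	pthread_mutex_lock(&s->events_lock);
	s->slots[g->csg_id] = NULL;
	g->csg_id = -1;
	pthread_mutex_unlock(&s->events_lock);
}

/* Event side: events_lock alone is enough to read the slot table. */
static int report_fatal(struct sched_model *s, unsigned int csg_id,
			unsigned int cs_id)
{
	int handled = 0;

	assert(csg_id < MAX_CSGS);
	pthread_mutex_lock(&s->events_lock);
	struct group_model *g = s->slots[csg_id];
	if (g) {
		atomic_fetch_or(&g->fatal_queues, 1u << cs_id);
		handled = 1;
	}
	pthread_mutex_unlock(&s->events_lock);
	return handled;
}

/* Flag all queues as faulty, but only if no queue has been flagged yet. */
static void flag_all_if_none(struct group_model *g)
{
	unsigned int expected = 0;

	atomic_compare_exchange_strong(&g->fatal_queues, &expected,
				       (1u << g->queue_count) - 1);
}

/* Lockless state check, in the spirit of group_can_run(). */
static int group_model_can_run(struct group_model *g)
{
	return atomic_load(&g->fatal_queues) == 0;
}
```

The compare-and-swap in flag_all_if_none() mirrors the atomic_cmpxchg()
call in tick_ctx_init(): a fatal bit set concurrently by the event path is
never overwritten by the "flag everything" fallback.
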
 drivers/gpu/drm/panthor/panthor_sched.c | 123 ++++++++++++++++++++------------
 1 file changed, 78 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index b3ee891d05aa..2ef6c4f19388 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -254,8 +254,21 @@ struct panthor_scheduler {
 		struct list_head waiting;
 	} groups;
 
+	/**
+	 * @events_lock: Lock taken when processing events.
+	 *
+	 * This also needs to be taken when csg_slots are updated, to make sure
+	 * the event processing logic doesn't touch groups that have left the CSG
+	 * slot.
+	 */
+	spinlock_t events_lock;
+
 	/**
 	 * @csg_slots: FW command stream group slots.
+	 *
+	 * Updates to these slots must happen with both panthor_scheduler::lock and
+	 * panthor_scheduler::events_lock held. As a result, reads can happen with
+	 * either of these locks held.
 	 */
 	struct panthor_csg_slot csg_slots[MAX_CSGS];
 
@@ -565,8 +578,13 @@ struct panthor_group {
 	/** @idle_queues: Bitmask reflecting the idle queues. */
 	u32 idle_queues;
 
-	/** @fatal_queues: Bitmask reflecting the queues that hit a fatal exception. */
-	u32 fatal_queues;
+	/**
+	 * @fatal_queues: Bitmask reflecting the queues that hit a fatal exception.
+	 *
+	 * This is an atomic because we don't want to acquire the events_lock
+	 * every time we need to check the group state.
+	 */
+	atomic_t fatal_queues;
 
 	/** @tiler_oom: Mask of queues that have a tiler OOM event to process. */
 	atomic_t tiler_oom;
@@ -602,8 +620,14 @@ struct panthor_group {
 	 * any timeout situation is unrecoverable, and the group becomes useless. We
 	 * simply wait for all references to be dropped so we can release the group
 	 * object.
+	 *
+	 * This is an atomic because it can be set from both a scheduling context
+	 * (protected with panthor_scheduler::lock) and an event processing context
+	 * (protected with panthor_scheduler::events_lock). We could protect access
+	 * with the events_lock, but it is simpler to make it an atomic since the
+	 * only allowed transition is false -> true.
 	 */
-	bool timedout;
+	atomic_t timedout;
 
 	/**
 	 * @innocent: True when the group becomes unusable because the group suspension
@@ -996,7 +1020,6 @@ static int
 group_bind_locked(struct panthor_group *group, u32 csg_id)
 {
 	struct panthor_device *ptdev = group->ptdev;
-	struct panthor_csg_slot *csg_slot;
 	int ret;
 
 	lockdep_assert_held(&ptdev->scheduler->lock);
@@ -1009,9 +1032,7 @@ group_bind_locked(struct panthor_group *group, u32 csg_id)
 	if (ret)
 		return ret;
 
-	csg_slot = &ptdev->scheduler->csg_slots[csg_id];
 	group_get(group);
-	group->csg_id = csg_id;
 
 	/* Dummy doorbell allocation: doorbell is assigned to the group and
 	 * all queues use the same doorbell.
@@ -1023,7 +1044,10 @@ group_bind_locked(struct panthor_group *group, u32 csg_id)
 	for (u32 i = 0; i < group->queue_count; i++)
 		group->queues[i]->doorbell_id = csg_id + 1;
 
-	csg_slot->group = group;
+	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
+		ptdev->scheduler->csg_slots[csg_id].group = group;
+		group->csg_id = csg_id;
+	}
 
 	return 0;
 }
@@ -1038,7 +1062,6 @@ static int
 group_unbind_locked(struct panthor_group *group)
 {
 	struct panthor_device *ptdev = group->ptdev;
-	struct panthor_csg_slot *slot;
 
 	lockdep_assert_held(&ptdev->scheduler->lock);
 
@@ -1048,9 +1071,12 @@ group_unbind_locked(struct panthor_group *group)
 	if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
 		return -EINVAL;
 
-	slot = &ptdev->scheduler->csg_slots[group->csg_id];
+	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
+		ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
+		group->csg_id = -1;
+	}
+
 	panthor_vm_idle(group->vm);
-	group->csg_id = -1;
 
 	/* Tiler OOM events will be re-issued next time the group is scheduled. */
 	atomic_set(&group->tiler_oom, 0);
@@ -1060,8 +1086,6 @@ group_unbind_locked(struct panthor_group *group)
 	for (u32 i = 0; i < group->queue_count; i++)
 		group->queues[i]->doorbell_id = -1;
 
-	slot->group = NULL;
-
 	group_put(group);
 	return 0;
 }
@@ -1079,8 +1103,9 @@ group_can_run(struct panthor_group *group)
 {
 	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
 	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
-	       !group->destroyed && group->fatal_queues == 0 &&
-	       !group->timedout;
+	       !group->destroyed &&
+	       !atomic_read(&group->fatal_queues) &&
+	       !atomic_read(&group->timedout);
 }
 
 static bool
@@ -1482,7 +1507,7 @@ cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
 	u32 fatal;
 	u64 info;
 
-	lockdep_assert_held(&sched->lock);
+	lockdep_assert_held(&sched->events_lock);
 
 	cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
 	fatal = cs_iface->output->fatal;
@@ -1492,7 +1517,7 @@ cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
 		drm_warn(&ptdev->base, "CS_FATAL: pid=%d, comm=%s\n",
 			 group->task_info.pid, group->task_info.comm);
 
-		group->fatal_queues |= BIT(cs_id);
+		atomic_or(BIT(cs_id), &group->fatal_queues);
 	}
 
 	if (CS_EXCEPTION_TYPE(fatal) == DRM_PANTHOR_EXCEPTION_CS_UNRECOVERABLE) {
@@ -1530,7 +1555,7 @@ cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
 	u32 fault;
 	u64 info;
 
-	lockdep_assert_held(&sched->lock);
+	lockdep_assert_held(&sched->events_lock);
 
 	cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
 	fault = cs_iface->output->fault;
@@ -1620,7 +1645,7 @@ static int group_process_tiler_oom(struct panthor_group *group, u32 cs_id)
 	 */
 	if (ret && ret != -ENOMEM) {
 		drm_warn(&ptdev->base, "Failed to extend the tiler heap\n");
-		group->fatal_queues |= BIT(cs_id);
+		atomic_or(BIT(cs_id), &group->fatal_queues);
 		sched_queue_delayed_work(sched, tick, 0);
 		goto out_put_heap_pool;
 	}
@@ -1680,7 +1705,7 @@ cs_slot_process_tiler_oom_event_locked(struct panthor_device *ptdev,
 	struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
 	struct panthor_group *group = csg_slot->group;
 
-	lockdep_assert_held(&sched->lock);
+	lockdep_assert_held(&sched->events_lock);
 
 	if (drm_WARN_ON(&ptdev->base, !group))
 		return;
@@ -1701,7 +1726,7 @@ static bool cs_slot_process_irq_locked(struct panthor_device *ptdev,
 	struct panthor_fw_cs_iface *cs_iface;
 	u32 req, ack, events;
 
-	lockdep_assert_held(&ptdev->scheduler->lock);
+	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
 	req = cs_iface->input->req;
@@ -1729,7 +1754,7 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
 {
 	struct panthor_scheduler *sched = ptdev->scheduler;
 
-	lockdep_assert_held(&sched->lock);
+	lockdep_assert_held(&sched->events_lock);
 
 	/* Schedule a tick so we can evict idle groups and schedule non-idle
 	 * ones. This will also update runtime PM and devfreq busy/idle states,
@@ -1744,7 +1769,7 @@ static void csg_slot_sync_update_locked(struct panthor_device *ptdev,
 	struct panthor_csg_slot *csg_slot = &ptdev->scheduler->csg_slots[csg_id];
 	struct panthor_group *group = csg_slot->group;
 
-	lockdep_assert_held(&ptdev->scheduler->lock);
+	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	if (group)
 		group_queue_work(group, sync_upd);
@@ -1759,14 +1784,14 @@ csg_slot_process_progress_timer_event_locked(struct panthor_device *ptdev, u32 c
 	struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
 	struct panthor_group *group = csg_slot->group;
 
-	lockdep_assert_held(&sched->lock);
+	lockdep_assert_held(&sched->events_lock);
 
 	group = csg_slot->group;
 	if (!drm_WARN_ON(&ptdev->base, !group)) {
 		drm_warn(&ptdev->base, "CSG_PROGRESS_TIMER_EVENT: pid=%d, comm=%s\n",
 			 group->task_info.pid, group->task_info.comm);
 
-		group->timedout = true;
+		atomic_set(&group->timedout, true);
 	}
 
 	drm_warn(&ptdev->base, "CSG slot %d progress timeout\n", csg_id);
@@ -1780,7 +1805,7 @@ static void sched_process_csg_irq_locked(struct panthor_device *ptdev, u32 csg_i
 	struct panthor_fw_csg_iface *csg_iface;
 	u32 ring_cs_db_mask = 0;
 
-	lockdep_assert_held(&ptdev->scheduler->lock);
+	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	if (drm_WARN_ON(&ptdev->base, csg_id >= ptdev->scheduler->csg_slot_count))
 		return;
@@ -1838,7 +1863,7 @@ static void sched_process_idle_event_locked(struct panthor_device *ptdev)
 {
 	struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
 
-	lockdep_assert_held(&ptdev->scheduler->lock);
+	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	/* Acknowledge the idle event and schedule a tick. */
 	panthor_fw_update_reqs(glb_iface, req, glb_iface->output->ack, GLB_IDLE);
@@ -1854,7 +1879,7 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
 	struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
 	u32 req, ack, evts;
 
-	lockdep_assert_held(&ptdev->scheduler->lock);
+	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	req = READ_ONCE(glb_iface->input->req);
 	ack = READ_ONCE(glb_iface->output->ack);
@@ -1871,7 +1896,7 @@ static void process_fw_events_work(struct work_struct *work)
 	u32 events = atomic_xchg(&sched->fw_events, 0);
 	struct panthor_device *ptdev = sched->ptdev;
 
-	mutex_lock(&sched->lock);
+	guard(spinlock)(&sched->events_lock);
 
 	if (events & JOB_INT_GLOBAL_IF) {
 		sched_process_global_irq_locked(ptdev);
@@ -1884,8 +1909,6 @@ static void process_fw_events_work(struct work_struct *work)
 		sched_process_csg_irq_locked(ptdev, csg_id);
 		events &= ~BIT(csg_id);
 	}
-
-	mutex_unlock(&sched->lock);
 }
 
 /**
@@ -2132,11 +2155,12 @@ tick_ctx_init(struct panthor_scheduler *sched,
 		 * CSG IRQs, so we can flag the faulty queue.
 		 */
 		if (panthor_vm_has_unhandled_faults(group->vm)) {
-			sched_process_csg_irq_locked(ptdev, i);
+			scoped_guard(spinlock, &sched->events_lock)
+				sched_process_csg_irq_locked(ptdev, i);
 
 			/* No fatal fault reported, flag all queues as faulty. */
-			if (!group->fatal_queues)
-				group->fatal_queues |= GENMASK(group->queue_count - 1, 0);
+			atomic_cmpxchg(&group->fatal_queues, 0,
+				       GENMASK(group->queue_count - 1, 0));
 		}
 
 		tick_ctx_insert_old_group(sched, ctx, group);
@@ -2169,9 +2193,9 @@ group_term_post_processing(struct panthor_group *group)
 		struct panthor_syncobj_64b *syncobj;
 		int err;
 
-		if (group->fatal_queues & BIT(i))
+		if (atomic_read(&group->fatal_queues) & BIT(i))
 			err = -EINVAL;
-		else if (group->timedout)
+		else if (atomic_read(&group->timedout))
 			err = -ETIMEDOUT;
 		else
 			err = -ECANCELED;
@@ -2332,8 +2356,10 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 			 * any pending interrupts before we start the new
 			 * group.
 			 */
-			if (group->csg_id >= 0)
+			if (group->csg_id >= 0) {
+				guard(spinlock)(&sched->events_lock);
 				sched_process_csg_irq_locked(ptdev, group->csg_id);
+			}
 
 			group_unbind_locked(group);
 		}
@@ -2862,7 +2888,7 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 			/* We consider group suspension failures as fatal and flag the
 			 * group as unusable by setting timedout=true.
 			 */
-			csg_slot->group->timedout = true;
+			atomic_set(&csg_slot->group->timedout, true);
 
 			csgs_upd_ctx_queue_reqs(ptdev, &upd_ctx, csg_id,
 						CSG_STATE_TERMINATE,
@@ -2911,10 +2937,12 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 			u32 csg_id = ffs(slot_mask) - 1;
 			struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
 
-			if (flush_caches_failed)
+			if (flush_caches_failed) {
 				csg_slot->group->state = PANTHOR_CS_GROUP_TERMINATED;
-			else
+			} else {
+				guard(spinlock)(&sched->events_lock);
 				csg_slot_sync_update_locked(ptdev, csg_id);
+			}
 
 			slot_mask &= ~BIT(csg_id);
 		}
@@ -2929,8 +2957,10 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 
 		group_get(group);
 
-		if (group->csg_id >= 0)
+		if (group->csg_id >= 0) {
+			guard(spinlock)(&sched->events_lock);
 			sched_process_csg_irq_locked(ptdev, group->csg_id);
+		}
 
 		group_unbind_locked(group);
 
@@ -3423,7 +3453,7 @@ queue_timedout_job(struct drm_sched_job *sched_job)
 	queue_stop(queue, job);
 
 	mutex_lock(&sched->lock);
-	group->timedout = true;
+	atomic_set(&group->timedout, true);
 	if (group->csg_id >= 0) {
 		sched_queue_delayed_work(ptdev->scheduler, tick, 0);
 	} else {
@@ -3846,12 +3876,13 @@ int panthor_group_get_state(struct panthor_file *pfile,
 	memset(get_state, 0, sizeof(*get_state));
 
 	mutex_lock(&sched->lock);
-	if (group->timedout)
+	if (atomic_read(&group->timedout))
 		get_state->state |= DRM_PANTHOR_GROUP_STATE_TIMEDOUT;
-	if (group->fatal_queues) {
+
+	get_state->fatal_queues = atomic_read(&group->fatal_queues);
+	if (get_state->fatal_queues)
 		get_state->state |= DRM_PANTHOR_GROUP_STATE_FATAL_FAULT;
-		get_state->fatal_queues = group->fatal_queues;
-	}
+
 	if (group->innocent)
 		get_state->state |= DRM_PANTHOR_GROUP_STATE_INNOCENT;
 	mutex_unlock(&sched->lock);
@@ -4149,6 +4180,8 @@ int panthor_sched_init(struct panthor_device *ptdev)
 	INIT_WORK(&sched->sync_upd_work, sync_upd_work);
 	INIT_WORK(&sched->fw_events_work, process_fw_events_work);
 
+	spin_lock_init(&sched->events_lock);
+
 	ret = drmm_mutex_init(&ptdev->base, &sched->lock);
 	if (ret)
 		return ret;

-- 
2.54.0



* [PATCH v5 14/16] drm/panthor: Don't defer job completion checks
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (12 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 13/16] drm/panthor: Protect events processing with a separate spinlock Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 15/16] drm/panthor: Don't defer FW event processing Boris Brezillon
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

Call group_check_job_completion() directly from
csg_slot_sync_update_locked() instead of deferring it.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
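Note (not part of the commit): a minimal userspace sketch of the completion
check that is now called directly from the event path. It assumes the
in-flight jobs of a queue are kept in submission order and that a job is
done once the sync object's sequence number has reached the job's fence
sequence number. The types and names (job_model, check_job_completion,
made_progress) are hypothetical, and the fence_ctx lock, the timeout
handling and the actual dma_fence signalling are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct job_model {
	uint64_t seqno; /* sequence number the job's done fence waits for */
	bool signalled; /* stands in for dma_fence_signal_locked() */
};

/*
 * Walk the in-flight jobs in submission order and signal every job whose
 * seqno has been reached by the sync object. Stop at the first job that is
 * still running, since later jobs cannot have completed before it.
 *
 * *first_pending is the index of the oldest in-flight job; it is advanced
 * past the jobs that completed. Returns the number of newly signalled jobs.
 */
static size_t check_job_completion(struct job_model *jobs, size_t count,
				   size_t *first_pending,
				   uint64_t syncobj_seqno)
{
	size_t done = 0;

	while (*first_pending < count) {
		struct job_model *job = &jobs[*first_pending];

		if (syncobj_seqno < job->seqno)
			break;

		job->signalled = true;
		(*first_pending)++;
		done++;
	}

	return done;
}

/*
 * Progress rule used by the patch: an empty in-flight list counts as
 * progress (to avoid a spurious timeout), and so does any completed job.
 */
static bool made_progress(size_t done, size_t first_pending, size_t count)
{
	return first_pending == count || done > 0;
}
```

Because the walk stops at the first unfinished job, the cost of a check is
proportional to the number of jobs that completed since the previous one,
which matters now that it runs with events_lock held.
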
 drivers/gpu/drm/panthor/panthor_sched.c | 182 +++++++++++++++-----------------
 1 file changed, 87 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 2ef6c4f19388..eea80121212f 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -697,9 +697,6 @@ struct panthor_group {
 	 */
 	struct panthor_kernel_bo *protm_suspend_buf;
 
-	/** @sync_upd_work: Work used to check/signal job fences. */
-	struct work_struct sync_upd_work;
-
 	/** @tiler_oom_work: Work used to process tiler OOM events happening on this group. */
 	struct work_struct tiler_oom_work;
 
@@ -1763,6 +1760,92 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
 	sched_queue_delayed_work(sched, tick, 0);
 }
 
+static void update_fdinfo_stats(struct panthor_job *job)
+{
+	struct panthor_group *group = job->group;
+	struct panthor_queue *queue = group->queues[job->queue_idx];
+	struct panthor_gpu_usage *fdinfo = &group->fdinfo.data;
+	struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
+	struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
+
+	scoped_guard(spinlock_irqsave, &group->fdinfo.lock) {
+		if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
+			fdinfo->cycles += data->cycles.after - data->cycles.before;
+		if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
+			fdinfo->time += data->time.after - data->time.before;
+	}
+}
+
+static bool queue_check_job_completion(struct panthor_queue *queue)
+{
+	struct panthor_syncobj_64b *syncobj = NULL;
+	struct panthor_job *job, *job_tmp;
+	bool cookie, progress = false;
+	LIST_HEAD(done_jobs);
+
+	cookie = dma_fence_begin_signalling();
+	spin_lock(&queue->fence_ctx.lock);
+	list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
+		if (!syncobj) {
+			struct panthor_group *group = job->group;
+
+			syncobj = group->syncobjs->kmap +
+				  (job->queue_idx * sizeof(*syncobj));
+		}
+
+		if (syncobj->seqno < job->done_fence->seqno)
+			break;
+
+		list_move_tail(&job->node, &done_jobs);
+		dma_fence_signal_locked(job->done_fence);
+	}
+
+	if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
+		/* If we have no job left, we cancel the timer, and reset remaining
+		 * time to its default so it can be restarted next time
+		 * queue_resume_timeout() is called.
+		 */
+		queue_suspend_timeout_locked(queue);
+
+		/* If there's no job pending, we consider it progress to avoid a
+		 * spurious timeout if the timeout handler and the sync update
+		 * handler raced.
+		 */
+		progress = true;
+	} else if (!list_empty(&done_jobs)) {
+		queue_reset_timeout_locked(queue);
+		progress = true;
+	}
+	spin_unlock(&queue->fence_ctx.lock);
+	dma_fence_end_signalling(cookie);
+
+	list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
+		if (job->profiling.mask)
+			update_fdinfo_stats(job);
+		list_del_init(&job->node);
+		panthor_job_put(&job->base);
+	}
+
+	return progress;
+}
+
+static void group_check_job_completion(struct panthor_group *group)
+{
+	u32 queue_idx;
+	bool cookie;
+
+	cookie = dma_fence_begin_signalling();
+	for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
+		struct panthor_queue *queue = group->queues[queue_idx];
+
+		if (!queue)
+			continue;
+
+		queue_check_job_completion(queue);
+	}
+	dma_fence_end_signalling(cookie);
+}
+
 static void csg_slot_sync_update_locked(struct panthor_device *ptdev,
 					u32 csg_id)
 {
@@ -1772,7 +1855,7 @@ static void csg_slot_sync_update_locked(struct panthor_device *ptdev,
 	lockdep_assert_held(&ptdev->scheduler->events_lock);
 
 	if (group)
-		group_queue_work(group, sync_upd);
+		group_check_job_completion(group);
 
 	sched_queue_work(ptdev->scheduler, sync_upd);
 }
@@ -3044,22 +3127,6 @@ void panthor_sched_post_reset(struct panthor_device *ptdev, bool reset_failed)
 	}
 }
 
-static void update_fdinfo_stats(struct panthor_job *job)
-{
-	struct panthor_group *group = job->group;
-	struct panthor_queue *queue = group->queues[job->queue_idx];
-	struct panthor_gpu_usage *fdinfo = &group->fdinfo.data;
-	struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
-	struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
-
-	scoped_guard(spinlock, &group->fdinfo.lock) {
-		if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
-			fdinfo->cycles += data->cycles.after - data->cycles.before;
-		if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
-			fdinfo->time += data->time.after - data->time.before;
-	}
-}
-
 void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
 {
 	struct panthor_group_pool *gpool = pfile->groups;
@@ -3080,80 +3147,6 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
 	xa_unlock(&gpool->xa);
 }
 
-static bool queue_check_job_completion(struct panthor_queue *queue)
-{
-	struct panthor_syncobj_64b *syncobj = NULL;
-	struct panthor_job *job, *job_tmp;
-	bool cookie, progress = false;
-	LIST_HEAD(done_jobs);
-
-	cookie = dma_fence_begin_signalling();
-	spin_lock(&queue->fence_ctx.lock);
-	list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
-		if (!syncobj) {
-			struct panthor_group *group = job->group;
-
-			syncobj = group->syncobjs->kmap +
-				  (job->queue_idx * sizeof(*syncobj));
-		}
-
-		if (syncobj->seqno < job->done_fence->seqno)
-			break;
-
-		list_move_tail(&job->node, &done_jobs);
-		dma_fence_signal_locked(job->done_fence);
-	}
-
-	if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
-		/* If we have no job left, we cancel the timer, and reset remaining
-		 * time to its default so it can be restarted next time
-		 * queue_resume_timeout() is called.
-		 */
-		queue_suspend_timeout_locked(queue);
-
-		/* If there's no job pending, we consider it progress to avoid a
-		 * spurious timeout if the timeout handler and the sync update
-		 * handler raced.
-		 */
-		progress = true;
-	} else if (!list_empty(&done_jobs)) {
-		queue_reset_timeout_locked(queue);
-		progress = true;
-	}
-	spin_unlock(&queue->fence_ctx.lock);
-	dma_fence_end_signalling(cookie);
-
-	list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
-		if (job->profiling.mask)
-			update_fdinfo_stats(job);
-		list_del_init(&job->node);
-		panthor_job_put(&job->base);
-	}
-
-	return progress;
-}
-
-static void group_sync_upd_work(struct work_struct *work)
-{
-	struct panthor_group *group =
-		container_of(work, struct panthor_group, sync_upd_work);
-	u32 queue_idx;
-	bool cookie;
-
-	cookie = dma_fence_begin_signalling();
-	for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
-		struct panthor_queue *queue = group->queues[queue_idx];
-
-		if (!queue)
-			continue;
-
-		queue_check_job_completion(queue);
-	}
-	dma_fence_end_signalling(cookie);
-
-	group_put(group);
-}
-
 struct panthor_job_ringbuf_instrs {
 	u64 buffer[MAX_INSTRS_PER_JOB];
 	u32 count;
@@ -3721,7 +3714,6 @@ int panthor_group_create(struct panthor_file *pfile,
 	INIT_LIST_HEAD(&group->wait_node);
 	INIT_LIST_HEAD(&group->run_node);
 	INIT_WORK(&group->term_work, group_term_work);
-	INIT_WORK(&group->sync_upd_work, group_sync_upd_work);
 	INIT_WORK(&group->tiler_oom_work, group_tiler_oom_work);
 	INIT_WORK(&group->release_work, group_release_work);
 

-- 
2.54.0



* [PATCH v5 15/16] drm/panthor: Don't defer FW event processing
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (13 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 14/16] drm/panthor: Don't defer job completion checks Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25  9:36 ` [PATCH v5 16/16] drm/panthor: Automate CSG IRQ processing at group unbind time Boris Brezillon
  2026-06-25 12:43 ` [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

Avoid a workqueue round trip and process FW events directly in
panthor_sched_report_fw_events().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
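Note (not part of the commit): a minimal userspace sketch of the direct
dispatch loop, assuming the event word carries one bit for the global
interface plus one bit per CSG slot. GLOBAL_IF_BIT and the event_log
structure are hypothetical stand-ins chosen for the example, and the
events_lock guard taken by the real function is omitted.

```c
#include <stdint.h>
#include <strings.h> /* ffs() */

/* Hypothetical bit used for the global interface in this sketch. */
#define GLOBAL_IF_BIT (1u << 31)

struct event_log {
	unsigned int global_hits; /* times the global handler ran */
	uint32_t csg_seen;        /* bitmask of CSG slots that were processed */
};

/*
 * Process the reported events immediately instead of accumulating them in
 * an atomic bitmask and queuing a work item to handle them later.
 */
static void report_fw_events(struct event_log *log, uint32_t events)
{
	if (events & GLOBAL_IF_BIT) {
		log->global_hits++;
		events &= ~GLOBAL_IF_BIT;
	}

	/* Remaining bits each name a CSG slot; handle the lowest one first. */
	while (events) {
		uint32_t csg_id = (uint32_t)ffs((int)events) - 1;

		log->csg_seen |= 1u << csg_id;
		events &= ~(1u << csg_id);
	}
}
```

With the deferred version, events reported between two runs of the work
item were merged into one bitmask; in this direct form every call handles
exactly the events it was given.
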
 drivers/gpu/drm/panthor/panthor_sched.c | 48 +++++++--------------------------
 1 file changed, 9 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index eea80121212f..2a7cd88d012a 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -177,23 +177,6 @@ struct panthor_scheduler {
 	 */
 	struct work_struct sync_upd_work;
 
-	/**
-	 * @fw_events_work: Work used to process FW events outside the interrupt path.
-	 *
-	 * Even if the interrupt is threaded, we need any event processing
-	 * that require taking the panthor_scheduler::lock to be processed
-	 * outside the interrupt path so we don't block the tick logic when
-	 * it calls panthor_fw_{csg,wait}_wait_acks(). Since most of the
-	 * event processing requires taking this lock, we just delegate all
-	 * FW event processing to the scheduler workqueue.
-	 */
-	struct work_struct fw_events_work;
-
-	/**
-	 * @fw_events: Bitmask encoding pending FW events.
-	 */
-	atomic_t fw_events;
-
 	/**
 	 * @resched_target: When the next tick should occur.
 	 *
@@ -1972,14 +1955,17 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
 		sched_process_idle_event_locked(ptdev);
 }
 
-static void process_fw_events_work(struct work_struct *work)
+/**
+ * panthor_sched_report_fw_events() - Report FW events to the scheduler.
+ * @ptdev: Device.
+ * @events: Bitmask of pending FW events to report.
+ */
+void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
 {
-	struct panthor_scheduler *sched = container_of(work, struct panthor_scheduler,
-						      fw_events_work);
-	u32 events = atomic_xchg(&sched->fw_events, 0);
-	struct panthor_device *ptdev = sched->ptdev;
+	if (!ptdev->scheduler)
+		return;
 
-	guard(spinlock)(&sched->events_lock);
+	guard(spinlock)(&ptdev->scheduler->events_lock);
 
 	if (events & JOB_INT_GLOBAL_IF) {
 		sched_process_global_irq_locked(ptdev);
@@ -1994,20 +1980,6 @@ static void process_fw_events_work(struct work_struct *work)
 	}
 }
 
-/**
- * panthor_sched_report_fw_events() - Report FW events to the scheduler.
- * @ptdev: Device.
- * @events: Bitmask of pending FW events to report.
- */
-void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
-{
-	if (!ptdev->scheduler)
-		return;
-
-	atomic_or(events, &ptdev->scheduler->fw_events);
-	sched_queue_work(ptdev->scheduler, fw_events);
-}
-
 static const char *fence_get_driver_name(struct dma_fence *fence)
 {
 	return "panthor";
@@ -4085,7 +4057,6 @@ void panthor_sched_unplug(struct panthor_device *ptdev)
 	struct panthor_scheduler *sched = ptdev->scheduler;
 
 	disable_delayed_work_sync(&sched->tick_work);
-	disable_work_sync(&sched->fw_events_work);
 	disable_work_sync(&sched->sync_upd_work);
 
 	mutex_lock(&sched->lock);
@@ -4170,7 +4141,6 @@ int panthor_sched_init(struct panthor_device *ptdev)
 	sched->tick_period = msecs_to_jiffies(10);
 	INIT_DELAYED_WORK(&sched->tick_work, tick_work);
 	INIT_WORK(&sched->sync_upd_work, sync_upd_work);
-	INIT_WORK(&sched->fw_events_work, process_fw_events_work);
 
 	spin_lock_init(&sched->events_lock);
 

-- 
2.54.0



* [PATCH v5 16/16] drm/panthor: Automate CSG IRQ processing at group unbind time
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (14 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 15/16] drm/panthor: Don't defer FW event processing Boris Brezillon
@ 2026-06-25  9:36 ` Boris Brezillon
  2026-06-25 12:43 ` [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25  9:36 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, Boris Brezillon

Make the sched_process_csg_irq_locked() call part of
group_unbind_locked() so we don't have to manually call it in
tick_ctx_apply()/panthor_sched_suspend().

This implies moving group_[un]bind_locked() around to avoid a
forward declaration.

Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 182 +++++++++++++++-----------------
 1 file changed, 84 insertions(+), 98 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 2a7cd88d012a..7f587e110ebd 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -989,87 +989,6 @@ group_get(struct panthor_group *group)
 	return group;
 }
 
-/**
- * group_bind_locked() - Bind a group to a group slot
- * @group: Group.
- * @csg_id: Slot.
- *
- * Return: 0 on success, a negative error code otherwise.
- */
-static int
-group_bind_locked(struct panthor_group *group, u32 csg_id)
-{
-	struct panthor_device *ptdev = group->ptdev;
-	int ret;
-
-	lockdep_assert_held(&ptdev->scheduler->lock);
-
-	if (drm_WARN_ON(&ptdev->base, group->csg_id != -1 || csg_id >= MAX_CSGS ||
-			ptdev->scheduler->csg_slots[csg_id].group))
-		return -EINVAL;
-
-	ret = panthor_vm_active(group->vm);
-	if (ret)
-		return ret;
-
-	group_get(group);
-
-	/* Dummy doorbell allocation: doorbell is assigned to the group and
-	 * all queues use the same doorbell.
-	 *
-	 * TODO: Implement LRU-based doorbell assignment, so the most often
-	 * updated queues get their own doorbell, thus avoiding useless checks
-	 * on queues belonging to the same group that are rarely updated.
-	 */
-	for (u32 i = 0; i < group->queue_count; i++)
-		group->queues[i]->doorbell_id = csg_id + 1;
-
-	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
-		ptdev->scheduler->csg_slots[csg_id].group = group;
-		group->csg_id = csg_id;
-	}
-
-	return 0;
-}
-
-/**
- * group_unbind_locked() - Unbind a group from a slot.
- * @group: Group to unbind.
- *
- * Return: 0 on success, a negative error code otherwise.
- */
-static int
-group_unbind_locked(struct panthor_group *group)
-{
-	struct panthor_device *ptdev = group->ptdev;
-
-	lockdep_assert_held(&ptdev->scheduler->lock);
-
-	if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >= MAX_CSGS))
-		return -EINVAL;
-
-	if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
-		return -EINVAL;
-
-	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
-		ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
-		group->csg_id = -1;
-	}
-
-	panthor_vm_idle(group->vm);
-
-	/* Tiler OOM events will be re-issued next time the group is scheduled. */
-	atomic_set(&group->tiler_oom, 0);
-	if (cancel_work(&group->tiler_oom_work))
-		group_put(group);
-
-	for (u32 i = 0; i < group->queue_count; i++)
-		group->queues[i]->doorbell_id = -1;
-
-	group_put(group);
-	return 0;
-}
-
 static bool
 group_is_idle(struct panthor_group *group)
 {
@@ -1980,6 +1899,89 @@ void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
 	}
 }
 
+/**
+ * group_bind_locked() - Bind a group to a group slot
+ * @group: Group.
+ * @csg_id: Slot.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+static int
+group_bind_locked(struct panthor_group *group, u32 csg_id)
+{
+	struct panthor_device *ptdev = group->ptdev;
+	int ret;
+
+	lockdep_assert_held(&ptdev->scheduler->lock);
+
+	if (drm_WARN_ON(&ptdev->base, group->csg_id != -1 || csg_id >= MAX_CSGS ||
+			ptdev->scheduler->csg_slots[csg_id].group))
+		return -EINVAL;
+
+	ret = panthor_vm_active(group->vm);
+	if (ret)
+		return ret;
+
+	group_get(group);
+
+	/* Dummy doorbell allocation: doorbell is assigned to the group and
+	 * all queues use the same doorbell.
+	 *
+	 * TODO: Implement LRU-based doorbell assignment, so the most often
+	 * updated queues get their own doorbell, thus avoiding useless checks
+	 * on queues belonging to the same group that are rarely updated.
+	 */
+	for (u32 i = 0; i < group->queue_count; i++)
+		group->queues[i]->doorbell_id = csg_id + 1;
+
+	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
+		ptdev->scheduler->csg_slots[csg_id].group = group;
+		group->csg_id = csg_id;
+	}
+
+	return 0;
+}
+
+/**
+ * group_unbind_locked() - Unbind a group from a slot.
+ * @group: Group to unbind.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+static int
+group_unbind_locked(struct panthor_group *group)
+{
+	struct panthor_device *ptdev = group->ptdev;
+
+	lockdep_assert_held(&ptdev->scheduler->lock);
+
+	if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >= MAX_CSGS))
+		return -EINVAL;
+
+	if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
+		return -EINVAL;
+
+	scoped_guard(spinlock, &ptdev->scheduler->events_lock) {
+		/* Process all pending IRQs before returning the slot. */
+		sched_process_csg_irq_locked(ptdev, group->csg_id);
+		ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
+		group->csg_id = -1;
+	}
+
+	panthor_vm_idle(group->vm);
+
+	/* Tiler OOM events will be re-issued next time the group is scheduled. */
+	atomic_set(&group->tiler_oom, 0);
+	if (cancel_work(&group->tiler_oom_work))
+		group_put(group);
+
+	for (u32 i = 0; i < group->queue_count; i++)
+		group->queues[i]->doorbell_id = -1;
+
+	group_put(group);
+	return 0;
+}
+
 static const char *fence_get_driver_name(struct dma_fence *fence)
 {
 	return "panthor";
@@ -2406,18 +2408,8 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 
 	/* Unbind evicted groups. */
 	for (prio = PANTHOR_CSG_PRIORITY_COUNT - 1; prio >= 0; prio--) {
-		list_for_each_entry(group, &ctx->old_groups[prio], run_node) {
-			/* This group is gone. Process interrupts to clear
-			 * any pending interrupts before we start the new
-			 * group.
-			 */
-			if (group->csg_id >= 0) {
-				guard(spinlock)(&sched->events_lock);
-				sched_process_csg_irq_locked(ptdev, group->csg_id);
-			}
-
+		list_for_each_entry(group, &ctx->old_groups[prio], run_node)
 			group_unbind_locked(group);
-		}
 	}
 
 	for (i = 0; i < sched->csg_slot_count; i++) {
@@ -3011,12 +3003,6 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 			continue;
 
 		group_get(group);
-
-		if (group->csg_id >= 0) {
-			guard(spinlock)(&sched->events_lock);
-			sched_process_csg_irq_locked(ptdev, group->csg_id);
-		}
-
 		group_unbind_locked(group);
 
 		drm_WARN_ON(&group->ptdev->base, !list_empty(&group->run_node));

-- 
2.54.0
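
The idea behind this patch can be sketched in a small userspace model: if the unbind helper itself drains pending events before releasing the slot, callers cannot forget to do it. This is only an illustration. `toy_slot`, `toy_process_irqs()` and `toy_unbind()` are hypothetical names, not panthor code, and the locking done by the real helper is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CSG slot: not the panthor data structure. */
struct toy_slot {
	bool bound;
	unsigned int pending_irqs;
	unsigned int processed_irqs;
};

/* Drain every event still pending on the slot. */
static void toy_process_irqs(struct toy_slot *slot)
{
	slot->processed_irqs += slot->pending_irqs;
	slot->pending_irqs = 0;
}

/*
 * Unbind helper that processes pending events itself before releasing
 * the slot, so no caller has to remember to do it first.
 */
static int toy_unbind(struct toy_slot *slot)
{
	if (!slot->bound)
		return -EINVAL;

	toy_process_irqs(slot);
	slot->bound = false;
	return 0;
}
```

With this shape, a caller that only does `toy_unbind(&slot)` still ends up with no pending events left on the slot.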


* Re: [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug()
  2026-06-25  9:36 ` [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug() Boris Brezillon
@ 2026-06-25 10:00   ` sashiko-bot
  0 siblings, 0 replies; 21+ messages in thread
From: sashiko-bot @ 2026-06-25 10:00 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] The reset work `panthor_device_reset_work` can execute before device initialization completes, leading to a NULL pointer dereference.
--

commit b7f6d51a0860da42553b0a2ac207a7c13c51cd47
Author: Boris Brezillon <boris.brezillon@collabora.com>

drm/panthor: Drop a needless check in panthor_fw_unplug()

panthor_fw_unplug() is only called if we at least managed to initialize
the IRQ, so it's safe to drop the "is IRQ initialized" check.

> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 986151681b246..4fbddb9e18c8b 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1279,9 +1279,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
>  
>  	if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) {
>  		/* Make sure the IRQ handler cannot be called after that point. */
> -		if (ptdev->fw->irq.irq)
> -			panthor_job_irq_suspend(&ptdev->fw->irq);
> -
> +		panthor_job_irq_suspend(&ptdev->fw->irq);
>  		panthor_fw_stop(ptdev);
>  	}
>  

[Severity: High]
This is a pre-existing issue, but can the reset work panthor_device_reset_work
execute before device initialization completes?

During device initialization in panthor_device_init(), panthor_fw_init() is
called, which invokes panthor_vm_active(). This in turn calls
panthor_mmu_as_enable() and as_send_cmd_and_wait().

If the MMU AS_ACTIVE bit gets stuck and times out,
panthor_device_schedule_reset() is called. Because the device PM state has
already been transitioned to ACTIVE earlier via pm_runtime_resume_and_get(),
the reset work is queued and executes asynchronously.

The reset worker unconditionally calls panthor_sched_pre_reset(), which
immediately dereferences ptdev->scheduler:

drivers/gpu/drm/panthor/panthor_sched.c:panthor_sched_pre_reset() {
        struct panthor_scheduler *sched = ptdev->scheduler;
        u32 i;

        mutex_lock(&sched->reset.lock);
        ...
}

Since ptdev->scheduler is only allocated later in the probe sequence by
panthor_sched_init(), could this result in a NULL pointer dereference if the
reset happens during the probe window?
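
One possible shape for a guard is sketched below as a userspace model. It mirrors the `if (!ptdev->scheduler) return;` check that panthor_sched_report_fw_events() already had. `toy_dev`, `toy_sched` and `toy_sched_pre_reset()` are hypothetical names, and whether a NULL check alone is sufficient in the driver (as opposed to not queuing the reset at all during probe) is not something this sketch settles.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: not the panthor structures. */
struct toy_sched {
	bool reset_in_progress;
};

struct toy_dev {
	struct toy_sched *scheduler;	/* NULL until the scheduler is initialized */
};

/*
 * Pre-reset hook that tolerates being called before the scheduler
 * exists. Returns true if the scheduler was actually prepared.
 */
static bool toy_sched_pre_reset(struct toy_dev *dev)
{
	struct toy_sched *sched = dev->scheduler;

	if (!sched)
		return false;

	sched->reset_in_progress = true;
	return true;
}
```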

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v5-0-8836a74e0ef9@collabora.com?part=4

* Re: [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()
  2026-06-25  9:36 ` [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick() Boris Brezillon
@ 2026-06-25 10:04   ` sashiko-bot
  0 siblings, 0 replies; 21+ messages in thread
From: sashiko-bot @ 2026-06-25 10:04 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] Dereference of ERR_PTR in panthor_heap_pool_put() during tiler OOM handling.
- [High] Asynchronous cleanup of scheduling groups executes without holding a DRM device reference, leading to a Use-After-Free of panthor_device.
- [High] Missing bounds check on synchronization object offset allows out-of-bounds read.
--

commit eccb99f3f3029758d8c3f80d0f349c280e91b3c2
Author: Boris Brezillon <boris.brezillon@collabora.com>

drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()

We schedule immediate ticks when we need to process events on CSGs,
but those immediate ticks don't change the resched_target because we
want the other groups to stay scheduled for the remainder of the GPU
timeslot they were given. Make sure these immediate ticks don't get
overruled by a sched_queue_delayed_work() that would delay the tick
execution.

Fixes: 99820b4b7e50 ("drm/panthor: Make sure we resume the tick when new jobs are submitted")
Reported-by: sashiko-bot@kernel.org
Closes: https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@collabora.com?part=9

> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 5b34032deff81..1913bc8a62977 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -2668,7 +2668,14 @@ static void sched_resume_tick(struct panthor_device *ptdev)
>  	else
>  		delay_jiffies = 0;
>  
> -	sched_queue_delayed_work(sched, tick, delay_jiffies);
> +	/* We schedule immediate ticks when we need to process events on CSGs,
> +	 * but those don't change the resched_target because we want the other
> +	 * groups to stay scheduled for the remaining of the GPU timeslot they
> +	 * were given. Make sure those immediate ticks don't get overruled by
> +	 * a sched_queue_delayed_work() that would delay the tick execution.
> +	 */
> +	if (!delayed_work_pending(&sched->tick_work))
> +		sched_queue_delayed_work(sched, tick, delay_jiffies);
>  }

[Severity: High]
This isn't a bug introduced by this patch, but we noticed a potential
out-of-bounds read nearby. In panthor_queue_get_syncwait_obj(), the
return value of panthor_gem_sync() is ignored:

drivers/gpu/drm/panthor/panthor_sched.c:panthor_queue_get_syncwait_obj() {
    ...
	panthor_gem_sync(&bo->base,
			 DRM_PANTHOR_BO_SYNC_CPU_CACHE_FLUSH_AND_INVALIDATE,
			 queue->syncwait.offset,
			 queue->syncwait.sync64 ?
			 sizeof(struct panthor_syncobj_64b) :
			 sizeof(struct panthor_syncobj_32b));

	return queue->syncwait.kmap + queue->syncwait.offset;
}

If panthor_gem_sync() detects the offset is out of bounds and returns
-EINVAL, the function ignores the error and still returns the
pointer, which points outside the buffer boundary.

Could this allow panthor_queue_eval_syncwait() to dereference an
out-of-bounds pointer when it reads syncobj->sync64.seqno?
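
The pattern in question can be sketched in userspace as follows: validate the range first, and only hand out a pointer when the check passes. `mapped_buf`, `buf_check_range()` and `buf_get_obj()` are hypothetical names used for illustration. The real panthor_gem_sync() has other responsibilities (cache maintenance) that are not modelled here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel-mapped buffer object. */
struct mapped_buf {
	uint8_t *kmap;
	size_t size;
};

/* Return 0 if [offset, offset + len) fits in the buffer, -EINVAL otherwise. */
static int buf_check_range(const struct mapped_buf *buf, size_t offset, size_t len)
{
	/* Written this way to avoid overflowing offset + len. */
	if (offset > buf->size || len > buf->size - offset)
		return -EINVAL;

	return 0;
}

/* Only return a pointer into the mapping when the range check passed. */
static void *buf_get_obj(const struct mapped_buf *buf, size_t offset, size_t len)
{
	if (buf_check_range(buf, offset, len))
		return NULL;

	return buf->kmap + offset;
}
```

A caller then has to handle a NULL return instead of reading through a pointer that may sit past the end of the mapping.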

[Severity: High]
This is a pre-existing issue, but is it safe for group_release() to queue
asynchronous cleanup work without pinning the DRM device?

drivers/gpu/drm/panthor/panthor_sched.c:group_release() {
	...
	queue_work(panthor_cleanup_wq, &group->release_work);
}

When the DRM device file is closed and panthor_postclose() executes group
destruction, it doesn't flush the panthor_cleanup_wq workqueue. If the
device is hot-unplugged, the DRM core might free the panthor_device.

When group_release_work() eventually executes later, does this lead to a
use-after-free when it accesses the freed panthor_device during cleanup
(e.g., in panthor_kernel_bo_destroy())?

[Severity: High]
This is a pre-existing issue, but we noticed a potential pointer dereference
issue in the tiler OOM handling path. In group_process_tiler_oom():

drivers/gpu/drm/panthor/panthor_sched.c:group_process_tiler_oom() {
    ...
	heaps = panthor_vm_get_heap_pool(group->vm, false);
    ...
	if (IS_ERR(heaps) || frag_end > vt_end || vt_end >= vt_start) {
		ret = -EINVAL;
    ...
out_put_heap_pool:
	panthor_heap_pool_put(heaps);
	return ret;
}

If heaps is an ERR_PTR (like ERR_PTR(-ENOENT)), the code correctly sets
ret = -EINVAL but later jumps to out_put_heap_pool and calls
panthor_heap_pool_put().

Since panthor_heap_pool_put() only checks if (pool) and doesn't check
IS_ERR(pool), wouldn't this try to dereference an invalid error pointer
in kref_put()?
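
A minimal userspace sketch of the concern, assuming simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/IS_ERR_OR_NULL() helpers: a put helper that only tests for NULL would treat an error pointer as a valid object, whereas one guarded by IS_ERR_OR_NULL() skips it. `struct pool` and `pool_put_checked()` are hypothetical and do not reflect the panthor heap pool implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins modelled on the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical refcounted object. */
struct pool {
	int refcount;
};

/*
 * Drop a reference only when the pointer is a real object.
 * Returns true if a reference was dropped.
 */
static bool pool_put_checked(struct pool *pool)
{
	if (IS_ERR_OR_NULL(pool))
		return false;

	pool->refcount--;
	return true;
}
```

An alternative with the same effect is to keep the put helper as is and make the error path skip the put label when the lookup failed.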

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v5-0-8836a74e0ef9@collabora.com?part=2

* Re: [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked()
  2026-06-25  9:36 ` [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked() Boris Brezillon
@ 2026-06-25 10:06   ` sashiko-bot
  0 siblings, 0 replies; 21+ messages in thread
From: sashiko-bot @ 2026-06-25 10:06 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] The panthor scheduler acquires `queue->fence_ctx.lock` using `spin_lock()` instead of `spin_lock_irqsave()`, violating the `dma_fence` locking rules and risking a deadlock against hardware interrupts.
--

commit 3ecce8ed069220d684864af5f10bd10ad0ae008b
Author: Boris Brezillon <boris.brezillon@collabora.com>

drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked()

By scheduling an immediate tick, we already force idleness re-evaluation,
which gives the scheduler the opportunity to evict idle groups
and schedule onces that have jobs pending.

> and schedule onces that have jobs pending.

[Severity: Low]
This isn't a bug, but there appears to be a typo (onces instead of ones)
in the commit message.

> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 237f6a75e624c..a5dfb1beafff8 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -1734,8 +1734,6 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
>  
>  	lockdep_assert_held(&sched->lock);
>  
> -	sched->might_have_idle_groups = true;
> -

[Severity: High]
This is a pre-existing issue, but looking at panthor_sched.c,
queue->fence_ctx.lock is acquired using plain spin_lock() in multiple
functions:

queue_suspend_timeout()
queue_resume_timeout()
cs_slot_process_fault_event_locked()
queue_check_job_completion()
group_term_post_processing()
queue_run_job()

For example, in queue_check_job_completion():

    spin_lock(&queue->fence_ctx.lock);
    ...

This lock is used as the dma_fence lock for the job's done_fence. Since
dma_fence objects can be shared with other subsystems (like KMS display
drivers), they may be accessed from hardirq context (such as a VBLANK
interrupt calling dma_fence_is_signaled() or dma_fence_add_callback(),
which attempt to acquire the fence lock).

If a hardware interrupt occurs on the same CPU while the panthor driver
holds this lock with interrupts enabled, could the interrupt handler spin
forever attempting to acquire the lock, causing a deadlock?

Should these acquisitions be updated to use spin_lock_irqsave() to comply
with dma_fence locking rules?
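
The difference between the two locking variants can be illustrated with a toy single-CPU model. Nothing below is kernel API: `toy_cpu` and the `toy_*` helpers are hypothetical and only track two booleans, whether local interrupts are enabled and whether the lock is held.

```c
#include <stdbool.h>

/* Toy single-CPU model; hypothetical, for illustration only. */
struct toy_cpu {
	bool irqs_enabled;	/* can a hardirq preempt the current code? */
	bool lock_held;		/* is the fence lock currently held? */
};

/* Models spin_lock(): takes the lock, leaves the IRQ state untouched. */
static void toy_spin_lock(struct toy_cpu *cpu)
{
	cpu->lock_held = true;
}

static void toy_spin_unlock(struct toy_cpu *cpu)
{
	cpu->lock_held = false;
}

/* Models spin_lock_irqsave(): save the IRQ state, mask IRQs, take the lock. */
static bool toy_spin_lock_irqsave(struct toy_cpu *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): release the lock, restore the IRQ state. */
static void toy_spin_unlock_irqrestore(struct toy_cpu *cpu, bool flags)
{
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;
}

/*
 * True if a hardirq handler that takes the same lock could fire on this
 * CPU right now and spin forever.
 */
static bool toy_irq_can_deadlock(const struct toy_cpu *cpu)
{
	return cpu->irqs_enabled && cpu->lock_held;
}
```

In this model the plain lock leaves a window where `toy_irq_can_deadlock()` is true, while the irqsave variant never does.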

>  	/* Schedule a tick so we can evict idle groups and schedule non-idle
>  	 * ones. This will also update runtime PM and devfreq busy/idle states,
>  	 * so the device can lower its frequency or get suspended.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v5-0-8836a74e0ef9@collabora.com?part=11

* Re: [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency
  2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
                   ` (15 preceding siblings ...)
  2026-06-25  9:36 ` [PATCH v5 16/16] drm/panthor: Automate CSG IRQ processing at group unbind time Boris Brezillon
@ 2026-06-25 12:43 ` Boris Brezillon
  16 siblings, 0 replies; 21+ messages in thread
From: Boris Brezillon @ 2026-06-25 12:43 UTC (permalink / raw)
  To: Steven Price, Liviu Dudau, Chia-I Wu
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, dri-devel, linux-kernel, sashiko-bot

On Thu, 25 Jun 2026 11:36:15 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> Right now, panthor is one of the rare drivers to signal fences
> from work items (not even from the threaded IRQ handler). We
> tried moving the job_completion check to hardirq handlers like
> other drivers do, but the duration of this handler gets
> slightly over the few usec (20+ usecs) we usually expect from
> hardirq handlers, and we're not sure we want to hold off the
> processing of other interrupts for that long. So this series
> just gets rid of the threaded-handler -> work_item indirection
> and checks for job completion (and thus, fence signalling)
> directly in the threaded handler.
> 
> Sorry for the high submission rate (v4 was sent this morning),
> but I'd like to get the remaining blockers out of the way, and
> sashiko keeps finding new legitimate issues :-).
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> Changes in v5:
> - Add a fix for a theoretical IOMEM access in suspended state (patch 1)
> - Make sure we don't delay a pending immediate tick in
>   sched_resume_tick() (patch 2)
> - Make sure we initialize panthor_irq::state properly in the irq_request
>   helper
> - Link to v4: https://lore.kernel.org/r/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@collabora.com
> 
> Changes in v4:
> - Add a bunch of fixes for bugs reported by sashiko
> - Link to v3: https://lore.kernel.org/r/20260623-panthor-signal-from-irq-v3-0-2ece396f8ee0@collabora.com
> 
> Changes in v3:
> - Save/restore the irq state in the raw handler.
> - Rename panthor_irq::mask_lock into panthor_irq::lock
> - Use the __always_inline specifier on
>   panthor_irq_default_threaded_handler()
> - Use devm_request_threaded_irq() even when the threaded handler is
>   NULL
> - Drop the patch that dynamically enables request-related interrupts
>   (FW-side race) after the polling period has expired
> - Don't process FW events from the hardirq handler (too heavy for a
>   hardirq handler according to our testing)
> - Link to v2: https://lore.kernel.org/r/20260512-panthor-signal-from-irq-v2-0-95c614a739cb@collabora.com
> 
> Changes in v2:
> - Fix commit message in patch 4
> - Move devm_kasprintf() before panthor_irq_resume() in patch 3
> - Fix erroneous lockdep_assert_held() in patch 6
> - Make sure events_lock is held when calling
>   csg_slot_sync_update_locked() in patch 6
> - Restore a csg_slot_sync_update_locked() call in patch 7
> - Fix a potential deadlock in patch 9
> - Drop the IRQ coalescing patch (formerly patch 10)
> - Change panthor_irq_request() so we don't have to define a dummy
>   threaded handler, and we can let RT kernels move the hard handler
>   to a thread
> - Add patches to transition GPU event processing to the hard IRQ handler
> - Link to v1: https://lore.kernel.org/r/20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com
> 
> ---
> Boris Brezillon (16):
>       drm/panthor: Fix theoretical IOMEM access in suspended state
>       drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()
>       drm/panthor: Fix panthor_pwr_unplug()
>       drm/panthor: Drop a needless check in panthor_fw_unplug()
>       drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced
>       drm/panthor: Interrupt group start/resumption if group_bind_locked() fails
>       drm/panthor: Keep interrupts masked until they are needed

If you're reviewing this version, please ignore anything coming before
patch 8, since those have been re-posted as a separate misc-fixes
series[1].

Thanks,

Boris

>       drm/panthor: Make panthor_irq::state a non-atomic field
>       drm/panthor: Move the register accessors before the IRQ helpers
>       drm/panthor: Replace the panthor_irq macro machinery by inline helpers
>       drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked()
>       drm/panthor: Get rid of panthor_group::fatal_lock
>       drm/panthor: Protect events processing with a separate spinlock
>       drm/panthor: Don't defer job completion checks
>       drm/panthor: Don't defer FW event processing
>       drm/panthor: Automate CSG IRQ processing at group unbind time

[1] https://lore.kernel.org/dri-devel/20260625-panthor-misc-fixes-v1-0-b67ed973fea6@collabora.com/T/#t

Thread overview: 21+ messages
2026-06-25  9:36 [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 01/16] drm/panthor: Fix theoretical IOMEM access in suspended state Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 02/16] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick() Boris Brezillon
2026-06-25 10:04   ` sashiko-bot
2026-06-25  9:36 ` [PATCH v5 03/16] drm/panthor: Fix panthor_pwr_unplug() Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 04/16] drm/panthor: Drop a needless check in panthor_fw_unplug() Boris Brezillon
2026-06-25 10:00   ` sashiko-bot
2026-06-25  9:36 ` [PATCH v5 05/16] drm/panthor: Fix a leak when a group is evicted before the tiler OOM is serviced Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 06/16] drm/panthor: Interrupt group start/resumption if group_bind_locked() fails Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 07/16] drm/panthor: Keep interrupts masked until they are needed Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 08/16] drm/panthor: Make panthor_irq::state a non-atomic field Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 09/16] drm/panthor: Move the register accessors before the IRQ helpers Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 10/16] drm/panthor: Replace the panthor_irq macro machinery by inline helpers Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 11/16] drm/panthor: Don't update might_have_idle_groups in process_idle_event_locked() Boris Brezillon
2026-06-25 10:06   ` sashiko-bot
2026-06-25  9:36 ` [PATCH v5 12/16] drm/panthor: Get rid of panthor_group::fatal_lock Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 13/16] drm/panthor: Protect events processing with a separate spinlock Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 14/16] drm/panthor: Don't defer job completion checks Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 15/16] drm/panthor: Don't defer FW event processing Boris Brezillon
2026-06-25  9:36 ` [PATCH v5 16/16] drm/panthor: Automate CSG IRQ processing at group unbind time Boris Brezillon
2026-06-25 12:43 ` [PATCH v5 00/16] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon