* [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Right now, panthor is one of the rare drivers that signal fences
from work items (not even from the threaded IRQ handler). We
could move the signalling to the threaded handler, but that would
still leave the latency caused by the scheduling of the IRQ thread.
Instead, this patchset moves all the JOB/GPU IRQ processing to
the raw IRQ handler, which is fine because all the current code
does there is demux the interrupts and defer the actual handling
to sub work items. The only non-trivial thing kept in the IRQ
path is the dma_fence signalling, which should be acceptable in
terms of CPU cycles burnt in IRQ context.
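For illustration, here is a minimal sketch of the shape this moves
towards: a hard IRQ handler that only acks the event and signals the
done fence. The example_* names are made up for this sketch and are
not the actual driver code:

    #include <linux/interrupt.h>
    #include <linux/dma-fence.h>
    #include <linux/io.h>

    struct example_irq_ctx {
            void __iomem *iomem;
            struct dma_fence *done_fence;
    };

    static irqreturn_t example_job_irq_handler(int irq, void *data)
    {
            struct example_irq_ctx *ctx = data;
            u32 status = readl(ctx->iomem + INT_STAT);

            if (!status)
                    return IRQ_NONE;

            writel(status, ctx->iomem + INT_CLEAR);

            /* Cheap, non-sleeping work only: fence signalling. */
            dma_fence_signal(ctx->done_fence);
            return IRQ_HANDLED;
    }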
Note that the MMU event handling is left in a threaded handler
because it requires acquiring sleepable locks, and fixing that
is non-trivial.
Only basic testing has been done so far, but glmark2 and gfxbench's
Manhattan test show a ~5% performance improvement on an rk3588 with
this patchset applied.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Changes in v2:
- Fix commit message in patch 4
- Move devm_kasprintf() before panthor_irq_resume() in patch 3
- Fix erroneous lockdep_assert_held() in patch 6
- Make sure events_lock is held when calling
csg_slot_sync_update_locked() in patch 6
- Restore a csg_slot_sync_update_locked() call in patch 7
- Fix a potential deadlock in patch 9
- Drop the IRQ coalescing patch (formerly patch 10)
- Change panthor_irq_request() so we don't have to define a dummy
threaded handler, and we can let RT kernels move the hard handler
to a thread
- Add patches to transition GPU event processing to the hard IRQ handler
- Link to v1: https://lore.kernel.org/r/20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com
---
Boris Brezillon (11):
drm/panthor: Make panthor_irq::state a non-atomic field
drm/panthor: Move the register accessors before the IRQ helpers
drm/panthor: Replace the panthor_irq macro machinery by inline helpers
drm/panthor: Extend the IRQ logic to allow fast/hard IRQ handlers
drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context
drm/panthor: Prepare the scheduler logic for FW events in IRQ context
drm/panthor: Automate CSG IRQ processing at group unbind time
drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks()
drm/panthor: Process FW events in IRQ context
drm/panthor: Use the irqsave variant of spin_lock in panthor_gpu_irq_handler()
drm/panthor: Process GPU events in IRQ context
drivers/gpu/drm/panthor/panthor_device.h | 281 +++++++++---------
drivers/gpu/drm/panthor/panthor_fw.c | 76 +++--
drivers/gpu/drm/panthor/panthor_fw.h | 9 +-
drivers/gpu/drm/panthor/panthor_gpu.c | 31 +-
drivers/gpu/drm/panthor/panthor_mmu.c | 38 +--
drivers/gpu/drm/panthor/panthor_pwr.c | 21 +-
drivers/gpu/drm/panthor/panthor_sched.c | 483 ++++++++++++++-----------------
7 files changed, 476 insertions(+), 463 deletions(-)
---
base-commit: ac5ac0acf11df04295eb1811066097b7022d6c7f
change-id: 20260429-panthor-signal-from-irq-d33684f4d292
Best regards,
--
Boris Brezillon <boris.brezillon@collabora.com>
* [PATCH v2 01/11] drm/panthor: Make panthor_irq::state a non-atomic field
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
The only place where panthor_irq::state is accessed without
panthor_irq::mask_lock held is in the prologue of _irq_suspend(),
which is not really a fast path. So let's simplify things by requiring
that panthor_irq::state always be accessed with the mask_lock held,
and add a scoped_guard() in _irq_suspend().
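For reference, guard()/scoped_guard() come from <linux/cleanup.h> and
drop the lock automatically at the end of the scope. A minimal sketch
of the pattern used below, with made-up example_* names:

    #include <linux/cleanup.h>
    #include <linux/interrupt.h>
    #include <linux/spinlock.h>

    enum example_irq_state {
            EXAMPLE_IRQ_ACTIVE,
            EXAMPLE_IRQ_SUSPENDING,
            EXAMPLE_IRQ_SUSPENDED,
    };

    struct example_irq {
            spinlock_t lock;
            int irq;
            enum example_irq_state state;
    };

    static void example_irq_suspend(struct example_irq *e)
    {
            scoped_guard(spinlock_irqsave, &e->lock)
                    e->state = EXAMPLE_IRQ_SUSPENDING; /* unlocked here */

            synchronize_irq(e->irq); /* may sleep, so outside the lock */

            scoped_guard(spinlock_irqsave, &e->lock)
                    e->state = EXAMPLE_IRQ_SUSPENDED;
    }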
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.h | 35 ++++++++++++++++----------------
1 file changed, 17 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 4e4607bca7cc..3f91ba73829d 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -101,8 +101,12 @@ struct panthor_irq {
*/
spinlock_t mask_lock;
- /** @state: one of &enum panthor_irq_state reflecting the current state. */
- atomic_t state;
+ /**
+ * @state: one of &enum panthor_irq_state reflecting the current state.
+ *
+ * Must be accessed with mask_lock held.
+ */
+ enum panthor_irq_state state;
};
/**
@@ -510,18 +514,15 @@ const char *panthor_exception_name(struct panthor_device *ptdev,
static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data) \
{ \
struct panthor_irq *pirq = data; \
- enum panthor_irq_state old_state; \
\
if (!gpu_read(pirq->iomem, INT_STAT)) \
return IRQ_NONE; \
\
guard(spinlock_irqsave)(&pirq->mask_lock); \
- old_state = atomic_cmpxchg(&pirq->state, \
- PANTHOR_IRQ_STATE_ACTIVE, \
- PANTHOR_IRQ_STATE_PROCESSING); \
- if (old_state != PANTHOR_IRQ_STATE_ACTIVE) \
+ if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE) \
return IRQ_NONE; \
\
+ pirq->state = PANTHOR_IRQ_STATE_PROCESSING; \
gpu_write(pirq->iomem, INT_MASK, 0); \
return IRQ_WAKE_THREAD; \
} \
@@ -551,13 +552,10 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
} \
\
scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
- enum panthor_irq_state old_state; \
- \
- old_state = atomic_cmpxchg(&pirq->state, \
- PANTHOR_IRQ_STATE_PROCESSING, \
- PANTHOR_IRQ_STATE_ACTIVE); \
- if (old_state == PANTHOR_IRQ_STATE_PROCESSING) \
+ if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) { \
+ pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
+ } \
} \
\
return ret; \
@@ -566,18 +564,19 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq) \
{ \
scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
- atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING); \
+ pirq->state = PANTHOR_IRQ_STATE_SUSPENDING; \
gpu_write(pirq->iomem, INT_MASK, 0); \
} \
synchronize_irq(pirq->irq); \
- atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED); \
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock) \
+ pirq->state = PANTHOR_IRQ_STATE_SUSPENDED; \
} \
\
static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq) \
{ \
guard(spinlock_irqsave)(&pirq->mask_lock); \
\
- atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE); \
+ pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \
gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
} \
@@ -610,7 +609,7 @@ static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *
* on the PROCESSING -> ACTIVE transition. \
* If the IRQ is suspended/suspending, the mask is restored at resume time. \
*/ \
- if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE) \
+ if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
} \
\
@@ -624,7 +623,7 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
* on the PROCESSING -> ACTIVE transition. \
* If the IRQ is suspended/suspending, the mask is restored at resume time. \
*/ \
- if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE) \
+ if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
}
--
2.54.0
* [PATCH v2 02/11] drm/panthor: Move the register accessors before the IRQ helpers
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
We're about to add an inline IRQ helper that uses gpu_read(). Move
things around to avoid forward declarations.
No functional changes.
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.h | 142 +++++++++++++++----------------
1 file changed, 71 insertions(+), 71 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 3f91ba73829d..768fc1992368 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -495,6 +495,77 @@ panthor_exception_is_fault(u32 exception_code)
const char *panthor_exception_name(struct panthor_device *ptdev,
u32 exception_code);
+static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
+{
+ writel(data, iomem + reg);
+}
+
+static inline u32 gpu_read(void __iomem *iomem, u32 reg)
+{
+ return readl(iomem + reg);
+}
+
+static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
+{
+ return readl_relaxed(iomem + reg);
+}
+
+static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
+{
+ gpu_write(iomem, reg, lower_32_bits(data));
+ gpu_write(iomem, reg + 4, upper_32_bits(data));
+}
+
+static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
+{
+ return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
+}
+
+static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
+{
+ return (gpu_read_relaxed(iomem, reg) |
+ ((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
+}
+
+static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
+{
+ u32 lo, hi1, hi2;
+ do {
+ hi1 = gpu_read(iomem, reg + 4);
+ lo = gpu_read(iomem, reg);
+ hi2 = gpu_read(iomem, reg + 4);
+ } while (hi1 != hi2);
+ return lo | ((u64)hi2 << 32);
+}
+
+#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
+ read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
+ iomem, reg)
+
+#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
+ timeout_us) \
+ read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
+ false, iomem, reg)
+
+#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
+ read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
+ iomem, reg)
+
+#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
+ timeout_us) \
+ read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
+ false, iomem, reg)
+
+#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
+ timeout_us) \
+ read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \
+ timeout_us, false, iomem, reg)
+
+#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \
+ timeout_us) \
+ read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
+ false, iomem, reg)
+
#define INT_RAWSTAT 0x0
#define INT_CLEAR 0x4
#define INT_MASK 0x8
@@ -629,75 +700,4 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
extern struct workqueue_struct *panthor_cleanup_wq;
-static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
-{
- writel(data, iomem + reg);
-}
-
-static inline u32 gpu_read(void __iomem *iomem, u32 reg)
-{
- return readl(iomem + reg);
-}
-
-static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
-{
- return readl_relaxed(iomem + reg);
-}
-
-static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
-{
- gpu_write(iomem, reg, lower_32_bits(data));
- gpu_write(iomem, reg + 4, upper_32_bits(data));
-}
-
-static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
-{
- return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
-}
-
-static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
-{
- return (gpu_read_relaxed(iomem, reg) |
- ((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
-}
-
-static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
-{
- u32 lo, hi1, hi2;
- do {
- hi1 = gpu_read(iomem, reg + 4);
- lo = gpu_read(iomem, reg);
- hi2 = gpu_read(iomem, reg + 4);
- } while (hi1 != hi2);
- return lo | ((u64)hi2 << 32);
-}
-
-#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
- read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
- iomem, reg)
-
-#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
- timeout_us) \
- read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
- false, iomem, reg)
-
-#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
- read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
- iomem, reg)
-
-#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
- timeout_us) \
- read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
- false, iomem, reg)
-
-#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
- timeout_us) \
- read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \
- timeout_us, false, iomem, reg)
-
-#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \
- timeout_us) \
- read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
- false, iomem, reg)
-
#endif
--
2.54.0
* [PATCH v2 03/11] drm/panthor: Replace the panthor_irq macro machinery by inline helpers
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Now that panthor_irq contains the iomem region, there's no real need
for the macro-based panthor_irq helper generation logic. We can just
provide inline helpers that do the same thing and let the compiler
optimize away the indirect function calls. The only extra annoyance is
that we have to open-code the panthor_xxx_irq_threaded_handler()
implementations, but those are single-line functions, so it's
acceptable.
While at it, change the prototype of the IRQ handlers to take a
panthor_irq instead of a panthor_device, since the panthor_irq is
what gets passed around in this code, and the panthor_device can be
retrieved from it directly.
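The compiler trick being relied on, as a generic sketch (struct ctx,
read_status() and foo_handler() are made-up names): once the inline
generic handler is expanded into a wrapper that passes a known
function, the indirect call becomes a direct, potentially inlined, one:

    #include <linux/interrupt.h>
    #include <linux/io.h>

    struct ctx { void __iomem *iomem; };

    static u32 read_status(struct ctx *c)
    {
            return readl(c->iomem); /* RAWSTAT at offset 0 */
    }

    static inline irqreturn_t
    generic_threaded_handler(void *data, void (*handler)(struct ctx *, u32))
    {
            struct ctx *c = data;
            u32 status = read_status(c);

            if (!status)
                    return IRQ_NONE;

            handler(c, status); /* direct call after inlining */
            return IRQ_HANDLED;
    }

    static void foo_handler(struct ctx *c, u32 status) { /* ... */ }

    static irqreturn_t foo_threaded_handler(int irq, void *data)
    {
            return generic_threaded_handler(data, foo_handler);
    }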
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.h | 245 +++++++++++++++----------------
drivers/gpu/drm/panthor/panthor_fw.c | 22 ++-
drivers/gpu/drm/panthor/panthor_gpu.c | 26 ++--
drivers/gpu/drm/panthor/panthor_mmu.c | 37 ++---
drivers/gpu/drm/panthor/panthor_pwr.c | 20 ++-
5 files changed, 183 insertions(+), 167 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 768fc1992368..393fcda73d88 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -571,131 +571,126 @@ static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
#define INT_MASK 0x8
#define INT_STAT 0xc
-/**
- * PANTHOR_IRQ_HANDLER() - Define interrupt handlers and the interrupt
- * registration function.
- *
- * The boiler-plate to gracefully deal with shared interrupts is
- * auto-generated. All you have to do is call PANTHOR_IRQ_HANDLER()
- * just after the actual handler. The handler prototype is:
- *
- * void (*handler)(struct panthor_device *, u32 status);
- */
-#define PANTHOR_IRQ_HANDLER(__name, __handler) \
-static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data) \
-{ \
- struct panthor_irq *pirq = data; \
- \
- if (!gpu_read(pirq->iomem, INT_STAT)) \
- return IRQ_NONE; \
- \
- guard(spinlock_irqsave)(&pirq->mask_lock); \
- if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE) \
- return IRQ_NONE; \
- \
- pirq->state = PANTHOR_IRQ_STATE_PROCESSING; \
- gpu_write(pirq->iomem, INT_MASK, 0); \
- return IRQ_WAKE_THREAD; \
-} \
- \
-static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *data) \
-{ \
- struct panthor_irq *pirq = data; \
- struct panthor_device *ptdev = pirq->ptdev; \
- irqreturn_t ret = IRQ_NONE; \
- \
- while (true) { \
- /* It's safe to access pirq->mask without the lock held here. If a new \
- * event gets added to the mask and the corresponding IRQ is pending, \
- * we'll process it right away instead of adding an extra raw -> threaded \
- * round trip. If an event is removed and the status bit is set, it will \
- * be ignored, just like it would have been if the mask had been adjusted \
- * right before the HW event kicks in. TLDR; it's all expected races we're \
- * covered for. \
- */ \
- u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask; \
- \
- if (!status) \
- break; \
- \
- __handler(ptdev, status); \
- ret = IRQ_HANDLED; \
- } \
- \
- scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
- if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) { \
- pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
- gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
- } \
- } \
- \
- return ret; \
-} \
- \
-static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq) \
-{ \
- scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
- pirq->state = PANTHOR_IRQ_STATE_SUSPENDING; \
- gpu_write(pirq->iomem, INT_MASK, 0); \
- } \
- synchronize_irq(pirq->irq); \
- scoped_guard(spinlock_irqsave, &pirq->mask_lock) \
- pirq->state = PANTHOR_IRQ_STATE_SUSPENDED; \
-} \
- \
-static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq) \
-{ \
- guard(spinlock_irqsave)(&pirq->mask_lock); \
- \
- pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
- gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \
- gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
-} \
- \
-static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev, \
- struct panthor_irq *pirq, \
- int irq, u32 mask, void __iomem *iomem) \
-{ \
- pirq->ptdev = ptdev; \
- pirq->irq = irq; \
- pirq->mask = mask; \
- pirq->iomem = iomem; \
- spin_lock_init(&pirq->mask_lock); \
- panthor_ ## __name ## _irq_resume(pirq); \
- \
- return devm_request_threaded_irq(ptdev->base.dev, irq, \
- panthor_ ## __name ## _irq_raw_handler, \
- panthor_ ## __name ## _irq_threaded_handler, \
- IRQF_SHARED, KBUILD_MODNAME "-" # __name, \
- pirq); \
-} \
- \
-static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask) \
-{ \
- guard(spinlock_irqsave)(&pirq->mask_lock); \
- pirq->mask |= mask; \
- \
- /* The only situation where we need to write the new mask is if the IRQ is active. \
- * If it's being processed, the mask will be restored for us in _irq_threaded_handler() \
- * on the PROCESSING -> ACTIVE transition. \
- * If the IRQ is suspended/suspending, the mask is restored at resume time. \
- */ \
- if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
- gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
-} \
- \
-static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
-{ \
- guard(spinlock_irqsave)(&pirq->mask_lock); \
- pirq->mask &= ~mask; \
- \
- /* The only situation where we need to write the new mask is if the IRQ is active. \
- * If it's being processed, the mask will be restored for us in _irq_threaded_handler() \
- * on the PROCESSING -> ACTIVE transition. \
- * If the IRQ is suspended/suspending, the mask is restored at resume time. \
- */ \
- if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
- gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
+static inline irqreturn_t panthor_irq_default_raw_handler(int irq, void *data)
+{
+ struct panthor_irq *pirq = data;
+
+ if (!gpu_read(pirq->iomem, INT_STAT))
+ return IRQ_NONE;
+
+ guard(spinlock_irqsave)(&pirq->mask_lock);
+ if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
+ return IRQ_NONE;
+
+ pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
+ gpu_write(pirq->iomem, INT_MASK, 0);
+ return IRQ_WAKE_THREAD;
+}
+
+static inline irqreturn_t
+panthor_irq_default_threaded_handler(void *data,
+ void (*slow_handler)(struct panthor_irq *, u32))
+{
+ struct panthor_irq *pirq = data;
+ irqreturn_t ret = IRQ_NONE;
+
+ while (true) {
+ /* It's safe to access pirq->mask without the lock held here. If a new
+ * event gets added to the mask and the corresponding IRQ is pending,
+ * we'll process it right away instead of adding an extra raw -> threaded
+ * round trip. If an event is removed and the status bit is set, it will
+ * be ignored, just like it would have been if the mask had been adjusted
+ * right before the HW event kicks in. TLDR; it's all expected races we're
+ * covered for.
+ */
+ u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;
+
+ if (!status)
+ break;
+
+ slow_handler(pirq, status);
+ ret = IRQ_HANDLED;
+ }
+
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
+ if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) {
+ pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
+ gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+ }
+ }
+
+ return ret;
+}
+
+static inline void panthor_irq_suspend(struct panthor_irq *pirq)
+{
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
+ pirq->state = PANTHOR_IRQ_STATE_SUSPENDING;
+ gpu_write(pirq->iomem, INT_MASK, 0);
+ }
+ synchronize_irq(pirq->irq);
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock)
+ pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;
+}
+
+static inline void panthor_irq_resume(struct panthor_irq *pirq)
+{
+ guard(spinlock_irqsave)(&pirq->mask_lock);
+ pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
+ gpu_write(pirq->iomem, INT_CLEAR, pirq->mask);
+ gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline void panthor_irq_enable_events(struct panthor_irq *pirq, u32 mask)
+{
+ guard(spinlock_irqsave)(&pirq->mask_lock);
+ pirq->mask |= mask;
+
+ /* The only situation where we need to write the new mask is if the IRQ is active.
+ * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
+ * on the PROCESSING -> ACTIVE transition.
+ * If the IRQ is suspended/suspending, the mask is restored at resume time.
+ */
+ if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
+ gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u32 mask)
+{
+ guard(spinlock_irqsave)(&pirq->mask_lock);
+ pirq->mask &= ~mask;
+
+ /* The only situation where we need to write the new mask is if the IRQ is active.
+ * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
+ * on the PROCESSING -> ACTIVE transition.
+ * If the IRQ is suspended/suspending, the mask is restored at resume time.
+ */
+ if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
+ gpu_write(pirq->iomem, INT_MASK, pirq->mask);
+}
+
+static inline int
+panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
+ int irq, u32 mask, void __iomem *iomem, const char *name,
+ irqreturn_t (*threaded_handler)(int, void *data))
+{
+ const char *full_name;
+
+ pirq->ptdev = ptdev;
+ pirq->irq = irq;
+ pirq->mask = mask;
+ pirq->iomem = iomem;
+ spin_lock_init(&pirq->mask_lock);
+
+ full_name = devm_kasprintf(ptdev->base.dev, GFP_KERNEL, KBUILD_MODNAME "-%s", name);
+ if (!full_name)
+ return -ENOMEM;
+
+ panthor_irq_resume(pirq);
+ return devm_request_threaded_irq(ptdev->base.dev, irq,
+ panthor_irq_default_raw_handler,
+ threaded_handler,
+ IRQF_SHARED, full_name, pirq);
}
extern struct workqueue_struct *panthor_cleanup_wq;
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 986151681b24..eaf599b0a887 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1064,8 +1064,9 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
msecs_to_jiffies(PING_INTERVAL_MS));
}
-static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_job_irq_handler(struct panthor_irq *pirq, u32 status)
{
+ struct panthor_device *ptdev = pirq->ptdev;
u32 duration;
u64 start = 0;
@@ -1091,7 +1092,11 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
trace_gpu_job_irq(ptdev->base.dev, status, duration);
}
}
-PANTHOR_IRQ_HANDLER(job, panthor_job_irq_handler);
+
+static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data)
+{
+ return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler);
+}
static int panthor_fw_start(struct panthor_device *ptdev)
{
@@ -1099,8 +1104,8 @@ static int panthor_fw_start(struct panthor_device *ptdev)
bool timedout = false;
ptdev->fw->booted = false;
- panthor_job_irq_enable_events(&ptdev->fw->irq, ~0);
- panthor_job_irq_resume(&ptdev->fw->irq);
+ panthor_irq_enable_events(&ptdev->fw->irq, ~0);
+ panthor_irq_resume(&ptdev->fw->irq);
gpu_write(fw->iomem, MCU_CONTROL, MCU_CONTROL_AUTO);
if (!wait_event_timeout(ptdev->fw->req_waitqueue,
@@ -1210,7 +1215,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
ptdev->reset.fast = true;
}
- panthor_job_irq_suspend(&ptdev->fw->irq);
+ panthor_irq_suspend(&ptdev->fw->irq);
panthor_fw_stop(ptdev);
}
@@ -1280,7 +1285,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) {
/* Make sure the IRQ handler cannot be called after that point. */
if (ptdev->fw->irq.irq)
- panthor_job_irq_suspend(&ptdev->fw->irq);
+ panthor_irq_suspend(&ptdev->fw->irq);
panthor_fw_stop(ptdev);
}
@@ -1476,8 +1481,9 @@ int panthor_fw_init(struct panthor_device *ptdev)
if (irq <= 0)
return -ENODEV;
- ret = panthor_request_job_irq(ptdev, &fw->irq, irq, 0,
- ptdev->iomem + JOB_INT_BASE);
+ ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
+ ptdev->iomem + JOB_INT_BASE, "job",
+ panthor_job_irq_threaded_handler);
if (ret) {
drm_err(&ptdev->base, "failed to request job irq");
return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index e52c5675981f..ce208e384762 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -86,8 +86,9 @@ static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config);
}
-static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
{
+ struct panthor_device *ptdev = pirq->ptdev;
struct panthor_gpu *gpu = ptdev->gpu;
gpu_write(gpu->irq.iomem, INT_CLEAR, status);
@@ -116,7 +117,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
}
spin_unlock(&ptdev->gpu->reqs_lock);
}
-PANTHOR_IRQ_HANDLER(gpu, panthor_gpu_irq_handler);
+
+static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
+{
+ return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler);
+}
/**
* panthor_gpu_unplug() - Called when the GPU is unplugged.
@@ -128,7 +133,7 @@ void panthor_gpu_unplug(struct panthor_device *ptdev)
/* Make sure the IRQ handler is not running after that point. */
if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
- panthor_gpu_irq_suspend(&ptdev->gpu->irq);
+ panthor_irq_suspend(&ptdev->gpu->irq);
/* Wake-up all waiters. */
spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags);
@@ -169,9 +174,10 @@ int panthor_gpu_init(struct panthor_device *ptdev)
if (irq < 0)
return irq;
- ret = panthor_request_gpu_irq(ptdev, &ptdev->gpu->irq, irq,
- GPU_INTERRUPTS_MASK,
- ptdev->iomem + GPU_INT_BASE);
+ ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
+ GPU_INTERRUPTS_MASK,
+ ptdev->iomem + GPU_INT_BASE, "gpu",
+ panthor_gpu_irq_threaded_handler);
if (ret)
return ret;
@@ -182,7 +188,7 @@ int panthor_gpu_power_changed_on(struct panthor_device *ptdev)
{
guard(pm_runtime_active)(ptdev->base.dev);
- panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+ panthor_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
return 0;
}
@@ -191,7 +197,7 @@ void panthor_gpu_power_changed_off(struct panthor_device *ptdev)
{
guard(pm_runtime_active)(ptdev->base.dev);
- panthor_gpu_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
+ panthor_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
}
/**
@@ -424,7 +430,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
else
panthor_hw_l2_power_off(ptdev);
- panthor_gpu_irq_suspend(&ptdev->gpu->irq);
+ panthor_irq_suspend(&ptdev->gpu->irq);
}
/**
@@ -436,7 +442,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
*/
void panthor_gpu_resume(struct panthor_device *ptdev)
{
- panthor_gpu_irq_resume(&ptdev->gpu->irq);
+ panthor_irq_resume(&ptdev->gpu->irq);
panthor_hw_l2_power_on(ptdev);
}
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 452d0b6d4668..375022fb3fd8 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -586,17 +586,13 @@ static u32 panthor_mmu_as_fault_mask(struct panthor_device *ptdev, u32 as)
return BIT(as);
}
-/* Forward declaration to call helpers within as_enable/disable */
-static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status);
-PANTHOR_IRQ_HANDLER(mmu, panthor_mmu_irq_handler);
-
static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
u64 transtab, u64 transcfg, u64 memattr)
{
struct panthor_mmu *mmu = ptdev->mmu;
- panthor_mmu_irq_enable_events(&ptdev->mmu->irq,
- panthor_mmu_as_fault_mask(ptdev, as_nr));
+ panthor_irq_enable_events(&ptdev->mmu->irq,
+ panthor_mmu_as_fault_mask(ptdev, as_nr));
gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), transtab);
gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), memattr);
@@ -614,8 +610,8 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr,
lockdep_assert_held(&ptdev->mmu->as.slots_lock);
- panthor_mmu_irq_disable_events(&ptdev->mmu->irq,
- panthor_mmu_as_fault_mask(ptdev, as_nr));
+ panthor_irq_disable_events(&ptdev->mmu->irq,
+ panthor_mmu_as_fault_mask(ptdev, as_nr));
/* Flush+invalidate RW caches, invalidate RO ones. */
ret = panthor_gpu_flush_caches(ptdev, CACHE_CLEAN | CACHE_INV,
@@ -1785,8 +1781,9 @@ static void panthor_vm_unlock_region(struct panthor_vm *vm)
mutex_unlock(&ptdev->mmu->as.slots_lock);
}
-static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_mmu_irq_handler(struct panthor_irq *pirq, u32 status)
{
+ struct panthor_device *ptdev = pirq->ptdev;
struct panthor_mmu *mmu = ptdev->mmu;
bool has_unhandled_faults = false;
@@ -1849,6 +1846,11 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
panthor_sched_report_mmu_fault(ptdev);
}
+static irqreturn_t panthor_mmu_irq_threaded_handler(int irq, void *data)
+{
+ return panthor_irq_default_threaded_handler(data, panthor_mmu_irq_handler);
+}
+
/**
* panthor_mmu_suspend() - Suspend the MMU logic
* @ptdev: Device.
@@ -1873,7 +1875,7 @@ void panthor_mmu_suspend(struct panthor_device *ptdev)
}
mutex_unlock(&ptdev->mmu->as.slots_lock);
- panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+ panthor_irq_suspend(&ptdev->mmu->irq);
}
/**
@@ -1892,7 +1894,7 @@ void panthor_mmu_resume(struct panthor_device *ptdev)
ptdev->mmu->as.faulty_mask = 0;
mutex_unlock(&ptdev->mmu->as.slots_lock);
- panthor_mmu_irq_resume(&ptdev->mmu->irq);
+ panthor_irq_resume(&ptdev->mmu->irq);
}
/**
@@ -1909,7 +1911,7 @@ void panthor_mmu_pre_reset(struct panthor_device *ptdev)
{
struct panthor_vm *vm;
- panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+ panthor_irq_suspend(&ptdev->mmu->irq);
mutex_lock(&ptdev->mmu->vm.lock);
ptdev->mmu->vm.reset_in_progress = true;
@@ -1946,7 +1948,7 @@ void panthor_mmu_post_reset(struct panthor_device *ptdev)
mutex_unlock(&ptdev->mmu->as.slots_lock);
- panthor_mmu_irq_resume(&ptdev->mmu->irq);
+ panthor_irq_resume(&ptdev->mmu->irq);
/* Restart the VM_BIND queues. */
mutex_lock(&ptdev->mmu->vm.lock);
@@ -3207,7 +3209,7 @@ panthor_mmu_reclaim_priv_bos(struct panthor_device *ptdev,
void panthor_mmu_unplug(struct panthor_device *ptdev)
{
if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
- panthor_mmu_irq_suspend(&ptdev->mmu->irq);
+ panthor_irq_suspend(&ptdev->mmu->irq);
mutex_lock(&ptdev->mmu->as.slots_lock);
for (u32 i = 0; i < ARRAY_SIZE(ptdev->mmu->as.slots); i++) {
@@ -3261,9 +3263,10 @@ int panthor_mmu_init(struct panthor_device *ptdev)
if (irq <= 0)
return -ENODEV;
- ret = panthor_request_mmu_irq(ptdev, &mmu->irq, irq,
- panthor_mmu_fault_mask(ptdev, ~0),
- ptdev->iomem + MMU_INT_BASE);
+ ret = panthor_irq_request(ptdev, &mmu->irq, irq,
+ panthor_mmu_fault_mask(ptdev, ~0),
+ ptdev->iomem + MMU_INT_BASE, "mmu",
+ panthor_mmu_irq_threaded_handler);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index 7c7f424a1436..80cf78007896 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -56,8 +56,9 @@ struct panthor_pwr {
wait_queue_head_t reqs_acked;
};
-static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
+static void panthor_pwr_irq_handler(struct panthor_irq *pirq, u32 status)
{
+ struct panthor_device *ptdev = pirq->ptdev;
struct panthor_pwr *pwr = ptdev->pwr;
spin_lock(&ptdev->pwr->reqs_lock);
@@ -75,7 +76,11 @@ static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
}
spin_unlock(&ptdev->pwr->reqs_lock);
}
-PANTHOR_IRQ_HANDLER(pwr, panthor_pwr_irq_handler);
+
+static irqreturn_t panthor_pwr_irq_threaded_handler(int irq, void *data)
+{
+ return panthor_irq_default_threaded_handler(data, panthor_pwr_irq_handler);
+}
static void panthor_pwr_write_command(struct panthor_device *ptdev, u32 command, u64 args)
{
@@ -453,7 +458,7 @@ void panthor_pwr_unplug(struct panthor_device *ptdev)
return;
/* Make sure the IRQ handler is not running after that point. */
- panthor_pwr_irq_suspend(&ptdev->pwr->irq);
+ panthor_irq_suspend(&ptdev->pwr->irq);
/* Wake-up all waiters. */
spin_lock_irqsave(&ptdev->pwr->reqs_lock, flags);
@@ -483,9 +488,10 @@ int panthor_pwr_init(struct panthor_device *ptdev)
if (irq < 0)
return irq;
- err = panthor_request_pwr_irq(
+ err = panthor_irq_request(
ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK,
- pwr->iomem + PWR_INT_BASE);
+ pwr->iomem + PWR_INT_BASE, "pwr",
+ panthor_pwr_irq_threaded_handler);
if (err)
return err;
@@ -564,7 +570,7 @@ void panthor_pwr_suspend(struct panthor_device *ptdev)
if (!ptdev->pwr)
return;
- panthor_pwr_irq_suspend(&ptdev->pwr->irq);
+ panthor_irq_suspend(&ptdev->pwr->irq);
}
void panthor_pwr_resume(struct panthor_device *ptdev)
@@ -572,5 +578,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev)
if (!ptdev->pwr)
return;
- panthor_pwr_irq_resume(&ptdev->pwr->irq);
+ panthor_irq_resume(&ptdev->pwr->irq);
}
--
2.54.0
* [PATCH v2 04/11] drm/panthor: Extend the IRQ logic to allow fast/hard IRQ handlers
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
All drivers except panthor signal their fences from their interrupt
handler to minimize latency. We could do the same from the threaded
handler, but the latency is still quite high in that case, so let's
allow components to choose the context they want their IRQ handler
to run in by exposing support for custom hard handlers.
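With this in place, a component can eventually register a custom hard
handler, something like the sketch below. panthor_job_irq_raw_handler()
is hypothetical at this point (the real one only lands later in the
series); passing a NULL threaded handler makes panthor_irq_request()
fall back to devm_request_irq():

    static irqreturn_t panthor_job_irq_raw_handler(int irq, void *data)
    {
            struct panthor_irq *pirq = data;
            u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;

            if (!status)
                    return IRQ_NONE;

            gpu_write(pirq->iomem, INT_CLEAR, status);
            /* Fast, non-sleeping event processing + fence signalling. */
            return IRQ_HANDLED;
    }

    ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
                              ptdev->iomem + JOB_INT_BASE, "job",
                              panthor_job_irq_raw_handler, NULL);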
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_device.h | 11 ++++++++---
drivers/gpu/drm/panthor/panthor_fw.c | 1 +
drivers/gpu/drm/panthor/panthor_gpu.c | 1 +
drivers/gpu/drm/panthor/panthor_mmu.c | 1 +
drivers/gpu/drm/panthor/panthor_pwr.c | 1 +
5 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 393fcda73d88..1aaf06df875b 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -672,6 +672,7 @@ static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u32 mask
static inline int
panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
int irq, u32 mask, void __iomem *iomem, const char *name,
+ irqreturn_t (*raw_handler)(int, void *data),
irqreturn_t (*threaded_handler)(int, void *data))
{
const char *full_name;
@@ -687,9 +688,13 @@ panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
return -ENOMEM;
panthor_irq_resume(pirq);
- return devm_request_threaded_irq(ptdev->base.dev, irq,
- panthor_irq_default_raw_handler,
- threaded_handler,
+
+ if (!threaded_handler) {
+ return devm_request_irq(ptdev->base.dev, irq, raw_handler,
+ IRQF_SHARED, full_name, pirq);
+ }
+
+ return devm_request_threaded_irq(ptdev->base.dev, irq, raw_handler, threaded_handler,
IRQF_SHARED, full_name, pirq);
}
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index eaf599b0a887..8239a6951569 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1483,6 +1483,7 @@ int panthor_fw_init(struct panthor_device *ptdev)
ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
ptdev->iomem + JOB_INT_BASE, "job",
+ panthor_irq_default_raw_handler,
panthor_job_irq_threaded_handler);
if (ret) {
drm_err(&ptdev->base, "failed to request job irq");
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index ce208e384762..d0be758ea3e1 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -177,6 +177,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
GPU_INTERRUPTS_MASK,
ptdev->iomem + GPU_INT_BASE, "gpu",
+ panthor_irq_default_raw_handler,
panthor_gpu_irq_threaded_handler);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 375022fb3fd8..2955b8baa2e2 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -3266,6 +3266,7 @@ int panthor_mmu_init(struct panthor_device *ptdev)
ret = panthor_irq_request(ptdev, &mmu->irq, irq,
panthor_mmu_fault_mask(ptdev, ~0),
ptdev->iomem + MMU_INT_BASE, "mmu",
+ panthor_irq_default_raw_handler,
panthor_mmu_irq_threaded_handler);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
index 80cf78007896..1efb7f3482ba 100644
--- a/drivers/gpu/drm/panthor/panthor_pwr.c
+++ b/drivers/gpu/drm/panthor/panthor_pwr.c
@@ -491,6 +491,7 @@ int panthor_pwr_init(struct panthor_device *ptdev)
err = panthor_irq_request(
ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK,
pwr->iomem + PWR_INT_BASE, "pwr",
+ panthor_irq_default_raw_handler,
panthor_pwr_irq_threaded_handler);
if (err)
return err;
--
2.54.0
* [PATCH v2 05/11] drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
If we want some FW events to be processed in the interrupt path, the
helpers manipulating the req registers need to be IRQ-safe, which
implies using spin_lock_irqsave() instead of spin_lock(). While at it,
use guards instead of plain spin_lock()/spin_unlock() calls.
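The failure mode this avoids, sketched as a hypothetical trace on a
single CPU:

    spin_lock(&iface->lock);    /* process context, IRQs still enabled */
        /* hard IRQ fires here; the handler ends up in            */
        /* panthor_fw_update_reqs() -> spin_lock(&iface->lock),   */
        /* spinning on a lock this CPU already holds: deadlock.   */
    spin_unlock(&iface->lock);

guard(spinlock_irqsave) disables local interrupts while the lock is
held, so the handler can't preempt a holder on the same CPU.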
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_fw.h | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor/panthor_fw.h
index a99a9b6f4825..e56b7fe15bb3 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.h
+++ b/drivers/gpu/drm/panthor/panthor_fw.h
@@ -432,12 +432,11 @@ struct panthor_fw_global_iface {
#define panthor_fw_toggle_reqs(__iface, __in_reg, __out_reg, __mask) \
do { \
u32 __cur_val, __new_val, __out_val; \
- spin_lock(&(__iface)->lock); \
+ guard(spinlock_irqsave)(&(__iface)->lock); \
__cur_val = READ_ONCE((__iface)->input->__in_reg); \
__out_val = READ_ONCE((__iface)->output->__out_reg); \
__new_val = ((__out_val ^ (__mask)) & (__mask)) | (__cur_val & ~(__mask)); \
WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
- spin_unlock(&(__iface)->lock); \
} while (0)
/**
@@ -458,21 +457,19 @@ struct panthor_fw_global_iface {
#define panthor_fw_update_reqs(__iface, __in_reg, __val, __mask) \
do { \
u32 __cur_val, __new_val; \
- spin_lock(&(__iface)->lock); \
+ guard(spinlock_irqsave)(&(__iface)->lock); \
__cur_val = READ_ONCE((__iface)->input->__in_reg); \
__new_val = (__cur_val & ~(__mask)) | ((__val) & (__mask)); \
WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
- spin_unlock(&(__iface)->lock); \
} while (0)
#define panthor_fw_update_reqs64(__iface, __in_reg, __val, __mask) \
do { \
u64 __cur_val, __new_val; \
- spin_lock(&(__iface)->lock); \
+ guard(spinlock_irqsave)(&(__iface)->lock); \
__cur_val = READ_ONCE((__iface)->input->__in_reg); \
__new_val = (__cur_val & ~(__mask)) | ((__val) & (__mask)); \
WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
- spin_unlock(&(__iface)->lock); \
} while (0)
struct panthor_fw_global_iface *
--
2.54.0
* [PATCH v2 06/11] drm/panthor: Prepare the scheduler logic for FW events in IRQ context
From: Boris Brezillon @ 2026-05-12 11:37 UTC
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Add a dedicated spinlock for event processing, and process events
directly in the panthor_sched_report_fw_events() path rather than
deferring them to a work item. We also fast-track fence signalling by
making the job completion logic IRQ-safe.
Note that this requires turning a couple of spin_lock() calls into
spin_lock_irqsave() when those locks are taken inside an events_lock
section.
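For context, the reworked completion path wraps the fence signalling
in the dma_fence signalling annotations so lockdep can validate it. A
minimal sketch of that pattern, with made-up example_* names:

    #include <linux/dma-fence.h>

    struct example_queue {
            struct dma_fence *done_fence;
    };

    static void example_complete_jobs(struct example_queue *q)
    {
            bool cookie = dma_fence_begin_signalling();

            /* Lockdep checks that nothing in this section can
             * deadlock against the dma_fence signalling path.
             */
            dma_fence_signal(q->done_fence);
            dma_fence_end_signalling(cookie);
    }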
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_sched.c | 332 +++++++++++++++-----------------
1 file changed, 155 insertions(+), 177 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 5b34032deff8..fbf76b59b7ef 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -177,18 +177,6 @@ struct panthor_scheduler {
*/
struct work_struct sync_upd_work;
- /**
- * @fw_events_work: Work used to process FW events outside the interrupt path.
- *
- * Even if the interrupt is threaded, we need any event processing
- * that require taking the panthor_scheduler::lock to be processed
- * outside the interrupt path so we don't block the tick logic when
- * it calls panthor_fw_{csg,wait}_wait_acks(). Since most of the
- * event processing requires taking this lock, we just delegate all
- * FW event processing to the scheduler workqueue.
- */
- struct work_struct fw_events_work;
-
/**
* @fw_events: Bitmask encoding pending FW events.
*/
@@ -254,6 +242,15 @@ struct panthor_scheduler {
struct list_head waiting;
} groups;
+ /**
+ * @events_lock: Lock taken when processing events.
+ *
+ * This also needs to be taken when csg_slots are updated, to make sure
+ * the event processing logic doesn't touch groups that have left the CSG
+ * slot.
+ */
+ spinlock_t events_lock;
+
/**
* @csg_slots: FW command stream group slots.
*/
@@ -676,9 +673,6 @@ struct panthor_group {
*/
struct panthor_kernel_bo *protm_suspend_buf;
- /** @sync_upd_work: Work used to check/signal job fences. */
- struct work_struct sync_upd_work;
-
/** @tiler_oom_work: Work used to process tiler OOM events happening on this group. */
struct work_struct tiler_oom_work;
@@ -999,7 +993,6 @@ static int
group_bind_locked(struct panthor_group *group, u32 csg_id)
{
struct panthor_device *ptdev = group->ptdev;
- struct panthor_csg_slot *csg_slot;
int ret;
lockdep_assert_held(&ptdev->scheduler->lock);
@@ -1012,9 +1005,7 @@ group_bind_locked(struct panthor_group *group, u32 csg_id)
if (ret)
return ret;
- csg_slot = &ptdev->scheduler->csg_slots[csg_id];
group_get(group);
- group->csg_id = csg_id;
/* Dummy doorbell allocation: doorbell is assigned to the group and
* all queues use the same doorbell.
@@ -1026,7 +1017,10 @@ group_bind_locked(struct panthor_group *group, u32 csg_id)
for (u32 i = 0; i < group->queue_count; i++)
group->queues[i]->doorbell_id = csg_id + 1;
- csg_slot->group = group;
+ scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
+ ptdev->scheduler->csg_slots[csg_id].group = group;
+ group->csg_id = csg_id;
+ }
return 0;
}
@@ -1041,7 +1035,6 @@ static int
group_unbind_locked(struct panthor_group *group)
{
struct panthor_device *ptdev = group->ptdev;
- struct panthor_csg_slot *slot;
lockdep_assert_held(&ptdev->scheduler->lock);
@@ -1051,9 +1044,12 @@ group_unbind_locked(struct panthor_group *group)
if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
return -EINVAL;
- slot = &ptdev->scheduler->csg_slots[group->csg_id];
+ scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
+ ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
+ group->csg_id = -1;
+ }
+
panthor_vm_idle(group->vm);
- group->csg_id = -1;
/* Tiler OOM events will be re-issued next time the group is scheduled. */
atomic_set(&group->tiler_oom, 0);
@@ -1062,8 +1058,6 @@ group_unbind_locked(struct panthor_group *group)
for (u32 i = 0; i < group->queue_count; i++)
group->queues[i]->doorbell_id = -1;
- slot->group = NULL;
-
group_put(group);
return 0;
}
@@ -1151,16 +1145,14 @@ queue_suspend_timeout_locked(struct panthor_queue *queue)
static void
queue_suspend_timeout(struct panthor_queue *queue)
{
- spin_lock(&queue->fence_ctx.lock);
+ guard(spinlock_irqsave)(&queue->fence_ctx.lock);
queue_suspend_timeout_locked(queue);
- spin_unlock(&queue->fence_ctx.lock);
}
static void
queue_resume_timeout(struct panthor_queue *queue)
{
- spin_lock(&queue->fence_ctx.lock);
-
+ guard(spinlock_irqsave)(&queue->fence_ctx.lock);
if (queue_timeout_is_suspended(queue)) {
mod_delayed_work(queue->scheduler.timeout_wq,
&queue->timeout.work,
@@ -1168,8 +1160,6 @@ queue_resume_timeout(struct panthor_queue *queue)
queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
}
-
- spin_unlock(&queue->fence_ctx.lock);
}
/**
@@ -1484,7 +1474,7 @@ cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
u32 fatal;
u64 info;
- lockdep_assert_held(&sched->lock);
+ lockdep_assert_held(&sched->events_lock);
cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
fatal = cs_iface->output->fatal;
@@ -1532,7 +1522,7 @@ cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
u32 fault;
u64 info;
- lockdep_assert_held(&sched->lock);
+ lockdep_assert_held(&sched->events_lock);
cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
fault = cs_iface->output->fault;
@@ -1542,7 +1532,7 @@ cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
u64 cs_extract = queue->iface.output->extract;
struct panthor_job *job;
- spin_lock(&queue->fence_ctx.lock);
+ guard(spinlock_irqsave)(&queue->fence_ctx.lock);
list_for_each_entry(job, &queue->fence_ctx.in_flight_jobs, node) {
if (cs_extract >= job->ringbuf.end)
continue;
@@ -1552,7 +1542,6 @@ cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
dma_fence_set_error(job->done_fence, -EINVAL);
}
- spin_unlock(&queue->fence_ctx.lock);
}
if (group) {
@@ -1682,7 +1671,7 @@ cs_slot_process_tiler_oom_event_locked(struct panthor_device *ptdev,
struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
struct panthor_group *group = csg_slot->group;
- lockdep_assert_held(&sched->lock);
+ lockdep_assert_held(&sched->events_lock);
if (drm_WARN_ON(&ptdev->base, !group))
return;
@@ -1703,7 +1692,7 @@ static bool cs_slot_process_irq_locked(struct panthor_device *ptdev,
struct panthor_fw_cs_iface *cs_iface;
u32 req, ack, events;
- lockdep_assert_held(&ptdev->scheduler->lock);
+ lockdep_assert_held(&ptdev->scheduler->events_lock);
cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
req = cs_iface->input->req;
@@ -1731,7 +1720,7 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
{
struct panthor_scheduler *sched = ptdev->scheduler;
- lockdep_assert_held(&sched->lock);
+ lockdep_assert_held(&sched->events_lock);
sched->might_have_idle_groups = true;
@@ -1742,16 +1731,102 @@ static void csg_slot_process_idle_event_locked(struct panthor_device *ptdev, u32
sched_queue_delayed_work(sched, tick, 0);
}
+static void update_fdinfo_stats(struct panthor_job *job)
+{
+ struct panthor_group *group = job->group;
+ struct panthor_queue *queue = group->queues[job->queue_idx];
+ struct panthor_gpu_usage *fdinfo = &group->fdinfo.data;
+ struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
+ struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
+
+ scoped_guard(spinlock_irqsave, &group->fdinfo.lock) {
+ if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
+ fdinfo->cycles += data->cycles.after - data->cycles.before;
+ if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
+ fdinfo->time += data->time.after - data->time.before;
+ }
+}
+
+static bool queue_check_job_completion(struct panthor_queue *queue)
+{
+ struct panthor_syncobj_64b *syncobj = NULL;
+ struct panthor_job *job, *job_tmp;
+ bool cookie, progress = false;
+ LIST_HEAD(done_jobs);
+
+ cookie = dma_fence_begin_signalling();
+ scoped_guard(spinlock_irqsave, &queue->fence_ctx.lock) {
+ list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
+ if (!syncobj) {
+ struct panthor_group *group = job->group;
+
+ syncobj = group->syncobjs->kmap +
+ (job->queue_idx * sizeof(*syncobj));
+ }
+
+ if (syncobj->seqno < job->done_fence->seqno)
+ break;
+
+ list_move_tail(&job->node, &done_jobs);
+ dma_fence_signal_locked(job->done_fence);
+ }
+
+ if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
+ /* If we have no job left, we cancel the timer, and reset remaining
+ * time to its default so it can be restarted next time
+ * queue_resume_timeout() is called.
+ */
+ queue_suspend_timeout_locked(queue);
+
+ /* If there's no job pending, we consider it progress to avoid a
+ * spurious timeout if the timeout handler and the sync update
+ * handler raced.
+ */
+ progress = true;
+ } else if (!list_empty(&done_jobs)) {
+ queue_reset_timeout_locked(queue);
+ progress = true;
+ }
+ }
+ dma_fence_end_signalling(cookie);
+
+ list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
+ if (job->profiling.mask)
+ update_fdinfo_stats(job);
+ list_del_init(&job->node);
+ panthor_job_put(&job->base);
+ }
+
+ return progress;
+}
+
+static void group_check_job_completion(struct panthor_group *group)
+{
+ bool cookie;
+ u32 queue_idx;
+
+ cookie = dma_fence_begin_signalling();
+ for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
+ struct panthor_queue *queue = group->queues[queue_idx];
+
+ if (!queue)
+ continue;
+
+ queue_check_job_completion(queue);
+ }
+ dma_fence_end_signalling(cookie);
+}
+
static void csg_slot_sync_update_locked(struct panthor_device *ptdev,
u32 csg_id)
{
struct panthor_csg_slot *csg_slot = &ptdev->scheduler->csg_slots[csg_id];
struct panthor_group *group = csg_slot->group;
- lockdep_assert_held(&ptdev->scheduler->lock);
+ lockdep_assert_held(&ptdev->scheduler->events_lock);
if (group)
- group_queue_work(group, sync_upd);
+ group_check_job_completion(group);
sched_queue_work(ptdev->scheduler, sync_upd);
}
@@ -1763,7 +1838,7 @@ csg_slot_process_progress_timer_event_locked(struct panthor_device *ptdev, u32 c
struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
struct panthor_group *group = csg_slot->group;
- lockdep_assert_held(&sched->lock);
+ lockdep_assert_held(&sched->events_lock);
group = csg_slot->group;
if (!drm_WARN_ON(&ptdev->base, !group)) {
@@ -1784,7 +1859,7 @@ static void sched_process_csg_irq_locked(struct panthor_device *ptdev, u32 csg_i
struct panthor_fw_csg_iface *csg_iface;
u32 ring_cs_db_mask = 0;
- lockdep_assert_held(&ptdev->scheduler->lock);
+ lockdep_assert_held(&ptdev->scheduler->events_lock);
if (drm_WARN_ON(&ptdev->base, csg_id >= ptdev->scheduler->csg_slot_count))
return;
@@ -1842,7 +1917,7 @@ static void sched_process_idle_event_locked(struct panthor_device *ptdev)
{
struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
- lockdep_assert_held(&ptdev->scheduler->lock);
+ lockdep_assert_held(&ptdev->scheduler->events_lock);
/* Acknowledge the idle event and schedule a tick. */
panthor_fw_update_reqs(glb_iface, req, glb_iface->output->ack, GLB_IDLE);
@@ -1858,7 +1933,7 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
u32 req, ack, evts;
- lockdep_assert_held(&ptdev->scheduler->lock);
+ lockdep_assert_held(&ptdev->scheduler->events_lock);
req = READ_ONCE(glb_iface->input->req);
ack = READ_ONCE(glb_iface->output->ack);
@@ -1868,30 +1943,6 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
sched_process_idle_event_locked(ptdev);
}
-static void process_fw_events_work(struct work_struct *work)
-{
- struct panthor_scheduler *sched = container_of(work, struct panthor_scheduler,
- fw_events_work);
- u32 events = atomic_xchg(&sched->fw_events, 0);
- struct panthor_device *ptdev = sched->ptdev;
-
- mutex_lock(&sched->lock);
-
- if (events & JOB_INT_GLOBAL_IF) {
- sched_process_global_irq_locked(ptdev);
- events &= ~JOB_INT_GLOBAL_IF;
- }
-
- while (events) {
- u32 csg_id = ffs(events) - 1;
-
- sched_process_csg_irq_locked(ptdev, csg_id);
- events &= ~BIT(csg_id);
- }
-
- mutex_unlock(&sched->lock);
-}
-
/**
* panthor_sched_report_fw_events() - Report FW events to the scheduler.
* @ptdev: Device.
@@ -1902,8 +1953,19 @@ void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
if (!ptdev->scheduler)
return;
- atomic_or(events, &ptdev->scheduler->fw_events);
- sched_queue_work(ptdev->scheduler, fw_events);
+ guard(spinlock_irqsave)(&ptdev->scheduler->events_lock);
+
+ if (events & JOB_INT_GLOBAL_IF) {
+ sched_process_global_irq_locked(ptdev);
+ events &= ~JOB_INT_GLOBAL_IF;
+ }
+
+ while (events) {
+ u32 csg_id = ffs(events) - 1;
+
+ sched_process_csg_irq_locked(ptdev, csg_id);
+ events &= ~BIT(csg_id);
+ }
}
static const char *fence_get_driver_name(struct dma_fence *fence)
@@ -2136,7 +2198,9 @@ tick_ctx_init(struct panthor_scheduler *sched,
* CSG IRQs, so we can flag the faulty queue.
*/
if (panthor_vm_has_unhandled_faults(group->vm)) {
- sched_process_csg_irq_locked(ptdev, i);
+ scoped_guard(spinlock_irqsave, &sched->events_lock) {
+ sched_process_csg_irq_locked(ptdev, i);
+ }
/* No fatal fault reported, flag all queues as faulty. */
if (!group->fatal_queues)
@@ -2183,13 +2247,13 @@ group_term_post_processing(struct panthor_group *group)
if (!queue)
continue;
- spin_lock(&queue->fence_ctx.lock);
- list_for_each_entry_safe(job, tmp, &queue->fence_ctx.in_flight_jobs, node) {
- list_move_tail(&job->node, &faulty_jobs);
- dma_fence_set_error(job->done_fence, err);
- dma_fence_signal_locked(job->done_fence);
+ scoped_guard(spinlock_irqsave, &queue->fence_ctx.lock) {
+ list_for_each_entry_safe(job, tmp, &queue->fence_ctx.in_flight_jobs, node) {
+ list_move_tail(&job->node, &faulty_jobs);
+ dma_fence_set_error(job->done_fence, err);
+ dma_fence_signal_locked(job->done_fence);
+ }
}
- spin_unlock(&queue->fence_ctx.lock);
/* Manually update the syncobj seqno to unblock waiters. */
syncobj = group->syncobjs->kmap + (i * sizeof(*syncobj));
@@ -2336,8 +2400,10 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
* any pending interrupts before we start the new
* group.
*/
- if (group->csg_id >= 0)
+ if (group->csg_id >= 0) {
+ guard(spinlock_irqsave)(&sched->events_lock);
sched_process_csg_irq_locked(ptdev, group->csg_id);
+ }
group_unbind_locked(group);
}
@@ -2902,10 +2968,12 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
u32 csg_id = ffs(slot_mask) - 1;
struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
- if (flush_caches_failed)
+ if (flush_caches_failed) {
csg_slot->group->state = PANTHOR_CS_GROUP_TERMINATED;
- else
+ } else {
+ guard(spinlock_irqsave)(&sched->events_lock);
csg_slot_sync_update_locked(ptdev, csg_id);
+ }
slot_mask &= ~BIT(csg_id);
}
@@ -2920,8 +2988,10 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
group_get(group);
- if (group->csg_id >= 0)
+ if (group->csg_id >= 0) {
+ guard(spinlock_irqsave)(&sched->events_lock);
sched_process_csg_irq_locked(ptdev, group->csg_id);
+ }
group_unbind_locked(group);
@@ -3005,22 +3075,6 @@ void panthor_sched_post_reset(struct panthor_device *ptdev, bool reset_failed)
}
}
-static void update_fdinfo_stats(struct panthor_job *job)
-{
- struct panthor_group *group = job->group;
- struct panthor_queue *queue = group->queues[job->queue_idx];
- struct panthor_gpu_usage *fdinfo = &group->fdinfo.data;
- struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
- struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
-
- scoped_guard(spinlock, &group->fdinfo.lock) {
- if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
- fdinfo->cycles += data->cycles.after - data->cycles.before;
- if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
- fdinfo->time += data->time.after - data->time.before;
- }
-}
-
void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
{
struct panthor_group_pool *gpool = pfile->groups;
@@ -3032,7 +3086,7 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
xa_lock(&gpool->xa);
xa_for_each_marked(&gpool->xa, i, group, GROUP_REGISTERED) {
- guard(spinlock)(&group->fdinfo.lock);
+ guard(spinlock_irqsave)(&group->fdinfo.lock);
pfile->stats.cycles += group->fdinfo.data.cycles;
pfile->stats.time += group->fdinfo.data.time;
group->fdinfo.data.cycles = 0;
@@ -3041,80 +3095,6 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
xa_unlock(&gpool->xa);
}
-static bool queue_check_job_completion(struct panthor_queue *queue)
-{
- struct panthor_syncobj_64b *syncobj = NULL;
- struct panthor_job *job, *job_tmp;
- bool cookie, progress = false;
- LIST_HEAD(done_jobs);
-
- cookie = dma_fence_begin_signalling();
- spin_lock(&queue->fence_ctx.lock);
- list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
- if (!syncobj) {
- struct panthor_group *group = job->group;
-
- syncobj = group->syncobjs->kmap +
- (job->queue_idx * sizeof(*syncobj));
- }
-
- if (syncobj->seqno < job->done_fence->seqno)
- break;
-
- list_move_tail(&job->node, &done_jobs);
- dma_fence_signal_locked(job->done_fence);
- }
-
- if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
- /* If we have no job left, we cancel the timer, and reset remaining
- * time to its default so it can be restarted next time
- * queue_resume_timeout() is called.
- */
- queue_suspend_timeout_locked(queue);
-
- /* If there's no job pending, we consider it progress to avoid a
- * spurious timeout if the timeout handler and the sync update
- * handler raced.
- */
- progress = true;
- } else if (!list_empty(&done_jobs)) {
- queue_reset_timeout_locked(queue);
- progress = true;
- }
- spin_unlock(&queue->fence_ctx.lock);
- dma_fence_end_signalling(cookie);
-
- list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
- if (job->profiling.mask)
- update_fdinfo_stats(job);
- list_del_init(&job->node);
- panthor_job_put(&job->base);
- }
-
- return progress;
-}
-
-static void group_sync_upd_work(struct work_struct *work)
-{
- struct panthor_group *group =
- container_of(work, struct panthor_group, sync_upd_work);
- u32 queue_idx;
- bool cookie;
-
- cookie = dma_fence_begin_signalling();
- for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
- struct panthor_queue *queue = group->queues[queue_idx];
-
- if (!queue)
- continue;
-
- queue_check_job_completion(queue);
- }
- dma_fence_end_signalling(cookie);
-
- group_put(group);
-}
-
struct panthor_job_ringbuf_instrs {
u64 buffer[MAX_INSTRS_PER_JOB];
u32 count;
@@ -3346,9 +3326,8 @@ queue_run_job(struct drm_sched_job *sched_job)
job->ringbuf.end = job->ringbuf.start + (instrs.count * sizeof(u64));
panthor_job_get(&job->base);
- spin_lock(&queue->fence_ctx.lock);
- list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs);
- spin_unlock(&queue->fence_ctx.lock);
+ scoped_guard(spinlock_irqsave, &queue->fence_ctx.lock)
+ list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs);
/* Make sure the ring buffer is updated before the INSERT
* register.
@@ -3683,7 +3662,6 @@ int panthor_group_create(struct panthor_file *pfile,
INIT_LIST_HEAD(&group->wait_node);
INIT_LIST_HEAD(&group->run_node);
INIT_WORK(&group->term_work, group_term_work);
- INIT_WORK(&group->sync_upd_work, group_sync_upd_work);
INIT_WORK(&group->tiler_oom_work, group_tiler_oom_work);
INIT_WORK(&group->release_work, group_release_work);
@@ -4054,7 +4032,6 @@ void panthor_sched_unplug(struct panthor_device *ptdev)
struct panthor_scheduler *sched = ptdev->scheduler;
disable_delayed_work_sync(&sched->tick_work);
- disable_work_sync(&sched->fw_events_work);
disable_work_sync(&sched->sync_upd_work);
mutex_lock(&sched->lock);
@@ -4139,7 +4116,8 @@ int panthor_sched_init(struct panthor_device *ptdev)
sched->tick_period = msecs_to_jiffies(10);
INIT_DELAYED_WORK(&sched->tick_work, tick_work);
INIT_WORK(&sched->sync_upd_work, sync_upd_work);
- INIT_WORK(&sched->fw_events_work, process_fw_events_work);
+
+ spin_lock_init(&sched->events_lock);
ret = drmm_mutex_init(&ptdev->base, &sched->lock);
if (ret)
--
2.54.0
* [PATCH v2 07/11] drm/panthor: Automate CSG IRQ processing at group unbind time
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
` (5 preceding siblings ...)
2026-05-12 11:37 ` [PATCH v2 06/11] drm/panthor: Prepare the scheduler logic for FW events in " Boris Brezillon
@ 2026-05-12 11:37 ` Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 08/11] drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks() Boris Brezillon
` (3 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:37 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Make the sched_process_csg_irq_locked() call part of
group_unbind_locked() so we don't have to manually call it in
tick_ctx_apply()/panthor_sched_suspend().
This implies moving group_[un]bind_locked() around to avoid a
forward declaration.
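Condensed, the unbind path now drains pending CSG IRQs itself before
returning the slot; a sketch of the relevant hunk (see the full diff
below):

	scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
		/* Process all pending IRQs before returning the slot. */
		sched_process_csg_irq_locked(ptdev, group->csg_id);
		ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
		group->csg_id = -1;
	}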
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_sched.c | 176 +++++++++++++++-----------------
1 file changed, 82 insertions(+), 94 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index fbf76b59b7ef..6c5ba747ae45 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -982,86 +982,6 @@ group_get(struct panthor_group *group)
return group;
}
-/**
- * group_bind_locked() - Bind a group to a group slot
- * @group: Group.
- * @csg_id: Slot.
- *
- * Return: 0 on success, a negative error code otherwise.
- */
-static int
-group_bind_locked(struct panthor_group *group, u32 csg_id)
-{
- struct panthor_device *ptdev = group->ptdev;
- int ret;
-
- lockdep_assert_held(&ptdev->scheduler->lock);
-
- if (drm_WARN_ON(&ptdev->base, group->csg_id != -1 || csg_id >= MAX_CSGS ||
- ptdev->scheduler->csg_slots[csg_id].group))
- return -EINVAL;
-
- ret = panthor_vm_active(group->vm);
- if (ret)
- return ret;
-
- group_get(group);
-
- /* Dummy doorbell allocation: doorbell is assigned to the group and
- * all queues use the same doorbell.
- *
- * TODO: Implement LRU-based doorbell assignment, so the most often
- * updated queues get their own doorbell, thus avoiding useless checks
- * on queues belonging to the same group that are rarely updated.
- */
- for (u32 i = 0; i < group->queue_count; i++)
- group->queues[i]->doorbell_id = csg_id + 1;
-
- scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
- ptdev->scheduler->csg_slots[csg_id].group = group;
- group->csg_id = csg_id;
- }
-
- return 0;
-}
-
-/**
- * group_unbind_locked() - Unbind a group from a slot.
- * @group: Group to unbind.
- *
- * Return: 0 on success, a negative error code otherwise.
- */
-static int
-group_unbind_locked(struct panthor_group *group)
-{
- struct panthor_device *ptdev = group->ptdev;
-
- lockdep_assert_held(&ptdev->scheduler->lock);
-
- if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >= MAX_CSGS))
- return -EINVAL;
-
- if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
- return -EINVAL;
-
- scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
- ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
- group->csg_id = -1;
- }
-
- panthor_vm_idle(group->vm);
-
- /* Tiler OOM events will be re-issued next time the group is scheduled. */
- atomic_set(&group->tiler_oom, 0);
- cancel_work(&group->tiler_oom_work);
-
- for (u32 i = 0; i < group->queue_count; i++)
- group->queues[i]->doorbell_id = -1;
-
- group_put(group);
- return 0;
-}
-
static bool
group_is_idle(struct panthor_group *group)
{
@@ -1968,6 +1888,88 @@ void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events)
}
}
+/**
+ * group_bind_locked() - Bind a group to a group slot
+ * @group: Group.
+ * @csg_id: Slot.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+static int
+group_bind_locked(struct panthor_group *group, u32 csg_id)
+{
+ struct panthor_device *ptdev = group->ptdev;
+ int ret;
+
+ lockdep_assert_held(&ptdev->scheduler->lock);
+
+ if (drm_WARN_ON(&ptdev->base, group->csg_id != -1 || csg_id >= MAX_CSGS ||
+ ptdev->scheduler->csg_slots[csg_id].group))
+ return -EINVAL;
+
+ ret = panthor_vm_active(group->vm);
+ if (ret)
+ return ret;
+
+ group_get(group);
+
+ /* Dummy doorbell allocation: doorbell is assigned to the group and
+ * all queues use the same doorbell.
+ *
+ * TODO: Implement LRU-based doorbell assignment, so the most often
+ * updated queues get their own doorbell, thus avoiding useless checks
+ * on queues belonging to the same group that are rarely updated.
+ */
+ for (u32 i = 0; i < group->queue_count; i++)
+ group->queues[i]->doorbell_id = csg_id + 1;
+
+ scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
+ ptdev->scheduler->csg_slots[csg_id].group = group;
+ group->csg_id = csg_id;
+ }
+
+ return 0;
+}
+
+/**
+ * group_unbind_locked() - Unbind a group from a slot.
+ * @group: Group to unbind.
+ *
+ * Return: 0 on success, a negative error code otherwise.
+ */
+static int
+group_unbind_locked(struct panthor_group *group)
+{
+ struct panthor_device *ptdev = group->ptdev;
+
+ lockdep_assert_held(&ptdev->scheduler->lock);
+
+ if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >= MAX_CSGS))
+ return -EINVAL;
+
+ if (drm_WARN_ON(&ptdev->base, group->state == PANTHOR_CS_GROUP_ACTIVE))
+ return -EINVAL;
+
+ scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) {
+ /* Process all pending IRQs before returning the slot. */
+ sched_process_csg_irq_locked(ptdev, group->csg_id);
+ ptdev->scheduler->csg_slots[group->csg_id].group = NULL;
+ group->csg_id = -1;
+ }
+
+ panthor_vm_idle(group->vm);
+
+ /* Tiler OOM events will be re-issued next time the group is scheduled. */
+ atomic_set(&group->tiler_oom, 0);
+ cancel_work(&group->tiler_oom_work);
+
+ for (u32 i = 0; i < group->queue_count; i++)
+ group->queues[i]->doorbell_id = -1;
+
+ group_put(group);
+ return 0;
+}
+
static const char *fence_get_driver_name(struct dma_fence *fence)
{
return "panthor";
@@ -2396,15 +2398,6 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
/* Unbind evicted groups. */
for (prio = PANTHOR_CSG_PRIORITY_COUNT - 1; prio >= 0; prio--) {
list_for_each_entry(group, &ctx->old_groups[prio], run_node) {
- /* This group is gone. Process interrupts to clear
- * any pending interrupts before we start the new
- * group.
- */
- if (group->csg_id >= 0) {
- guard(spinlock_irqsave)(&sched->events_lock);
- sched_process_csg_irq_locked(ptdev, group->csg_id);
- }
-
group_unbind_locked(group);
}
}
@@ -2988,11 +2981,6 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
group_get(group);
- if (group->csg_id >= 0) {
- guard(spinlock_irqsave)(&sched->events_lock);
- sched_process_csg_irq_locked(ptdev, group->csg_id);
- }
-
group_unbind_locked(group);
drm_WARN_ON(&group->ptdev->base, !list_empty(&group->run_node));
--
2.54.0
* [PATCH v2 08/11] drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks()
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
` (6 preceding siblings ...)
2026-05-12 11:37 ` [PATCH v2 07/11] drm/panthor: Automate CSG IRQ processing at group unbind time Boris Brezillon
@ 2026-05-12 11:37 ` Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 09/11] drm/panthor: Process FW events in IRQ context Boris Brezillon
` (2 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:37 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Rather than assuming an interrupt is always expected for request
acks, temporarily enable the relevant interrupts only when the polling
wait fails. This should reduce the number of interrupts the CPU has to
process.
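The resulting slow path looks roughly like this (condensed from the
panthor_fw_wait_acks() change below; the initial polling wait is
omitted):

	/* Polling timed out: enable the ack IRQs for the bits we wait on. */
	scoped_guard(spinlock_irqsave, lock)
		*ack_irq_mask_ptr |= req_mask;

	ret = wait_event_timeout(*wq, (READ_ONCE(*ack_ptr) & req_mask) == req,
				 msecs_to_jiffies(timeout_ms));

	/* Mask them again so acks stay silent on the polling fast path. */
	scoped_guard(spinlock_irqsave, lock)
		*ack_irq_mask_ptr &= ~req_mask;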
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_fw.c | 34 +++++++++++++++++++--------------
drivers/gpu/drm/panthor/panthor_sched.c | 5 +++--
2 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 8239a6951569..f5e0ceca4130 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1039,16 +1039,10 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
glb_iface->input->progress_timer = PROGRESS_TIMEOUT_CYCLES >> PROGRESS_TIMEOUT_SCALE_SHIFT;
glb_iface->input->idle_timer = panthor_fw_conv_timeout(ptdev, IDLE_HYSTERESIS_US);
- /* Enable interrupts we care about. */
- glb_iface->input->ack_irq_mask = GLB_CFG_ALLOC_EN |
- GLB_PING |
- GLB_CFG_PROGRESS_TIMER |
- GLB_CFG_POWEROFF_TIMER |
- GLB_IDLE_EN |
- GLB_IDLE;
-
- if (panthor_fw_has_glb_state(ptdev))
- glb_iface->input->ack_irq_mask |= GLB_STATE_MASK;
+ /* Enable interrupts for asynchronous events that are not
+ * triggered by request acks.
+ */
+ glb_iface->input->ack_irq_mask = GLB_IDLE;
panthor_fw_update_reqs(glb_iface, req, GLB_IDLE_EN | GLB_COUNTER_EN,
GLB_IDLE_EN | GLB_COUNTER_EN);
@@ -1318,8 +1312,8 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
* Return: 0 on success, -ETIMEDOUT otherwise.
*/
static int panthor_fw_wait_acks(const u32 *req_ptr, const u32 *ack_ptr,
- wait_queue_head_t *wq,
- u32 req_mask, u32 *acked,
+ u32 *ack_irq_mask_ptr, spinlock_t *lock,
+ wait_queue_head_t *wq, u32 req_mask, u32 *acked,
u32 timeout_ms)
{
u32 ack, req = READ_ONCE(*req_ptr) & req_mask;
@@ -1334,8 +1328,16 @@ static int panthor_fw_wait_acks(const u32 *req_ptr, const u32 *ack_ptr,
if (!ret)
return 0;
- if (wait_event_timeout(*wq, (READ_ONCE(*ack_ptr) & req_mask) == req,
- msecs_to_jiffies(timeout_ms)))
+ scoped_guard(spinlock_irqsave, lock)
+ *ack_irq_mask_ptr |= req_mask;
+
+ ret = wait_event_timeout(*wq, (READ_ONCE(*ack_ptr) & req_mask) == req,
+ msecs_to_jiffies(timeout_ms));
+
+ scoped_guard(spinlock_irqsave, lock)
+ *ack_irq_mask_ptr &= ~req_mask;
+
+ if (ret)
return 0;
/* Check one last time, in case we were not woken up for some reason. */
@@ -1369,6 +1371,8 @@ int panthor_fw_glb_wait_acks(struct panthor_device *ptdev,
return panthor_fw_wait_acks(&glb_iface->input->req,
&glb_iface->output->ack,
+ &glb_iface->input->ack_irq_mask,
+ &glb_iface->lock,
&ptdev->fw->req_waitqueue,
req_mask, acked, timeout_ms);
}
@@ -1395,6 +1399,8 @@ int panthor_fw_csg_wait_acks(struct panthor_device *ptdev, u32 csg_slot,
ret = panthor_fw_wait_acks(&csg_iface->input->req,
&csg_iface->output->ack,
+ &csg_iface->input->ack_irq_mask,
+ &csg_iface->lock,
&ptdev->fw->req_waitqueue,
req_mask, acked, timeout_ms);
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 6c5ba747ae45..a9124bcc7de6 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -1110,7 +1110,7 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
cs_iface->input->ringbuf_output = queue->iface.output_fw_va;
cs_iface->input->config = CS_CONFIG_PRIORITY(queue->priority) |
CS_CONFIG_DOORBELL(queue->doorbell_id);
- cs_iface->input->ack_irq_mask = ~0;
+ cs_iface->input->ack_irq_mask = CS_FATAL | CS_FAULT | CS_TILER_OOM;
panthor_fw_update_reqs(cs_iface, req,
CS_IDLE_SYNC_WAIT |
CS_IDLE_EMPTY |
@@ -1378,7 +1378,8 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
csg_iface->input->protm_suspend_buf = 0;
}
- csg_iface->input->ack_irq_mask = ~0;
+ csg_iface->input->ack_irq_mask = CSG_SYNC_UPDATE | CSG_IDLE |
+ CSG_PROGRESS_TIMER_EVENT;
panthor_fw_toggle_reqs(csg_iface, doorbell_req, doorbell_ack, queue_mask);
return 0;
}
--
2.54.0
* [PATCH v2 09/11] drm/panthor: Process FW events in IRQ context
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
` (7 preceding siblings ...)
2026-05-12 11:37 ` [PATCH v2 08/11] drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks() Boris Brezillon
@ 2026-05-12 11:37 ` Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 10/11] drm/panthor: Use the irqsave variant of spin_lock in panthor_gpu_irq_handler() Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context Boris Brezillon
10 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:37 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
Now that everything is in place to process FW events in IRQ context,
flip the switch. This should reduce the dma_fence signalling latency.
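The hard handler keeps the usual panthor_irq state dance around the
actual event processing; roughly (see the diff below):

	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
		if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
			return IRQ_NONE;

		pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
	}

	/* IRQs are not masked, so INT_STAT reflects pending events. */
	panthor_job_irq_handler(pirq, gpu_read(pirq->iomem, INT_STAT));

	scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
		if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING)
			pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
	}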
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_fw.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index f5e0ceca4130..8cfebf180de7 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1087,9 +1087,29 @@ static void panthor_job_irq_handler(struct panthor_irq *pirq, u32 status)
}
}
-static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data)
+static irqreturn_t panthor_job_irq_raw_handler(int irq, void *data)
{
- return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler);
+ struct panthor_irq *pirq = data;
+
+ if (!gpu_read(pirq->iomem, INT_STAT))
+ return IRQ_NONE;
+
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
+ if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
+ return IRQ_NONE;
+
+ pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
+ }
+
+ /* We can use INT_STAT here, because we didn't mask the IRQs. */
+ panthor_job_irq_handler(pirq, gpu_read(pirq->iomem, INT_STAT));
+
+ scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
+ if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING)
+ pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
+ }
+
+ return IRQ_HANDLED;
}
static int panthor_fw_start(struct panthor_device *ptdev)
@@ -1489,8 +1509,7 @@ int panthor_fw_init(struct panthor_device *ptdev)
ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
ptdev->iomem + JOB_INT_BASE, "job",
- panthor_irq_default_raw_handler,
- panthor_job_irq_threaded_handler);
+ panthor_job_irq_raw_handler, NULL);
if (ret) {
drm_err(&ptdev->base, "failed to request job irq");
return ret;
--
2.54.0
* [PATCH v2 10/11] drm/panthor: Use the irqsave variant of spin_lock in panthor_gpu_irq_handler()
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
` (8 preceding siblings ...)
2026-05-12 11:37 ` [PATCH v2 09/11] drm/panthor: Process FW events in IRQ context Boris Brezillon
@ 2026-05-12 11:37 ` Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context Boris Brezillon
10 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:37 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
This is not a bug per se, because this lock is never taken in
interrupt context, but it is inconsistent with the other users of this
lock. We're also planning to transition GPU event processing to a hard
handler. That alone wouldn't justify the IRQ-safe variant either,
since the lock/unlock sequence would then sit in the hard-IRQ path,
where IRQs are already disabled, but let's do it anyway to keep things
consistent.
While at it, switch to a guard() instead of a plain lock/unlock
sequence.
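For reference, the kernel's scope-based guard() helper binds the lock
to the enclosing scope, so the unlock/irqrestore happens automatically
on every return path:

	guard(spinlock_irqsave)(&ptdev->gpu->reqs_lock);
	if (status & ptdev->gpu->pending_reqs) {
		ptdev->gpu->pending_reqs &= ~status;
		wake_up_all(&ptdev->gpu->reqs_acked);
	}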
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_gpu.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index d0be758ea3e1..b9c51f8a051d 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -110,12 +110,11 @@ static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
if (status & GPU_IRQ_PROTM_FAULT)
drm_warn(&ptdev->base, "GPU Fault in protected mode\n");
- spin_lock(&ptdev->gpu->reqs_lock);
+ guard(spinlock_irqsave)(&ptdev->gpu->reqs_lock);
if (status & ptdev->gpu->pending_reqs) {
ptdev->gpu->pending_reqs &= ~status;
wake_up_all(&ptdev->gpu->reqs_acked);
}
- spin_unlock(&ptdev->gpu->reqs_lock);
}
static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
--
2.54.0
* [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
` (9 preceding siblings ...)
2026-05-12 11:37 ` [PATCH v2 10/11] drm/panthor: Use the irqsave variant of spin_lock in panthor_gpu_irq_handler() Boris Brezillon
@ 2026-05-12 11:37 ` Boris Brezillon
2026-05-12 11:50 ` Boris Brezillon
10 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:37 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel, Boris Brezillon
The current panthor_gpu_irq_handler() logic is already IRQ-safe
(no sleeping, no sleepable locks, spinlocks taken with irqsave in
other contexts, etc.), so let's flip the switch and make it a hard IRQ
handler.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panthor/panthor_gpu.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index b9c51f8a051d..04c8f23baf3f 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -86,10 +86,15 @@ static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config);
}
-static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
+static irqreturn_t panthor_gpu_irq_raw_handler(int irq, void *data)
{
+ struct panthor_irq *pirq = data;
struct panthor_device *ptdev = pirq->ptdev;
struct panthor_gpu *gpu = ptdev->gpu;
+ u32 status = gpu_read(gpu->irq.iomem, INT_STAT);
+
+ if (!status)
+ return IRQ_NONE;
gpu_write(gpu->irq.iomem, INT_CLEAR, status);
@@ -115,11 +120,8 @@ static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
ptdev->gpu->pending_reqs &= ~status;
wake_up_all(&ptdev->gpu->reqs_acked);
}
-}
-static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
-{
- return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler);
+ return IRQ_HANDLED;
}
/**
@@ -176,8 +178,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
GPU_INTERRUPTS_MASK,
ptdev->iomem + GPU_INT_BASE, "gpu",
- panthor_irq_default_raw_handler,
- panthor_gpu_irq_threaded_handler);
+ panthor_gpu_irq_raw_handler, NULL);
if (ret)
return ret;
--
2.54.0
* Re: [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context
2026-05-12 11:37 ` [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context Boris Brezillon
@ 2026-05-12 11:50 ` Boris Brezillon
0 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-05-12 11:50 UTC (permalink / raw)
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, dri-devel, linux-kernel
On Tue, 12 May 2026 13:37:41 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:
> The current panthor_gpu_irq_handler() logic is already IRQ-safe
> (no sleeping, no sleepable locks, spinlocks taken with irqsave in
> other contexts, etc.), so let's flip the switch and make it a hard
> IRQ handler.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_gpu.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index b9c51f8a051d..04c8f23baf3f 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -86,10 +86,15 @@ static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
> gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config);
> }
>
> -static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
> +static irqreturn_t panthor_gpu_irq_raw_handler(int irq, void *data)
> {
> + struct panthor_irq *pirq = data;
> struct panthor_device *ptdev = pirq->ptdev;
> struct panthor_gpu *gpu = ptdev->gpu;
> + u32 status = gpu_read(gpu->irq.iomem, INT_STAT);
> +
> + if (!status)
> + return IRQ_NONE;
>
Forgot to add the pirq state transition here:
scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
return IRQ_NONE;
pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
}
> gpu_write(gpu->irq.iomem, INT_CLEAR, status);
>
> @@ -115,11 +120,8 @@ static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
> ptdev->gpu->pending_reqs &= ~status;
> wake_up_all(&ptdev->gpu->reqs_acked);
> }
> -}
>
> -static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
> -{
> - return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler);
and restore it here:
scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING)
pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
}
> + return IRQ_HANDLED;
> }
>
> /**
> @@ -176,8 +178,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
> ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
> GPU_INTERRUPTS_MASK,
> ptdev->iomem + GPU_INT_BASE, "gpu",
> - panthor_irq_default_raw_handler,
> - panthor_gpu_irq_threaded_handler);
> + panthor_gpu_irq_raw_handler, NULL);
> if (ret)
> return ret;
>
>
* Re: [PATCH v2 01/11] drm/panthor: Make panthor_irq::state a non-atomic field
2026-05-12 11:37 ` [PATCH v2 01/11] drm/panthor: Make panthor_irq::state a non-atomic field Boris Brezillon
@ 2026-05-12 18:40 ` Chia-I Wu
0 siblings, 0 replies; 18+ messages in thread
From: Chia-I Wu @ 2026-05-12 18:40 UTC (permalink / raw)
To: Boris Brezillon
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel,
linux-kernel
On Tue, May 12, 2026 at 4:44 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> The only place where panthor_irq::state is accessed without
> panthor_irq::mask_lock held is in the prologue of _irq_suspend(),
> which is not really a fast-path. So let's simplify things by assuming
> panthor_irq::state must always be accessed with the mask_lock held,
> and add a scoped_guard() in _irq_suspend().
>
> Reviewed-by: Steven Price <steven.price@arm.com>
> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 35 ++++++++++++++++----------------
> 1 file changed, 17 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4e4607bca7cc..3f91ba73829d 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -101,8 +101,12 @@ struct panthor_irq {
> */
> spinlock_t mask_lock;
optional nit: might want to rename mask_lock
>
> - /** @state: one of &enum panthor_irq_state reflecting the current state. */
> - atomic_t state;
> + /**
> + * @state: one of &enum panthor_irq_state reflecting the current state.
> + *
> + * Must be accessed with mask_lock held.
> + */
> + enum panthor_irq_state state;
> };
>
> /**
> @@ -510,18 +514,15 @@ const char *panthor_exception_name(struct panthor_device *ptdev,
> static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data) \
> { \
> struct panthor_irq *pirq = data; \
> - enum panthor_irq_state old_state; \
> \
> if (!gpu_read(pirq->iomem, INT_STAT)) \
> return IRQ_NONE; \
> \
> guard(spinlock_irqsave)(&pirq->mask_lock); \
> - old_state = atomic_cmpxchg(&pirq->state, \
> - PANTHOR_IRQ_STATE_ACTIVE, \
> - PANTHOR_IRQ_STATE_PROCESSING); \
> - if (old_state != PANTHOR_IRQ_STATE_ACTIVE) \
> + if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE) \
> return IRQ_NONE; \
> \
> + pirq->state = PANTHOR_IRQ_STATE_PROCESSING; \
> gpu_write(pirq->iomem, INT_MASK, 0); \
> return IRQ_WAKE_THREAD; \
> } \
> @@ -551,13 +552,10 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> } \
> \
> scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
> - enum panthor_irq_state old_state; \
> - \
> - old_state = atomic_cmpxchg(&pirq->state, \
> - PANTHOR_IRQ_STATE_PROCESSING, \
> - PANTHOR_IRQ_STATE_ACTIVE); \
> - if (old_state == PANTHOR_IRQ_STATE_PROCESSING) \
> + if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) { \
> + pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
> gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> + } \
> } \
> \
> return ret; \
> @@ -566,18 +564,19 @@ static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *da
> static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq) \
> { \
> scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
> - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING); \
> + pirq->state = PANTHOR_IRQ_STATE_SUSPENDING; \
> gpu_write(pirq->iomem, INT_MASK, 0); \
> } \
> synchronize_irq(pirq->irq); \
> - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED); \
> + scoped_guard(spinlock_irqsave, &pirq->mask_lock) \
> + pirq->state = PANTHOR_IRQ_STATE_SUSPENDED; \
> } \
> \
> static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq) \
> { \
> guard(spinlock_irqsave)(&pirq->mask_lock); \
> \
> - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE); \
> + pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
> gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \
> gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> } \
> @@ -610,7 +609,7 @@ static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *
> * on the PROCESSING -> ACTIVE transition. \
> * If the IRQ is suspended/suspending, the mask is restored at resume time. \
> */ \
> - if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE) \
> + if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
> gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> } \
> \
> @@ -624,7 +623,7 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
> * on the PROCESSING -> ACTIVE transition. \
> * If the IRQ is suspended/suspending, the mask is restored at resume time. \
> */ \
> - if (atomic_read(&pirq->state) == PANTHOR_IRQ_STATE_ACTIVE) \
> + if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
> gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> }
>
>
> --
> 2.54.0
>
* Re: [PATCH v2 02/11] drm/panthor: Move the register accessors before the IRQ helpers
2026-05-12 11:37 ` [PATCH v2 02/11] drm/panthor: Move the register accessors before the IRQ helpers Boris Brezillon
@ 2026-05-12 18:41 ` Chia-I Wu
0 siblings, 0 replies; 18+ messages in thread
From: Chia-I Wu @ 2026-05-12 18:41 UTC (permalink / raw)
To: Boris Brezillon
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel,
linux-kernel
On Tue, May 12, 2026 at 5:14 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> We're about to add an IRQ inline helper using gpu_read(). Move things
> around to avoid forward declarations.
>
> No functional changes.
>
> Reviewed-by: Steven Price <steven.price@arm.com>
> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
This can be dropped if patch 3 uses non-inline functions.
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 142 +++++++++++++++----------------
> 1 file changed, 71 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 3f91ba73829d..768fc1992368 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -495,6 +495,77 @@ panthor_exception_is_fault(u32 exception_code)
> const char *panthor_exception_name(struct panthor_device *ptdev,
> u32 exception_code);
>
> +static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
> +{
> + writel(data, iomem + reg);
> +}
> +
> +static inline u32 gpu_read(void __iomem *iomem, u32 reg)
> +{
> + return readl(iomem + reg);
> +}
> +
> +static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
> +{
> + return readl_relaxed(iomem + reg);
> +}
> +
> +static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
> +{
> + gpu_write(iomem, reg, lower_32_bits(data));
> + gpu_write(iomem, reg + 4, upper_32_bits(data));
> +}
> +
> +static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
> +{
> + return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
> +}
> +
> +static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
> +{
> + return (gpu_read_relaxed(iomem, reg) |
> + ((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
> +}
> +
> +static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
> +{
> + u32 lo, hi1, hi2;
> + do {
> + hi1 = gpu_read(iomem, reg + 4);
> + lo = gpu_read(iomem, reg);
> + hi2 = gpu_read(iomem, reg + 4);
> + } while (hi1 != hi2);
> + return lo | ((u64)hi2 << 32);
> +}
> +
> +#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
> + read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
> + iomem, reg)
> +
> +#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> + timeout_us) \
> + read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
> + false, iomem, reg)
> +
> +#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
> + read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
> + iomem, reg)
> +
> +#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> + timeout_us) \
> + read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
> + false, iomem, reg)
> +
> +#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> + timeout_us) \
> + read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \
> + timeout_us, false, iomem, reg)
> +
> +#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \
> + timeout_us) \
> + read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
> + false, iomem, reg)
> +
> #define INT_RAWSTAT 0x0
> #define INT_CLEAR 0x4
> #define INT_MASK 0x8
> @@ -629,75 +700,4 @@ static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq
>
> extern struct workqueue_struct *panthor_cleanup_wq;
>
> -static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data)
> -{
> - writel(data, iomem + reg);
> -}
> -
> -static inline u32 gpu_read(void __iomem *iomem, u32 reg)
> -{
> - return readl(iomem + reg);
> -}
> -
> -static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg)
> -{
> - return readl_relaxed(iomem + reg);
> -}
> -
> -static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data)
> -{
> - gpu_write(iomem, reg, lower_32_bits(data));
> - gpu_write(iomem, reg + 4, upper_32_bits(data));
> -}
> -
> -static inline u64 gpu_read64(void __iomem *iomem, u32 reg)
> -{
> - return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32));
> -}
> -
> -static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg)
> -{
> - return (gpu_read_relaxed(iomem, reg) |
> - ((u64)gpu_read_relaxed(iomem, reg + 4) << 32));
> -}
> -
> -static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
> -{
> - u32 lo, hi1, hi2;
> - do {
> - hi1 = gpu_read(iomem, reg + 4);
> - lo = gpu_read(iomem, reg);
> - hi2 = gpu_read(iomem, reg + 4);
> - } while (hi1 != hi2);
> - return lo | ((u64)hi2 << 32);
> -}
> -
> -#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
> - read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
> - iomem, reg)
> -
> -#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> - timeout_us) \
> - read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
> - false, iomem, reg)
> -
> -#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us) \
> - read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
> - iomem, reg)
> -
> -#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> - timeout_us) \
> - read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
> - false, iomem, reg)
> -
> -#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \
> - timeout_us) \
> - read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \
> - timeout_us, false, iomem, reg)
> -
> -#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \
> - timeout_us) \
> - read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
> - false, iomem, reg)
> -
> #endif
>
> --
> 2.54.0
>
* Re: [PATCH v2 03/11] drm/panthor: Replace the panthor_irq macro machinery by inline helpers
2026-05-12 11:37 ` [PATCH v2 03/11] drm/panthor: Replace the panthor_irq macro machinery by inline helpers Boris Brezillon
@ 2026-05-12 18:58 ` Chia-I Wu
0 siblings, 0 replies; 18+ messages in thread
From: Chia-I Wu @ 2026-05-12 18:58 UTC (permalink / raw)
To: Boris Brezillon
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel,
linux-kernel
On Tue, May 12, 2026 at 4:54 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> Now that panthor_irq contains the iomem region, there's no real need
> for the macro-based panthor_irq helper generation logic. We can just
> provide inline helpers that do the same and let the compiler optimize
> indirect function calls. The only extra annoyance is the fact we have
> to open-code the panthor_xxx_irq_threaded_handler() implementation, but
> those are single-line functions, so it's acceptable.
We might want to __always_inline panthor_irq_default_threaded_handler.
For the rest, do we want to un-inline them?
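Something like this, assuming the helper stays in the header (just a
hypothetical sketch):

	static __always_inline irqreturn_t
	panthor_irq_default_threaded_handler(void *data,
					     void (*slow_handler)(struct panthor_irq *, u32))
	{
		/* body unchanged */
	}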
>
> While at it, we changed the prototype of the IRQ handlers to take
> a panthor_irq instead of a panthor_device, since the panthor_irq is
> what actually gets passed around, and the panthor_device can be
> extracted from it.
>
> Reviewed-by: Steven Price <steven.price@arm.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 245 +++++++++++++++----------------
> drivers/gpu/drm/panthor/panthor_fw.c | 22 ++-
> drivers/gpu/drm/panthor/panthor_gpu.c | 26 ++--
> drivers/gpu/drm/panthor/panthor_mmu.c | 37 ++---
> drivers/gpu/drm/panthor/panthor_pwr.c | 20 ++-
> 5 files changed, 183 insertions(+), 167 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 768fc1992368..393fcda73d88 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -571,131 +571,126 @@ static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg)
> #define INT_MASK 0x8
> #define INT_STAT 0xc
>
> -/**
> - * PANTHOR_IRQ_HANDLER() - Define interrupt handlers and the interrupt
> - * registration function.
> - *
> - * The boiler-plate to gracefully deal with shared interrupts is
> - * auto-generated. All you have to do is call PANTHOR_IRQ_HANDLER()
> - * just after the actual handler. The handler prototype is:
> - *
> - * void (*handler)(struct panthor_device *, u32 status);
> - */
> -#define PANTHOR_IRQ_HANDLER(__name, __handler) \
> -static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *data) \
> -{ \
> - struct panthor_irq *pirq = data; \
> - \
> - if (!gpu_read(pirq->iomem, INT_STAT)) \
> - return IRQ_NONE; \
> - \
> - guard(spinlock_irqsave)(&pirq->mask_lock); \
> - if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE) \
> - return IRQ_NONE; \
> - \
> - pirq->state = PANTHOR_IRQ_STATE_PROCESSING; \
> - gpu_write(pirq->iomem, INT_MASK, 0); \
> - return IRQ_WAKE_THREAD; \
> -} \
> - \
> -static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, void *data) \
> -{ \
> - struct panthor_irq *pirq = data; \
> - struct panthor_device *ptdev = pirq->ptdev; \
> - irqreturn_t ret = IRQ_NONE; \
> - \
> - while (true) { \
> - /* It's safe to access pirq->mask without the lock held here. If a new \
> - * event gets added to the mask and the corresponding IRQ is pending, \
> - * we'll process it right away instead of adding an extra raw -> threaded \
> - * round trip. If an event is removed and the status bit is set, it will \
> - * be ignored, just like it would have been if the mask had been adjusted \
> - * right before the HW event kicks in. TLDR; it's all expected races we're \
> - * covered for. \
> - */ \
> - u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask; \
> - \
> - if (!status) \
> - break; \
> - \
> - __handler(ptdev, status); \
> - ret = IRQ_HANDLED; \
> - } \
> - \
> - scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
> - if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) { \
> - pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
> - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> - } \
> - } \
> - \
> - return ret; \
> -} \
> - \
> -static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *pirq) \
> -{ \
> - scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \
> - pirq->state = PANTHOR_IRQ_STATE_SUSPENDING; \
> - gpu_write(pirq->iomem, INT_MASK, 0); \
> - } \
> - synchronize_irq(pirq->irq); \
> - scoped_guard(spinlock_irqsave, &pirq->mask_lock) \
> - pirq->state = PANTHOR_IRQ_STATE_SUSPENDED; \
> -} \
> - \
> -static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *pirq) \
> -{ \
> - guard(spinlock_irqsave)(&pirq->mask_lock); \
> - \
> - pirq->state = PANTHOR_IRQ_STATE_ACTIVE; \
> - gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \
> - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> -} \
> - \
> -static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev, \
> - struct panthor_irq *pirq, \
> - int irq, u32 mask, void __iomem *iomem) \
> -{ \
> - pirq->ptdev = ptdev; \
> - pirq->irq = irq; \
> - pirq->mask = mask; \
> - pirq->iomem = iomem; \
> - spin_lock_init(&pirq->mask_lock); \
> - panthor_ ## __name ## _irq_resume(pirq); \
> - \
> - return devm_request_threaded_irq(ptdev->base.dev, irq, \
> - panthor_ ## __name ## _irq_raw_handler, \
> - panthor_ ## __name ## _irq_threaded_handler, \
> - IRQF_SHARED, KBUILD_MODNAME "-" # __name, \
> - pirq); \
> -} \
> - \
> -static inline void panthor_ ## __name ## _irq_enable_events(struct panthor_irq *pirq, u32 mask) \
> -{ \
> - guard(spinlock_irqsave)(&pirq->mask_lock); \
> - pirq->mask |= mask; \
> - \
> - /* The only situation where we need to write the new mask is if the IRQ is active. \
> - * If it's being processed, the mask will be restored for us in _irq_threaded_handler() \
> - * on the PROCESSING -> ACTIVE transition. \
> - * If the IRQ is suspended/suspending, the mask is restored at resume time. \
> - */ \
> - if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
> - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> -} \
> - \
> -static inline void panthor_ ## __name ## _irq_disable_events(struct panthor_irq *pirq, u32 mask)\
> -{ \
> - guard(spinlock_irqsave)(&pirq->mask_lock); \
> - pirq->mask &= ~mask; \
> - \
> - /* The only situation where we need to write the new mask is if the IRQ is active. \
> - * If it's being processed, the mask will be restored for us in _irq_threaded_handler() \
> - * on the PROCESSING -> ACTIVE transition. \
> - * If the IRQ is suspended/suspending, the mask is restored at resume time. \
> - */ \
> - if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE) \
> - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \
> +static inline irqreturn_t panthor_irq_default_raw_handler(int irq, void *data)
> +{
> + struct panthor_irq *pirq = data;
> +
> + if (!gpu_read(pirq->iomem, INT_STAT))
> + return IRQ_NONE;
> +
> + guard(spinlock_irqsave)(&pirq->mask_lock);
> + if (pirq->state != PANTHOR_IRQ_STATE_ACTIVE)
> + return IRQ_NONE;
> +
> + pirq->state = PANTHOR_IRQ_STATE_PROCESSING;
> + gpu_write(pirq->iomem, INT_MASK, 0);
> + return IRQ_WAKE_THREAD;
> +}
> +
> +static inline irqreturn_t
> +panthor_irq_default_threaded_handler(void *data,
> + void (*slow_handler)(struct panthor_irq *, u32))
> +{
> + struct panthor_irq *pirq = data;
> + irqreturn_t ret = IRQ_NONE;
> +
> + while (true) {
> + /* It's safe to access pirq->mask without the lock held here. If a new
> + * event gets added to the mask and the corresponding IRQ is pending,
> + * we'll process it right away instead of adding an extra raw -> threaded
> + * round trip. If an event is removed and the status bit is set, it will
> + * be ignored, just like it would have been if the mask had been adjusted
> + * right before the HW event kicks in. TLDR; it's all expected races we're
> + * covered for.
> + */
> + u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;
> +
> + if (!status)
> + break;
> +
> + slow_handler(pirq, status);
> + ret = IRQ_HANDLED;
> + }
> +
> + scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
> + if (pirq->state == PANTHOR_IRQ_STATE_PROCESSING) {
> + pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
> + gpu_write(pirq->iomem, INT_MASK, pirq->mask);
> + }
> + }
> +
> + return ret;
> +}
> +
> +static inline void panthor_irq_suspend(struct panthor_irq *pirq)
> +{
> + scoped_guard(spinlock_irqsave, &pirq->mask_lock) {
> + pirq->state = PANTHOR_IRQ_STATE_SUSPENDING;
> + gpu_write(pirq->iomem, INT_MASK, 0);
> + }
> + synchronize_irq(pirq->irq);
> + scoped_guard(spinlock_irqsave, &pirq->mask_lock)
> + pirq->state = PANTHOR_IRQ_STATE_SUSPENDED;
> +}
> +
> +static inline void panthor_irq_resume(struct panthor_irq *pirq)
> +{
> + guard(spinlock_irqsave)(&pirq->mask_lock);
> + pirq->state = PANTHOR_IRQ_STATE_ACTIVE;
> + gpu_write(pirq->iomem, INT_CLEAR, pirq->mask);
> + gpu_write(pirq->iomem, INT_MASK, pirq->mask);
> +}
> +
> +static inline void panthor_irq_enable_events(struct panthor_irq *pirq, u32 mask)
> +{
> + guard(spinlock_irqsave)(&pirq->mask_lock);
> + pirq->mask |= mask;
> +
> + /* The only situation where we need to write the new mask is if the IRQ is active.
> + * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
> + * on the PROCESSING -> ACTIVE transition.
> + * If the IRQ is suspended/suspending, the mask is restored at resume time.
> + */
> + if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
> + gpu_write(pirq->iomem, INT_MASK, pirq->mask);
> +}
> +
> +static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u32 mask)
> +{
> + guard(spinlock_irqsave)(&pirq->mask_lock);
> + pirq->mask &= ~mask;
> +
> + /* The only situation where we need to write the new mask is if the IRQ is active.
> + * If it's being processed, the mask will be restored for us in _irq_threaded_handler()
> + * on the PROCESSING -> ACTIVE transition.
> + * If the IRQ is suspended/suspending, the mask is restored at resume time.
> + */
> + if (pirq->state == PANTHOR_IRQ_STATE_ACTIVE)
> + gpu_write(pirq->iomem, INT_MASK, pirq->mask);
> +}
> +
> +static inline int
> +panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
> + int irq, u32 mask, void __iomem *iomem, const char *name,
> + irqreturn_t (*threaded_handler)(int, void *data))
> +{
> + const char *full_name;
> +
> + pirq->ptdev = ptdev;
> + pirq->irq = irq;
> + pirq->mask = mask;
> + pirq->iomem = iomem;
> + spin_lock_init(&pirq->mask_lock);
> +
> + full_name = devm_kasprintf(ptdev->base.dev, GFP_KERNEL, KBUILD_MODNAME "-%s", name);
> + if (!full_name)
> + return -ENOMEM;
> +
> + panthor_irq_resume(pirq);
> + return devm_request_threaded_irq(ptdev->base.dev, irq,
> + panthor_irq_default_raw_handler,
> + threaded_handler,
> + IRQF_SHARED, full_name, pirq);
> }
>
> extern struct workqueue_struct *panthor_cleanup_wq;
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 986151681b24..eaf599b0a887 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1064,8 +1064,9 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
> msecs_to_jiffies(PING_INTERVAL_MS));
> }
>
> -static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
> +static void panthor_job_irq_handler(struct panthor_irq *pirq, u32 status)
> {
> + struct panthor_device *ptdev = pirq->ptdev;
> u32 duration;
> u64 start = 0;
>
> @@ -1091,7 +1092,11 @@ static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 status)
> trace_gpu_job_irq(ptdev->base.dev, status, duration);
> }
> }
> -PANTHOR_IRQ_HANDLER(job, panthor_job_irq_handler);
> +
> +static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data)
> +{
> + return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler);
> +}
>
> static int panthor_fw_start(struct panthor_device *ptdev)
> {
> @@ -1099,8 +1104,8 @@ static int panthor_fw_start(struct panthor_device *ptdev)
> bool timedout = false;
>
> ptdev->fw->booted = false;
> - panthor_job_irq_enable_events(&ptdev->fw->irq, ~0);
> - panthor_job_irq_resume(&ptdev->fw->irq);
> + panthor_irq_enable_events(&ptdev->fw->irq, ~0);
> + panthor_irq_resume(&ptdev->fw->irq);
> gpu_write(fw->iomem, MCU_CONTROL, MCU_CONTROL_AUTO);
>
> if (!wait_event_timeout(ptdev->fw->req_waitqueue,
> @@ -1210,7 +1215,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
> ptdev->reset.fast = true;
> }
>
> - panthor_job_irq_suspend(&ptdev->fw->irq);
> + panthor_irq_suspend(&ptdev->fw->irq);
> panthor_fw_stop(ptdev);
> }
>
> @@ -1280,7 +1285,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev)
> if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) {
> /* Make sure the IRQ handler cannot be called after that point. */
> if (ptdev->fw->irq.irq)
> - panthor_job_irq_suspend(&ptdev->fw->irq);
> + panthor_irq_suspend(&ptdev->fw->irq);
>
> panthor_fw_stop(ptdev);
> }
> @@ -1476,8 +1481,9 @@ int panthor_fw_init(struct panthor_device *ptdev)
> if (irq <= 0)
> return -ENODEV;
>
> - ret = panthor_request_job_irq(ptdev, &fw->irq, irq, 0,
> - ptdev->iomem + JOB_INT_BASE);
> + ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
> + ptdev->iomem + JOB_INT_BASE, "job",
> + panthor_job_irq_threaded_handler);
> if (ret) {
> drm_err(&ptdev->base, "failed to request job irq");
> return ret;
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index e52c5675981f..ce208e384762 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -86,8 +86,9 @@ static void panthor_gpu_l2_config_set(struct panthor_device *ptdev)
> gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config);
> }
>
> -static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
> +static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status)
> {
> + struct panthor_device *ptdev = pirq->ptdev;
> struct panthor_gpu *gpu = ptdev->gpu;
>
> gpu_write(gpu->irq.iomem, INT_CLEAR, status);
> @@ -116,7 +117,11 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
> }
> spin_unlock(&ptdev->gpu->reqs_lock);
> }
> -PANTHOR_IRQ_HANDLER(gpu, panthor_gpu_irq_handler);
> +
> +static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data)
> +{
> + return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler);
> +}
>
> /**
> * panthor_gpu_unplug() - Called when the GPU is unplugged.
> @@ -128,7 +133,7 @@ void panthor_gpu_unplug(struct panthor_device *ptdev)
>
> /* Make sure the IRQ handler is not running after that point. */
> if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
> - panthor_gpu_irq_suspend(&ptdev->gpu->irq);
> + panthor_irq_suspend(&ptdev->gpu->irq);
>
> /* Wake-up all waiters. */
> spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags);
> @@ -169,9 +174,10 @@ int panthor_gpu_init(struct panthor_device *ptdev)
> if (irq < 0)
> return irq;
>
> - ret = panthor_request_gpu_irq(ptdev, &ptdev->gpu->irq, irq,
> - GPU_INTERRUPTS_MASK,
> - ptdev->iomem + GPU_INT_BASE);
> + ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
> + GPU_INTERRUPTS_MASK,
> + ptdev->iomem + GPU_INT_BASE, "gpu",
> + panthor_gpu_irq_threaded_handler);
> if (ret)
> return ret;
>
> @@ -182,7 +188,7 @@ int panthor_gpu_power_changed_on(struct panthor_device *ptdev)
> {
> guard(pm_runtime_active)(ptdev->base.dev);
>
> - panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
> + panthor_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
>
> return 0;
> }
> @@ -191,7 +197,7 @@ void panthor_gpu_power_changed_off(struct panthor_device *ptdev)
> {
> guard(pm_runtime_active)(ptdev->base.dev);
>
> - panthor_gpu_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
> + panthor_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK);
> }
>
> /**
> @@ -424,7 +430,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
> else
> panthor_hw_l2_power_off(ptdev);
>
> - panthor_gpu_irq_suspend(&ptdev->gpu->irq);
> + panthor_irq_suspend(&ptdev->gpu->irq);
> }
>
> /**
> @@ -436,7 +442,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev)
> */
> void panthor_gpu_resume(struct panthor_device *ptdev)
> {
> - panthor_gpu_irq_resume(&ptdev->gpu->irq);
> + panthor_irq_resume(&ptdev->gpu->irq);
> panthor_hw_l2_power_on(ptdev);
> }
>
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 452d0b6d4668..375022fb3fd8 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -586,17 +586,13 @@ static u32 panthor_mmu_as_fault_mask(struct panthor_device *ptdev, u32 as)
> return BIT(as);
> }
>
> -/* Forward declaration to call helpers within as_enable/disable */
> -static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status);
> -PANTHOR_IRQ_HANDLER(mmu, panthor_mmu_irq_handler);
> -
> static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
> u64 transtab, u64 transcfg, u64 memattr)
> {
> struct panthor_mmu *mmu = ptdev->mmu;
>
> - panthor_mmu_irq_enable_events(&ptdev->mmu->irq,
> - panthor_mmu_as_fault_mask(ptdev, as_nr));
> + panthor_irq_enable_events(&ptdev->mmu->irq,
> + panthor_mmu_as_fault_mask(ptdev, as_nr));
>
> gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), transtab);
> gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), memattr);
> @@ -614,8 +610,8 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr,
>
> lockdep_assert_held(&ptdev->mmu->as.slots_lock);
>
> - panthor_mmu_irq_disable_events(&ptdev->mmu->irq,
> - panthor_mmu_as_fault_mask(ptdev, as_nr));
> + panthor_irq_disable_events(&ptdev->mmu->irq,
> + panthor_mmu_as_fault_mask(ptdev, as_nr));
>
> /* Flush+invalidate RW caches, invalidate RO ones. */
> ret = panthor_gpu_flush_caches(ptdev, CACHE_CLEAN | CACHE_INV,
> @@ -1785,8 +1781,9 @@ static void panthor_vm_unlock_region(struct panthor_vm *vm)
> mutex_unlock(&ptdev->mmu->as.slots_lock);
> }
>
> -static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
> +static void panthor_mmu_irq_handler(struct panthor_irq *pirq, u32 status)
> {
> + struct panthor_device *ptdev = pirq->ptdev;
> struct panthor_mmu *mmu = ptdev->mmu;
> bool has_unhandled_faults = false;
>
> @@ -1849,6 +1846,11 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
> panthor_sched_report_mmu_fault(ptdev);
> }
>
> +static irqreturn_t panthor_mmu_irq_threaded_handler(int irq, void *data)
> +{
> + return panthor_irq_default_threaded_handler(data, panthor_mmu_irq_handler);
> +}
> +
> /**
> * panthor_mmu_suspend() - Suspend the MMU logic
> * @ptdev: Device.
> @@ -1873,7 +1875,7 @@ void panthor_mmu_suspend(struct panthor_device *ptdev)
> }
> mutex_unlock(&ptdev->mmu->as.slots_lock);
>
> - panthor_mmu_irq_suspend(&ptdev->mmu->irq);
> + panthor_irq_suspend(&ptdev->mmu->irq);
> }
>
> /**
> @@ -1892,7 +1894,7 @@ void panthor_mmu_resume(struct panthor_device *ptdev)
> ptdev->mmu->as.faulty_mask = 0;
> mutex_unlock(&ptdev->mmu->as.slots_lock);
>
> - panthor_mmu_irq_resume(&ptdev->mmu->irq);
> + panthor_irq_resume(&ptdev->mmu->irq);
> }
>
> /**
> @@ -1909,7 +1911,7 @@ void panthor_mmu_pre_reset(struct panthor_device *ptdev)
> {
> struct panthor_vm *vm;
>
> - panthor_mmu_irq_suspend(&ptdev->mmu->irq);
> + panthor_irq_suspend(&ptdev->mmu->irq);
>
> mutex_lock(&ptdev->mmu->vm.lock);
> ptdev->mmu->vm.reset_in_progress = true;
> @@ -1946,7 +1948,7 @@ void panthor_mmu_post_reset(struct panthor_device *ptdev)
>
> mutex_unlock(&ptdev->mmu->as.slots_lock);
>
> - panthor_mmu_irq_resume(&ptdev->mmu->irq);
> + panthor_irq_resume(&ptdev->mmu->irq);
>
> /* Restart the VM_BIND queues. */
> mutex_lock(&ptdev->mmu->vm.lock);
> @@ -3207,7 +3209,7 @@ panthor_mmu_reclaim_priv_bos(struct panthor_device *ptdev,
> void panthor_mmu_unplug(struct panthor_device *ptdev)
> {
> if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev))
> - panthor_mmu_irq_suspend(&ptdev->mmu->irq);
> + panthor_irq_suspend(&ptdev->mmu->irq);
>
> mutex_lock(&ptdev->mmu->as.slots_lock);
> for (u32 i = 0; i < ARRAY_SIZE(ptdev->mmu->as.slots); i++) {
> @@ -3261,9 +3263,10 @@ int panthor_mmu_init(struct panthor_device *ptdev)
> if (irq <= 0)
> return -ENODEV;
>
> - ret = panthor_request_mmu_irq(ptdev, &mmu->irq, irq,
> - panthor_mmu_fault_mask(ptdev, ~0),
> - ptdev->iomem + MMU_INT_BASE);
> + ret = panthor_irq_request(ptdev, &mmu->irq, irq,
> + panthor_mmu_fault_mask(ptdev, ~0),
> + ptdev->iomem + MMU_INT_BASE, "mmu",
> + panthor_mmu_irq_threaded_handler);
> if (ret)
> return ret;
>
> diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
> index 7c7f424a1436..80cf78007896 100644
> --- a/drivers/gpu/drm/panthor/panthor_pwr.c
> +++ b/drivers/gpu/drm/panthor/panthor_pwr.c
> @@ -56,8 +56,9 @@ struct panthor_pwr {
> wait_queue_head_t reqs_acked;
> };
>
> -static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
> +static void panthor_pwr_irq_handler(struct panthor_irq *pirq, u32 status)
> {
> + struct panthor_device *ptdev = pirq->ptdev;
> struct panthor_pwr *pwr = ptdev->pwr;
>
> spin_lock(&ptdev->pwr->reqs_lock);
> @@ -75,7 +76,11 @@ static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 status)
> }
> spin_unlock(&ptdev->pwr->reqs_lock);
> }
> -PANTHOR_IRQ_HANDLER(pwr, panthor_pwr_irq_handler);
> +
> +static irqreturn_t panthor_pwr_irq_threaded_handler(int irq, void *data)
> +{
> + return panthor_irq_default_threaded_handler(data, panthor_pwr_irq_handler);
> +}
>
> static void panthor_pwr_write_command(struct panthor_device *ptdev, u32 command, u64 args)
> {
> @@ -453,7 +458,7 @@ void panthor_pwr_unplug(struct panthor_device *ptdev)
> return;
>
> /* Make sure the IRQ handler is not running after that point. */
> - panthor_pwr_irq_suspend(&ptdev->pwr->irq);
> + panthor_irq_suspend(&ptdev->pwr->irq);
>
> /* Wake-up all waiters. */
> spin_lock_irqsave(&ptdev->pwr->reqs_lock, flags);
> @@ -483,9 +488,10 @@ int panthor_pwr_init(struct panthor_device *ptdev)
> if (irq < 0)
> return irq;
>
> - err = panthor_request_pwr_irq(
> + err = panthor_irq_request(
> ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK,
> - pwr->iomem + PWR_INT_BASE);
> + pwr->iomem + PWR_INT_BASE, "pwr",
> + panthor_pwr_irq_threaded_handler);
> if (err)
> return err;
>
> @@ -564,7 +570,7 @@ void panthor_pwr_suspend(struct panthor_device *ptdev)
> if (!ptdev->pwr)
> return;
>
> - panthor_pwr_irq_suspend(&ptdev->pwr->irq);
> + panthor_irq_suspend(&ptdev->pwr->irq);
> }
>
> void panthor_pwr_resume(struct panthor_device *ptdev)
> @@ -572,5 +578,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev)
> if (!ptdev->pwr)
> return;
>
> - panthor_pwr_irq_resume(&ptdev->pwr->irq);
> + panthor_irq_resume(&ptdev->pwr->irq);
> }
>
> --
> 2.54.0
>
* Re: [PATCH v2 04/11] drm/panthor: Extend the IRQ logic to allow fast/hard IRQ handlers
2026-05-12 11:37 ` [PATCH v2 04/11] drm/panthor: Extend the IRQ logic to allow fast/hard IRQ handlers Boris Brezillon
@ 2026-05-12 19:11 ` Chia-I Wu
0 siblings, 0 replies; 18+ messages in thread
From: Chia-I Wu @ 2026-05-12 19:11 UTC (permalink / raw)
To: Boris Brezillon
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel,
linux-kernel
On Tue, May 12, 2026 at 4:54 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> All drivers except panthor signal their fences from their interrupt
> handler to minimize latency. We could do the same from the threaded
> handler, but the latency is still quite high in that case, so let's
> allow components to choose the context they want their IRQ handler
> to run in by exposing support for custom hard handlers.
>
> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
> Reviewed-by: Steven Price <steven.price@arm.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 11 ++++++++---
> drivers/gpu/drm/panthor/panthor_fw.c | 1 +
> drivers/gpu/drm/panthor/panthor_gpu.c | 1 +
> drivers/gpu/drm/panthor/panthor_mmu.c | 1 +
> drivers/gpu/drm/panthor/panthor_pwr.c | 1 +
> 5 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 393fcda73d88..1aaf06df875b 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -672,6 +672,7 @@ static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u32 mask
> static inline int
> panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
> int irq, u32 mask, void __iomem *iomem, const char *name,
> + irqreturn_t (*raw_handler)(int, void *data),
> irqreturn_t (*threaded_handler)(int, void *data))
> {
> const char *full_name;
> @@ -687,9 +688,13 @@ panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq,
> return -ENOMEM;
>
> panthor_irq_resume(pirq);
> - return devm_request_threaded_irq(ptdev->base.dev, irq,
> - panthor_irq_default_raw_handler,
> - threaded_handler,
> +
> + if (!threaded_handler) {
> + return devm_request_irq(ptdev->base.dev, irq, raw_handler,
> + IRQF_SHARED, full_name, pirq);
> + }
devm_request_irq() expands to devm_request_threaded_irq() with a NULL
thread_fn plus IRQF_COND_ONESHOT, so the special case here appears
redundant.
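In other words, the branch could collapse into the call below (a
sketch, assuming the IRQF_COND_ONESHOT that devm_request_irq() adds is
not needed here):

	/* threaded_handler may be NULL, which is exactly what
	 * devm_request_irq() passes as thread_fn internally. */
	return devm_request_threaded_irq(ptdev->base.dev, irq, raw_handler,
					 threaded_handler, IRQF_SHARED,
					 full_name, pirq);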
> +
> + return devm_request_threaded_irq(ptdev->base.dev, irq, raw_handler, threaded_handler,
> IRQF_SHARED, full_name, pirq);
> }
>
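With the extra parameter, a block that wants its events handled
entirely in hard-IRQ context can pass a custom raw handler and a NULL
threaded one. A hypothetical sketch, roughly the direction the later
patches in the series take (my_job_irq_raw_handler is made up;
gpu_read()/INT_RAWSTAT are assumptions):

static irqreturn_t my_job_irq_raw_handler(int irq, void *data)
{
	struct panthor_irq *pirq = data;
	u32 status = gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask;

	if (!status)
		return IRQ_NONE;

	/* Dispatch directly from hard-IRQ context: no thread wake-up,
	 * so no scheduling latency before fences get signalled. */
	panthor_job_irq_handler(pirq, status);
	return IRQ_HANDLED;
}

and at init time:

	ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
				  ptdev->iomem + JOB_INT_BASE, "job",
				  my_job_irq_raw_handler, NULL);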
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index eaf599b0a887..8239a6951569 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1483,6 +1483,7 @@ int panthor_fw_init(struct panthor_device *ptdev)
>
> ret = panthor_irq_request(ptdev, &fw->irq, irq, 0,
> ptdev->iomem + JOB_INT_BASE, "job",
> + panthor_irq_default_raw_handler,
> panthor_job_irq_threaded_handler);
> if (ret) {
> drm_err(&ptdev->base, "failed to request job irq");
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index ce208e384762..d0be758ea3e1 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -177,6 +177,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
> ret = panthor_irq_request(ptdev, &ptdev->gpu->irq, irq,
> GPU_INTERRUPTS_MASK,
> ptdev->iomem + GPU_INT_BASE, "gpu",
> + panthor_irq_default_raw_handler,
> panthor_gpu_irq_threaded_handler);
> if (ret)
> return ret;
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 375022fb3fd8..2955b8baa2e2 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -3266,6 +3266,7 @@ int panthor_mmu_init(struct panthor_device *ptdev)
> ret = panthor_irq_request(ptdev, &mmu->irq, irq,
> panthor_mmu_fault_mask(ptdev, ~0),
> ptdev->iomem + MMU_INT_BASE, "mmu",
> + panthor_irq_default_raw_handler,
> panthor_mmu_irq_threaded_handler);
> if (ret)
> return ret;
> diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/panthor/panthor_pwr.c
> index 80cf78007896..1efb7f3482ba 100644
> --- a/drivers/gpu/drm/panthor/panthor_pwr.c
> +++ b/drivers/gpu/drm/panthor/panthor_pwr.c
> @@ -491,6 +491,7 @@ int panthor_pwr_init(struct panthor_device *ptdev)
> err = panthor_irq_request(
> ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK,
> pwr->iomem + PWR_INT_BASE, "pwr",
> + panthor_irq_default_raw_handler,
> panthor_pwr_irq_threaded_handler);
> if (err)
> return err;
>
> --
> 2.54.0
>
* Re: [PATCH v2 05/11] drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context
2026-05-12 11:37 ` [PATCH v2 05/11] drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context Boris Brezillon
@ 2026-05-12 19:29 ` Chia-I Wu
0 siblings, 0 replies; 18+ messages in thread
From: Chia-I Wu @ 2026-05-12 19:29 UTC (permalink / raw)
To: Boris Brezillon
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel,
linux-kernel
On Tue, May 12, 2026 at 4:54 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> If we want some FW events to be processed in the interrupt path, we need
> the helpers manipulating the req regs to be IRQ-safe, which implies using
> spin_lock_irqsave() instead of spin_lock(). While at it, use guards
> instead of plain spin_lock()/spin_unlock() calls.
>
> Reviewed-by: Steven Price <steven.price@arm.com>
> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
> ---
> drivers/gpu/drm/panthor/panthor_fw.h | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor/panthor_fw.h
> index a99a9b6f4825..e56b7fe15bb3 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.h
> +++ b/drivers/gpu/drm/panthor/panthor_fw.h
> @@ -432,12 +432,11 @@ struct panthor_fw_global_iface {
> #define panthor_fw_toggle_reqs(__iface, __in_reg, __out_reg, __mask) \
> do { \
> u32 __cur_val, __new_val, __out_val; \
> - spin_lock(&(__iface)->lock); \
> + guard(spinlock_irqsave)(&(__iface)->lock); \
> __cur_val = READ_ONCE((__iface)->input->__in_reg); \
> __out_val = READ_ONCE((__iface)->output->__out_reg); \
> __new_val = ((__out_val ^ (__mask)) & (__mask)) | (__cur_val & ~(__mask)); \
> WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
> - spin_unlock(&(__iface)->lock); \
> } while (0)
>
> /**
> @@ -458,21 +457,19 @@ struct panthor_fw_global_iface {
> #define panthor_fw_update_reqs(__iface, __in_reg, __val, __mask) \
> do { \
> u32 __cur_val, __new_val; \
> - spin_lock(&(__iface)->lock); \
> + guard(spinlock_irqsave)(&(__iface)->lock); \
> __cur_val = READ_ONCE((__iface)->input->__in_reg); \
> __new_val = (__cur_val & ~(__mask)) | ((__val) & (__mask)); \
> WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
> - spin_unlock(&(__iface)->lock); \
> } while (0)
>
> #define panthor_fw_update_reqs64(__iface, __in_reg, __val, __mask) \
> do { \
> u64 __cur_val, __new_val; \
> - spin_lock(&(__iface)->lock); \
> + guard(spinlock_irqsave)(&(__iface)->lock); \
> __cur_val = READ_ONCE((__iface)->input->__in_reg); \
> __new_val = (__cur_val & ~(__mask)) | ((__val) & (__mask)); \
> WRITE_ONCE((__iface)->input->__in_reg, __new_val); \
> - spin_unlock(&(__iface)->lock); \
> } while (0)
>
> struct panthor_fw_global_iface *
>
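For context: guard(spinlock_irqsave) comes from the kernel's
scope-based cleanup helpers (<linux/cleanup.h>, with the spinlock
guards defined in <linux/spinlock.h>). It takes the lock with the IRQ
state saved and interrupts disabled, and releases it automatically
when the enclosing scope ends (here, the end of each do { } while (0)
block), which is why the explicit spin_unlock() calls can simply be
dropped. A minimal standalone illustration (struct and field names are
made up):

#include <linux/spinlock.h>

struct demo {
	spinlock_t lock;
	u32 val;
};

static void demo_update(struct demo *d, u32 val)
{
	guard(spinlock_irqsave)(&d->lock);
	d->val = val;
	/* lock released and IRQ state restored here, at end of scope */
}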
> --
> 2.54.0
>
end of thread, other threads:[~2026-05-12 19:29 UTC | newest]
Thread overview: 18+ messages
2026-05-12 11:37 [PATCH v2 00/11] drm/panthor: Reduce dma_fence signalling latency Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 01/11] drm/panthor: Make panthor_irq::state a non-atomic field Boris Brezillon
2026-05-12 18:40 ` Chia-I Wu
2026-05-12 11:37 ` [PATCH v2 02/11] drm/panthor: Move the register accessors before the IRQ helpers Boris Brezillon
2026-05-12 18:41 ` Chia-I Wu
2026-05-12 11:37 ` [PATCH v2 03/11] drm/panthor: Replace the panthor_irq macro machinery by inline helpers Boris Brezillon
2026-05-12 18:58 ` Chia-I Wu
2026-05-12 11:37 ` [PATCH v2 04/11] drm/panthor: Extend the IRQ logic to allow fast/hard IRQ handlers Boris Brezillon
2026-05-12 19:11 ` Chia-I Wu
2026-05-12 11:37 ` [PATCH v2 05/11] drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context Boris Brezillon
2026-05-12 19:29 ` Chia-I Wu
2026-05-12 11:37 ` [PATCH v2 06/11] drm/panthor: Prepare the scheduler logic for FW events in " Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 07/11] drm/panthor: Automate CSG IRQ processing at group unbind time Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 08/11] drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks() Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 09/11] drm/panthor: Process FW events in IRQ context Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 10/11] drm/panthor: Use the irqsave variant of spin_lock in panthor_gpu_irq_handler() Boris Brezillon
2026-05-12 11:37 ` [PATCH v2 11/11] drm/panthor: Process GPU events in IRQ context Boris Brezillon
2026-05-12 11:50 ` Boris Brezillon