* [PATCH 1/2] gpu: host1x: skip redundant syncpoint loads in host1x_syncpt_wait()
2026-05-14 10:31 [PATCH 0/2] gpu: host1x: syncpt_wait micro-optimizations Tanmay Patil
@ 2026-05-14 10:31 ` Tanmay Patil
2026-05-14 10:31 ` [PATCH 2/2] gpu: host1x: skip redundant HW state update Tanmay Patil
1 sibling, 0 replies; 3+ messages in thread
From: Tanmay Patil @ 2026-05-14 10:31 UTC
To: Thierry Reding, Mikko Perttunen
Cc: David Airlie, Simona Vetter, dri-devel, linux-tegra, linux-kernel,
Tanmay Patil
host1x_syncpt_wait() loaded the hardware syncpoint value once for the
expiry check and then a second time to populate the caller's value
pointer. Reuse a single load for both purposes.

After dma_fence_wait_timeout() returns, the previous code reloaded the
syncpoint value for the expiry check, but a fresh load is only required
in the timeout case. On success (i.e., return value > 0, or return
value == 0 with zero jiffies remaining), the ISR has already cached the
value before signaling the fence, so the value pointer can be populated
from the cache with host1x_syncpt_read_min(), without an MMIO access.

Only the timeout path requires a fresh load, so move
host1x_syncpt_load() under that path.

Measured syncpoint wait latency (50000 samples):
  Average latency: 12.2 us -> 10.6 us
  99.99th percentile latency: 62.96 us -> 51.90 us

Signed-off-by: Tanmay Patil <tanmayp@nvidia.com>
---
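For readers who want to see the saving in isolation, the following is a minimal user-space sketch of the post-wait path, not driver code. All model_* names and the struct are hypothetical stand-ins: model_load() plays the role of host1x_syncpt_load() (one register read that refreshes the cache), model_read_min() plays the role of host1x_syncpt_read_min(), and the expiry check is simplified to the wraparound comparison on the cached value. The counter only illustrates how many register reads each flow would issue under these assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a syncpoint; not the driver's structure. */
struct model_syncpt {
	uint32_t hw_value;       /* value the hardware register would hold */
	uint32_t min_val;        /* value cached by the most recent load */
	unsigned int mmio_reads; /* number of modeled register reads */
};

/* Models a load: one register read that also refreshes the cache. */
static uint32_t model_load(struct model_syncpt *sp)
{
	assert(sp != NULL);
	sp->mmio_reads++;
	sp->min_val = sp->hw_value;
	return sp->min_val;
}

/* Models reading the cached value: no register access. */
static uint32_t model_read_min(const struct model_syncpt *sp)
{
	return sp->min_val;
}

/* Simplified, wraparound-safe expiry check on the cached value. */
static bool model_is_expired(const struct model_syncpt *sp, uint32_t thresh)
{
	return ((sp->min_val - thresh) & 0x80000000U) == 0;
}

/* Models the ISR: it loads the value before signaling the fence. */
static void model_isr(struct model_syncpt *sp)
{
	(void)model_load(sp);
}

/* Post-wait flow before the patch: two loads on every path. */
static int model_finish_old(struct model_syncpt *sp, uint32_t thresh,
			    long wait_err, uint32_t *value)
{
	if (value)
		*value = model_load(sp);

	(void)model_load(sp);
	if (wait_err == 0 && !model_is_expired(sp, thresh))
		return -EAGAIN;
	else if (wait_err < 0)
		return (int)wait_err;
	return 0;
}

/* Post-wait flow after the patch: a load only on the timeout path. */
static int model_finish_new(struct model_syncpt *sp, uint32_t thresh,
			    long wait_err, uint32_t *value)
{
	if (wait_err == 0 && !model_is_expired(sp, thresh)) {
		if (value)
			*value = model_load(sp);
		return -EAGAIN;
	} else if (wait_err < 0) {
		return (int)wait_err;
	}

	/* Success: reuse the value the modeled ISR already cached. */
	if (value)
		*value = model_read_min(sp);
	return 0;
}
```

In this model, a successful wait costs two reads after the ISR in the old flow and none in the new one; the timeout path still performs one read to report a fresh value.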
drivers/gpu/host1x/syncpt.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index acc7d82e0585..807c74fc6a0a 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -222,11 +222,12 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 {
 	struct dma_fence *fence;
 	long wait_err;
+	u32 curr;
 
-	host1x_hw_syncpt_load(sp->host, sp);
+	curr = host1x_syncpt_load(sp);
 
 	if (value)
-		*value = host1x_syncpt_load(sp);
+		*value = curr;
 
 	if (host1x_syncpt_is_expired(sp, thresh))
 		return 0;
@@ -245,21 +246,25 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		host1x_fence_cancel(fence);
 	dma_fence_put(fence);
 
-	if (value)
-		*value = host1x_syncpt_load(sp);
-
 	/*
 	 * Don't rely on dma_fence_wait_timeout return value,
 	 * since it returns zero both on timeout and if the
 	 * wait completed with 0 jiffies left.
 	 */
-	host1x_hw_syncpt_load(sp->host, sp);
-	if (wait_err == 0 && !host1x_syncpt_is_expired(sp, thresh))
+	if (wait_err == 0 && !host1x_syncpt_is_expired(sp, thresh)) {
+		if (value)
+			*value = host1x_syncpt_load(sp);
+
 		return -EAGAIN;
-	else if (wait_err < 0)
+	} else if (wait_err < 0) {
 		return wait_err;
-	else
+	} else {
+		/* Success, read the value cached by ISR */
+		if (value)
+			*value = host1x_syncpt_read_min(sp);
+
 		return 0;
+	}
 }
 EXPORT_SYMBOL(host1x_syncpt_wait);
 
--
2.54.0
* [PATCH 2/2] gpu: host1x: skip redundant HW state update
From: Tanmay Patil @ 2026-05-14 10:31 UTC
To: Thierry Reding, Mikko Perttunen
Cc: David Airlie, Simona Vetter, dri-devel, linux-tegra, linux-kernel,
Tanmay Patil
When the fence list is empty, host1x_intr_update_hw_state() falls
through to host1x_intr_disable_syncpt_intr(), which does two MMIO
writes to disable the syncpoint interrupt and clear its status.

The ISR has already disabled and acked the interrupt before calling
host1x_intr_handle_interrupt(), making these two writes redundant. Skip
the host1x_intr_update_hw_state() call if no fences remain.

Measured syncpoint wait latency (50000 samples):
  Average latency: 10.6 us -> 9.4 us
  99.99th percentile latency: 51.90 us -> 36.58 us

Signed-off-by: Tanmay Patil <tanmayp@nvidia.com>
---
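The effect of the list_empty() guard can be illustrated with a small user-space sketch, given here as a model under stated assumptions rather than the driver's implementation. The struct, the fixed-size threshold array (standing in for the ordered fence list), and every model_* name are hypothetical. The two writes charged for the disable path come from the description above; the re-arm path is only counted as a call, since its register cost is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_FENCES 8

/* Hypothetical stand-in for a syncpoint's interrupt state. */
struct model_intr_syncpt {
	uint32_t thresholds[MODEL_MAX_FENCES]; /* pending fences, in order */
	size_t nr_fences;
	unsigned int disable_writes; /* modeled MMIO writes on disable */
	unsigned int rearm_calls;    /* times the interrupt was re-armed */
};

/* Models the state update: re-arm if fences remain, else disable. */
static void model_update_hw_state(struct model_intr_syncpt *sp)
{
	if (sp->nr_fences > 0)
		sp->rearm_calls++;
	else
		sp->disable_writes += 2; /* disable + status clear */
}

/*
 * Models the interrupt handler: signal every expired fence, then
 * update the interrupt state. With skip_when_empty set, the update is
 * skipped when no fences remain, as the patch proposes.
 */
static void model_handle_interrupt(struct model_intr_syncpt *sp,
				   uint32_t value, bool skip_when_empty)
{
	size_t done = 0;

	assert(sp->nr_fences <= MODEL_MAX_FENCES);

	/* Stop at the first fence whose threshold is not yet reached. */
	while (done < sp->nr_fences &&
	       ((value - sp->thresholds[done]) & 0x80000000U) == 0)
		done++;

	/* Drop the signaled fences from the front of the list. */
	memmove(sp->thresholds, sp->thresholds + done,
		(sp->nr_fences - done) * sizeof(sp->thresholds[0]));
	sp->nr_fences -= done;

	if (!skip_when_empty || sp->nr_fences > 0)
		model_update_hw_state(sp);
}
```

With a single pending fence that has expired, the guarded variant issues no disable writes while the unguarded one issues two; when a later fence is still pending, both variants re-arm the interrupt.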
drivers/gpu/host1x/intr.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index f77a678949e9..723297250768 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -92,8 +92,12 @@ void host1x_intr_handle_interrupt(struct host1x *host, unsigned int id)
 		host1x_fence_signal(fence);
 	}
 
-	/* Re-enable interrupt if necessary */
-	host1x_intr_update_hw_state(host, sp);
+	/*
+	 * Re-enable interrupt if necessary. The ISR already disabled the interrupt,
+	 * so if no fences remain, no update is needed.
+	 */
+	if (!list_empty(&sp->fences.list))
+		host1x_intr_update_hw_state(host, sp);
 
 	spin_unlock(&sp->fences.lock);
 }
--
2.54.0