public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2
@ 2026-03-27  0:13 Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 01/17] drm/msm/a6xx: Use barriers while updating HFI Q headers Akhil P Oommen
                   ` (16 more replies)
  0 siblings, 17 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen,
	Konrad Dybcio

This series brings a few additional features for A8xx GPUs which were
deferred from the initial series. The first few patches are general fixes
and improvements. The next few patches add support for SKU tables,
preemption, IFPC and a UABI to query AQE support.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
Changes in v2:
- Split X2-85 SKU detection patch (Rob)
- Split out a7xx_gmu_gx_is_on() (Konrad)
- Avoid secure init on every gmu_resume (Konrad)
- Simplify code in adreno_load_gpu() using guard(mutex) (Konrad)
- Correct the payload size for CP_ME_INIT packet (Konrad)
- Remove unnecessary gmu_wrapper check in a8xx_preempt.c (Konrad)
- Few misc formatting fixes (Konrad/Claude bot)
- Add Fixes tag & reorder patches to move fixes to top (Rob/Dmitry)
- Captured R-b tags
- Link to v1: https://lore.kernel.org/r/20260324-a8xx-gpu-batch2-v1-0-fc95b8d9c017@oss.qualcomm.com

---
Akhil P Oommen (17):
      drm/msm/a6xx: Use barriers while updating HFI Q headers
      drm/msm/a8xx: Fix the ticks used in submit traces
      drm/msm/a6xx: Switch to preemption safe AO counter
      drm/msm/a6xx: Correct OOB usage
      drm/msm/adreno: Implement gx_is_on() for A8x
      drm/msm/a6xx: Fix gpu init from secure world
      drm/msm/a6xx: Add support for Debug HFI Q
      drm/msm/adreno: Coredump on GPU/GMU init failures
      drm/msm/a6xx: Use packed structs for HFI
      drm/msm/a6xx: Update HFI definitions
      drm/msm/a8xx: Add SKU table for A840
      drm/msm/a6xx: Add soft fuse detection support
      drm/msm/a6xx: Add SKU detection support for X2-85
      drm/msm/a8xx: Implement IFPC support for A840
      drm/msm/a8xx: Preemption support for A840
      drm/msm/a6xx: Enable Preemption on X2-85
      drm/msm/adreno: Expose a PARAM to check AQE support

 drivers/gpu/drm/msm/Makefile                      |   1 +
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c             |   6 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c             |  12 +-
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c         | 243 +++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c             | 141 ++++++++++--
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h             |   9 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c             | 169 +++++++-------
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h             |   7 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c       |   6 +-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c             |  33 ++-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.h             | 155 +++++++++++--
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c         |  77 +------
 drivers/gpu/drm/msm/adreno/a6xx_preempt.h         |  82 +++++++
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c             | 162 ++++++++++++--
 drivers/gpu/drm/msm/adreno/a8xx_preempt.c         | 259 ++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/adreno_device.c        |   5 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c           |  20 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h           |   5 +-
 drivers/gpu/drm/msm/msm_gpu.c                     |   5 +-
 drivers/gpu/drm/msm/msm_gpu.h                     |   2 +
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml     |   4 +
 drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml |   6 +-
 include/uapi/drm/msm_drm.h                        |   1 +
 23 files changed, 1137 insertions(+), 273 deletions(-)
---
base-commit: 38d568f154c66430920b01edc9c722ec14aa54d7
change-id: 20260322-a8xx-gpu-batch2-bf7f5a9406ac

Best regards,
-- 
Akhil P Oommen <akhilpo@oss.qualcomm.com>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 01/17] drm/msm/a6xx: Use barriers while updating HFI Q headers
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces Akhil P Oommen
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

To avoid harmful compiler optimizations and I/O reordering in the HW, use
barriers and the READ_ONCE/WRITE_ONCE helpers as necessary while accessing
the HFI queue index variables.

Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index 53cfdf4e6c34..4f5dbf46132b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -34,7 +34,7 @@ static int a6xx_hfi_queue_read(struct a6xx_gmu *gmu,
 	struct a6xx_hfi_queue_header *header = queue->header;
 	u32 i, hdr, index = header->read_index;
 
-	if (header->read_index == header->write_index) {
+	if (header->read_index == READ_ONCE(header->write_index)) {
 		header->rx_request = 1;
 		return 0;
 	}
@@ -62,7 +62,10 @@ static int a6xx_hfi_queue_read(struct a6xx_gmu *gmu,
 	if (!gmu->legacy)
 		index = ALIGN(index, 4) % header->size;
 
-	header->read_index = index;
+	/* Ensure all memory operations are complete before updating the read index */
+	dma_mb();
+
+	WRITE_ONCE(header->read_index, index);
 	return HFI_HEADER_SIZE(hdr);
 }
 
@@ -74,7 +77,7 @@ static int a6xx_hfi_queue_write(struct a6xx_gmu *gmu,
 
 	spin_lock(&queue->lock);
 
-	space = CIRC_SPACE(header->write_index, header->read_index,
+	space = CIRC_SPACE(header->write_index, READ_ONCE(header->read_index),
 		header->size);
 	if (space < dwords) {
 		header->dropped++;
@@ -95,7 +98,10 @@ static int a6xx_hfi_queue_write(struct a6xx_gmu *gmu,
 			queue->data[index] = 0xfafafafa;
 	}
 
-	header->write_index = index;
+	/* Ensure all memory operations are complete before updating the write index */
+	dma_mb();
+
+	WRITE_ONCE(header->write_index, index);
 	spin_unlock(&queue->lock);
 
 	gmu_write(gmu, REG_A6XX_GMU_HOST2GMU_INTR_SET, 0x01);

-- 
2.51.0
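
The patch above pairs a memory barrier with WRITE_ONCE() when publishing a
queue index, and READ_ONCE() when sampling the other side's index. As an
illustrative userspace sketch only (not the kernel code), the same
single-producer/single-consumer pattern can be expressed with C11 atomics,
where release/acquire ordering plays the role of the dma_mb() +
WRITE_ONCE()/READ_ONCE() pair. The `hfi_queue` structure and function names
here are simplified stand-ins, not the driver's actual definitions:

```c
#include <stdatomic.h>
#include <stdint.h>

#define Q_SIZE 64 /* power of two, in dwords */

/* Simplified stand-in for the shared HFI queue header + payload */
struct hfi_queue {
	_Atomic uint32_t read_index;  /* written by consumer */
	_Atomic uint32_t write_index; /* written by producer */
	uint32_t data[Q_SIZE];
};

/* Producer side: returns 0 on success, -1 if the queue is full */
static int queue_write(struct hfi_queue *q, uint32_t value)
{
	uint32_t wr = atomic_load_explicit(&q->write_index, memory_order_relaxed);
	uint32_t rd = atomic_load_explicit(&q->read_index, memory_order_acquire);

	if (((wr + 1) % Q_SIZE) == rd)
		return -1; /* full */

	q->data[wr] = value;

	/* Release ordering publishes the payload before the index update,
	 * mirroring the dma_mb() + WRITE_ONCE() pair in the patch. */
	atomic_store_explicit(&q->write_index, (wr + 1) % Q_SIZE,
			      memory_order_release);
	return 0;
}

/* Consumer side: returns 0 on success, -1 if the queue is empty */
static int queue_read(struct hfi_queue *q, uint32_t *out)
{
	uint32_t rd = atomic_load_explicit(&q->read_index, memory_order_relaxed);
	uint32_t wr = atomic_load_explicit(&q->write_index, memory_order_acquire);

	if (rd == wr)
		return -1; /* empty */

	*out = q->data[rd];

	/* Ensure the payload read completes before the slot is recycled */
	atomic_store_explicit(&q->read_index, (rd + 1) % Q_SIZE,
			      memory_order_release);
	return 0;
}
```

Without the ordering, the consumer could observe the new write_index before
the payload stores land, which is exactly the hazard the patch closes.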


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 01/17] drm/msm/a6xx: Use barriers while updating HFI Q headers Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27 11:37   ` Konrad Dybcio
  2026-03-27  0:13 ` [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter Akhil P Oommen
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

The GMU_ALWAYS_ON_COUNTER_* registers moved in A8x, but currently the A6x
register offsets are used in the submit traces instead of the A8x offsets.
To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
everywhere.

While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
counter.

Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c             |  6 ++----
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c             |  6 ++----
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c             | 23 +++++++----------------
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h             |  2 +-
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c             | 20 ++++++++------------
 drivers/gpu/drm/msm/adreno/adreno_gpu.c           |  6 ++----
 drivers/gpu/drm/msm/adreno/adreno_gpu.h           |  2 +-
 drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml |  6 ++++--
 8 files changed, 27 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index db06c06067ae..0ed8bf2b5dd5 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -604,11 +604,9 @@ static int a4xx_pm_suspend(struct msm_gpu *gpu) {
 	return 0;
 }
 
-static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+static u64 a4xx_get_timestamp(struct msm_gpu *gpu)
 {
-	*value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
-
-	return 0;
+	return gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
 }
 
 static u64 a4xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 56eaff2ee4e4..79a441e91fa1 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1435,11 +1435,9 @@ static int a5xx_pm_suspend(struct msm_gpu *gpu)
 	return 0;
 }
 
-static int a5xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+static u64 a5xx_get_timestamp(struct msm_gpu *gpu)
 {
-	*value = gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO);
-
-	return 0;
+	return gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO);
 }
 
 struct a5xx_crashdumper {
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2129d230a92b..50bd7aa4e792 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -16,8 +16,10 @@
 
 #define GPU_PAS_ID 13
 
-static u64 read_gmu_ao_counter(struct a6xx_gpu *a6xx_gpu)
+static u64 a6xx_gmu_get_timestamp(struct msm_gpu *gpu)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	u64 count_hi, count_lo, temp;
 
 	do {
@@ -404,7 +406,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	OUT_RING(ring, upper_32_bits(rbmemptr(ring, fence)));
 	OUT_RING(ring, submit->seqno);
 
-	trace_msm_gpu_submit_flush(submit, read_gmu_ao_counter(a6xx_gpu));
+	trace_msm_gpu_submit_flush(submit, adreno_gpu->funcs->get_timestamp(gpu));
 
 	a6xx_flush(gpu, ring);
 }
@@ -614,7 +616,7 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	}
 
 
-	trace_msm_gpu_submit_flush(submit, read_gmu_ao_counter(a6xx_gpu));
+	trace_msm_gpu_submit_flush(submit, adreno_gpu->funcs->get_timestamp(gpu));
 
 	a6xx_flush(gpu, ring);
 
@@ -2414,20 +2416,9 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 	return 0;
 }
 
-static int a6xx_gmu_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+static u64 a6xx_get_timestamp(struct msm_gpu *gpu)
 {
-	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
-	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-
-	*value = read_gmu_ao_counter(a6xx_gpu);
-
-	return 0;
-}
-
-static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
-{
-	*value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER);
-	return 0;
+	return gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER);
 }
 
 static struct msm_ringbuffer *a6xx_active_ring(struct msm_gpu *gpu)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 4eaa04711246..a4434a6a56dd 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -320,7 +320,7 @@ int a6xx_zap_shader_init(struct msm_gpu *gpu);
 void a8xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off);
 int a8xx_fault_handler(void *arg, unsigned long iova, int flags, void *data);
 void a8xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring);
-int a8xx_gmu_get_timestamp(struct msm_gpu *gpu, uint64_t *value);
+u64 a8xx_gmu_get_timestamp(struct msm_gpu *gpu);
 u64 a8xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate);
 int a8xx_gpu_feature_probe(struct msm_gpu *gpu);
 void a8xx_gpu_get_slice_info(struct msm_gpu *gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
index b1887e0cf698..840af9c4d718 100644
--- a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
@@ -1174,23 +1174,19 @@ void a8xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_
 	gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
 }
 
-int a8xx_gmu_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+u64 a8xx_gmu_get_timestamp(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	u64 count_hi, count_lo, temp;
 
-	mutex_lock(&a6xx_gpu->gmu.lock);
-
-	/* Force the GPU power on so we can read this register */
-	a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
-
-	*value = gpu_read64(gpu, REG_A8XX_CP_ALWAYS_ON_COUNTER);
-
-	a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
-
-	mutex_unlock(&a6xx_gpu->gmu.lock);
+	do {
+		count_hi = gmu_read(&a6xx_gpu->gmu, REG_A8XX_GMU_ALWAYS_ON_COUNTER_H);
+		count_lo = gmu_read(&a6xx_gpu->gmu, REG_A8XX_GMU_ALWAYS_ON_COUNTER_L);
+		temp = gmu_read(&a6xx_gpu->gmu, REG_A8XX_GMU_ALWAYS_ON_COUNTER_H);
+	} while (unlikely(count_hi != temp));
 
-	return 0;
+	return (count_hi << 32) | count_lo;
 }
 
 u64 a8xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index d5fe6f6f0dec..785e99fb5bd5 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -391,13 +391,11 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
 		return 0;
 	case MSM_PARAM_TIMESTAMP:
 		if (adreno_gpu->funcs->get_timestamp) {
-			int ret;
-
 			pm_runtime_get_sync(&gpu->pdev->dev);
-			ret = adreno_gpu->funcs->get_timestamp(gpu, value);
+			*value = adreno_gpu->funcs->get_timestamp(gpu);
 			pm_runtime_put_autosuspend(&gpu->pdev->dev);
 
-			return ret;
+			return 0;
 		}
 		return -EINVAL;
 	case MSM_PARAM_PRIORITIES:
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 1d0145f8b3ec..c08725ed54c4 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -79,7 +79,7 @@ struct adreno_gpu;
 struct adreno_gpu_funcs {
 	struct msm_gpu_funcs base;
 	struct msm_gpu *(*init)(struct drm_device *dev);
-	int (*get_timestamp)(struct msm_gpu *gpu, uint64_t *value);
+	u64 (*get_timestamp)(struct msm_gpu *gpu);
 	void (*bus_halt)(struct adreno_gpu *adreno_gpu, bool gx_off);
 	int (*mmu_fault_handler)(void *arg, unsigned long iova, int flags, void *data);
 };
diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml
index c4e00b1263cd..33404eb18fd0 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml
@@ -141,8 +141,10 @@ xsi:schemaLocation="https://gitlab.freedesktop.org/freedreno/ rules-fd.xsd">
 	<reg32 offset="0x1f9f0" name="GMU_BOOT_KMD_LM_HANDSHAKE"/>
 	<reg32 offset="0x1f957" name="GMU_LLM_GLM_SLEEP_CTRL"/>
 	<reg32 offset="0x1f958" name="GMU_LLM_GLM_SLEEP_STATUS"/>
-	<reg32 offset="0x1f888" name="GMU_ALWAYS_ON_COUNTER_L"/>
-	<reg32 offset="0x1f889" name="GMU_ALWAYS_ON_COUNTER_H"/>
+	<reg32 offset="0x1f888" name="GMU_ALWAYS_ON_COUNTER_L" variants="A6XX-A7XX"/>
+	<reg32 offset="0x1f840" name="GMU_ALWAYS_ON_COUNTER_L" variants="A8XX-"/>
+	<reg32 offset="0x1f889" name="GMU_ALWAYS_ON_COUNTER_H" variants="A6XX-A7XX"/>
+	<reg32 offset="0x1f841" name="GMU_ALWAYS_ON_COUNTER_H" variants="A8XX-"/>
 	<reg32 offset="0x1f8c3" name="GMU_GMU_PWR_COL_KEEPALIVE" variants="A6XX-A7XX"/>
 	<reg32 offset="0x1f7e4" name="GMU_GMU_PWR_COL_KEEPALIVE" variants="A8XX-"/>
 	<reg32 offset="0x1f8c4" name="GMU_PWR_COL_PREEMPT_KEEPALIVE" variants="A6XX-A7XX"/>

-- 
2.51.0
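
The new a8xx_gmu_get_timestamp() above uses the classic hi/lo/hi technique
for reading a 64-bit counter through two 32-bit registers: re-read the high
word and retry if it changed, which catches a low-word rollover between the
two reads. A minimal standalone sketch of that loop (the `read_hi`/`read_lo`
helpers and backing variables are hypothetical stand-ins for gmu_read()):

```c
#include <stdint.h>

/* Hypothetical backing "registers" standing in for the GMU AO counter */
static uint32_t reg_hi, reg_lo;

static uint32_t read_hi(void) { return reg_hi; }
static uint32_t read_lo(void) { return reg_lo; }

/* hi/lo/hi read: retry if the high word changed mid-read, i.e. the low
 * word rolled over between the two 32-bit accesses. */
static uint64_t read_counter64(void)
{
	uint64_t hi, lo, check;

	do {
		hi = read_hi();
		lo = read_lo();
		check = read_hi();
	} while (hi != check);

	return (hi << 32) | lo;
}
```

In this single-threaded sketch the retry never triggers; on real hardware
the counter ticks independently, so the check is what makes the composed
64-bit value coherent.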


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 01/17] drm/msm/a6xx: Use barriers while updating HFI Q headers Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27 11:32   ` Konrad Dybcio
  2026-03-27  0:13 ` [PATCH v2 04/17] drm/msm/a6xx: Correct OOB usage Akhil P Oommen
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

CP_ALWAYS_ON_COUNTER is not saved and restored during preemption, so it
won't provide accurate data about the submit when preemption is enabled.
Switch to CP_ALWAYS_ON_CONTEXT, which is preemption safe.

Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 50bd7aa4e792..f8cb8b578661 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -347,7 +347,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	 * GPU registers so we need to add 0x1a800 to the register value on A630
 	 * to get the right value from PM4.
 	 */
-	get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER,
+	get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_CONTEXT,
 		rbmemptr_stats(ring, index, alwayson_start));
 
 	/* Invalidate CCU depth and color */
@@ -388,7 +388,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP(0),
 		rbmemptr_stats(ring, index, cpcycles_end));
-	get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER,
+	get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_CONTEXT,
 		rbmemptr_stats(ring, index, alwayson_end));
 
 	/* Write the fence to the scratch register */
@@ -457,7 +457,7 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct msm_ringbuffer *ring = submit->ring;
-	u32 rbbm_perfctr_cp0, cp_always_on_counter;
+	u32 rbbm_perfctr_cp0, cp_always_on_context;
 	unsigned int i, ibs = 0;
 
 	adreno_check_and_reenable_stall(adreno_gpu);
@@ -480,14 +480,14 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 
 	if (adreno_is_a8xx(adreno_gpu)) {
 		rbbm_perfctr_cp0 = REG_A8XX_RBBM_PERFCTR_CP(0);
-		cp_always_on_counter = REG_A8XX_CP_ALWAYS_ON_COUNTER;
+		cp_always_on_context = REG_A8XX_CP_ALWAYS_ON_CONTEXT;
 	} else {
 		rbbm_perfctr_cp0 = REG_A7XX_RBBM_PERFCTR_CP(0);
-		cp_always_on_counter = REG_A6XX_CP_ALWAYS_ON_COUNTER;
+		cp_always_on_context = REG_A6XX_CP_ALWAYS_ON_CONTEXT;
 	}
 
 	get_stats_counter(ring, rbbm_perfctr_cp0, rbmemptr_stats(ring, index, cpcycles_start));
-	get_stats_counter(ring, cp_always_on_counter, rbmemptr_stats(ring, index, alwayson_start));
+	get_stats_counter(ring, cp_always_on_context, rbmemptr_stats(ring, index, alwayson_start));
 
 	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
 	OUT_RING(ring, CP_SET_THREAD_BOTH);
@@ -535,7 +535,7 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	}
 
 	get_stats_counter(ring, rbbm_perfctr_cp0, rbmemptr_stats(ring, index, cpcycles_end));
-	get_stats_counter(ring, cp_always_on_counter, rbmemptr_stats(ring, index, alwayson_end));
+	get_stats_counter(ring, cp_always_on_context, rbmemptr_stats(ring, index, alwayson_end));
 
 	/* Write the fence to the scratch register */
 	if (adreno_is_a8xx(adreno_gpu)) {

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 04/17] drm/msm/a6xx: Correct OOB usage
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (2 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 05/17] drm/msm/adreno: Implement gx_is_on() for A8x Akhil P Oommen
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

During the GMU resume sequence, using an OOB other than OOB_GPU may
confuse the internal state of the GMU firmware. To align more strictly with
the downstream sequence, move the sysprof related OOB setup to after
OOB_GPU is cleared.

Fixes: 62cd0fa6990b ("drm/msm/adreno: Disable IFPC when sysprof is active")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 5 -----
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 ++++++
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c | 6 ++++++
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 9662201cd2e9..690d3e53e273 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1236,11 +1236,6 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	/* Set the GPU to the current freq */
 	a6xx_gmu_set_initial_freq(gpu, gmu);
 
-	if (refcount_read(&gpu->sysprof_active) > 1) {
-		ret = a6xx_gmu_set_oob(gmu, GMU_OOB_PERFCOUNTER_SET);
-		if (!ret)
-			set_bit(GMU_STATUS_OOB_PERF_SET, &gmu->status);
-	}
 out:
 	/* On failure, shut down the GMU to leave it in a good state */
 	if (ret) {
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index f8cb8b578661..f4b7fc28b677 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1605,6 +1605,12 @@ static int hw_init(struct msm_gpu *gpu)
 		a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_BOOT_SLUMBER);
 	}
 
+	if (!ret && (refcount_read(&gpu->sysprof_active) > 1)) {
+		ret = a6xx_gmu_set_oob(gmu, GMU_OOB_PERFCOUNTER_SET);
+		if (!ret)
+			set_bit(GMU_STATUS_OOB_PERF_SET, &gmu->status);
+	}
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
index 840af9c4d718..fafeac62aebf 100644
--- a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
@@ -711,6 +711,12 @@ static int hw_init(struct msm_gpu *gpu)
 	 */
 	a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
 
+	if (!ret && (refcount_read(&gpu->sysprof_active) > 1)) {
+		ret = a6xx_gmu_set_oob(gmu, GMU_OOB_PERFCOUNTER_SET);
+		if (!ret)
+			set_bit(GMU_STATUS_OOB_PERF_SET, &gmu->status);
+	}
+
 	return ret;
 }
 

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 05/17] drm/msm/adreno: Implement gx_is_on() for A8x
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (3 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 04/17] drm/msm/a6xx: Correct OOB usage Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 06/17] drm/msm/a6xx: Fix gpu init from secure world Akhil P Oommen
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen,
	Konrad Dybcio

A8x has diverged enough to warrant a separate implementation of the
gx_is_on() check. Add that and move these callbacks to the adreno func table.

Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c       | 42 ++++++++++++++++++++++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h       |  5 +++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       |  6 ++++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  4 +--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |  1 +
 5 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 690d3e53e273..b41dbca1ebc6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -91,10 +91,10 @@ bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu)
 }
 
 /* Check to see if the GX rail is still powered */
-bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
+bool a6xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu)
 {
-	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
-	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
 	u32 val;
 
 	/* This can be called from gpu state code so make sure GMU is valid */
@@ -117,6 +117,40 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
 		A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
 }
 
+bool a7xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu)
+{
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+	u32 val;
+
+	/* This can be called from gpu state code so make sure GMU is valid */
+	if (!gmu->initialized)
+		return false;
+
+	val = gmu_read(gmu, REG_A6XX_GMU_SPTPRAC_PWR_CLK_STATUS);
+
+	return !(val &
+		(A7XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_GDSC_POWER_OFF |
+		A7XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
+}
+
+bool a8xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu)
+{
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+	u32 val;
+
+	/* This can be called from gpu state code so make sure GMU is valid */
+	if (!gmu->initialized)
+		return false;
+
+	val = gmu_read(gmu, REG_A8XX_GMU_PWR_CLK_STATUS);
+
+	return !(val &
+		(A8XX_GMU_PWR_CLK_STATUS_GX_HM_GDSC_POWER_OFF |
+		 A8XX_GMU_PWR_CLK_STATUS_GX_HM_CLK_OFF));
+}
+
 void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
 		       bool suspended)
 {
@@ -240,7 +274,7 @@ static bool a6xx_gmu_check_idle_level(struct a6xx_gmu *gmu)
 
 	if (val == local) {
 		if (gmu->idle_level != GMU_IDLE_STATE_IFPC ||
-			!a6xx_gmu_gx_is_on(gmu))
+			!adreno_gpu->funcs->gx_is_on(adreno_gpu))
 			return true;
 	}
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 2af074c8e8cf..9f09daf45ab2 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -10,6 +10,7 @@
 #include <linux/notifier.h>
 #include <linux/soc/qcom/qcom_aoss.h>
 #include "msm_drv.h"
+#include "adreno_gpu.h"
 #include "a6xx_hfi.h"
 
 struct a6xx_gmu_bo {
@@ -231,7 +232,9 @@ void a6xx_hfi_stop(struct a6xx_gmu *gmu);
 int a6xx_hfi_send_prep_slumber(struct a6xx_gmu *gmu);
 int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, u32 perf_index, u32 bw_index);
 
-bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
+bool a6xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu);
+bool a7xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu);
+bool a8xx_gmu_gx_is_on(struct adreno_gpu *adreno_gpu);
 bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
 void a6xx_sptprac_disable(struct a6xx_gmu *gmu);
 int a6xx_sptprac_enable(struct a6xx_gmu *gmu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index f4b7fc28b677..8718919c7e19 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1643,7 +1643,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
 
 	adreno_dump_info(gpu);
 
-	if (a6xx_gmu_gx_is_on(&a6xx_gpu->gmu)) {
+	if (adreno_gpu->funcs->gx_is_on(adreno_gpu)) {
 		/* Sometimes crashstate capture is skipped, so SQE should be halted here again */
 		gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 3);
 
@@ -2762,6 +2762,7 @@ const struct adreno_gpu_funcs a6xx_gpu_funcs = {
 	.get_timestamp = a6xx_gmu_get_timestamp,
 	.bus_halt = a6xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a6xx_fault_handler,
+	.gx_is_on = a6xx_gmu_gx_is_on,
 };
 
 const struct adreno_gpu_funcs a6xx_gmuwrapper_funcs = {
@@ -2794,6 +2795,7 @@ const struct adreno_gpu_funcs a6xx_gmuwrapper_funcs = {
 	.get_timestamp = a6xx_get_timestamp,
 	.bus_halt = a6xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a6xx_fault_handler,
+	.gx_is_on = a6xx_gmu_gx_is_on,
 };
 
 const struct adreno_gpu_funcs a7xx_gpu_funcs = {
@@ -2828,6 +2830,7 @@ const struct adreno_gpu_funcs a7xx_gpu_funcs = {
 	.get_timestamp = a6xx_gmu_get_timestamp,
 	.bus_halt = a6xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a6xx_fault_handler,
+	.gx_is_on = a7xx_gmu_gx_is_on,
 };
 
 const struct adreno_gpu_funcs a8xx_gpu_funcs = {
@@ -2855,4 +2858,5 @@ const struct adreno_gpu_funcs a8xx_gpu_funcs = {
 	.get_timestamp = a8xx_gmu_get_timestamp,
 	.bus_halt = a8xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a8xx_fault_handler,
+	.gx_is_on = a8xx_gmu_gx_is_on,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index d2d6b2fd3cba..7bec4e509d2c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -1251,7 +1251,7 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu,
 		_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gpucc_reg,
 			&a6xx_state->gmu_registers[2], false);
 
-	if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
+	if (!adreno_gpu->funcs->gx_is_on(adreno_gpu))
 		return;
 
 	/* Set the fence to ALLOW mode so we can access the registers */
@@ -1608,7 +1608,7 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
 	}
 
 	/* If GX isn't on the rest of the data isn't going to be accessible */
-	if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
+	if (!adreno_gpu->funcs->gx_is_on(adreno_gpu))
 		return &a6xx_state->base;
 
 	/* Halt SQE first */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index c08725ed54c4..29097e6b4253 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -82,6 +82,7 @@ struct adreno_gpu_funcs {
 	u64 (*get_timestamp)(struct msm_gpu *gpu);
 	void (*bus_halt)(struct adreno_gpu *adreno_gpu, bool gx_off);
 	int (*mmu_fault_handler)(void *arg, unsigned long iova, int flags, void *data);
+	bool (*gx_is_on)(struct adreno_gpu *adreno_gpu);
 };
 
 struct adreno_reglist {

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 06/17] drm/msm/a6xx: Fix gpu init from secure world
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (4 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 05/17] drm/msm/adreno: Implement gx_is_on() for A8x Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q Akhil P Oommen
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

A7XX_GEN2 and newer GPUs require initialization of a few feature/power
related configurations from the secure world. The SCM call which does
this must be made after the GDSC and clocks are enabled, so move this
sequence to a6xx_gmu_resume() instead of probe.

Also, simplify the error handling in a6xx_gmu_resume() using 'goto'
labels.
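
For readers less familiar with the idiom, the unwind structure this
patch converts a6xx_gmu_resume() to can be sketched in isolation
(illustrative userspace code, not the driver's actual functions: each
failure point jumps to the label that undoes only what was already set
up, in reverse order):

```c
#include <assert.h>
#include <stdbool.h>

static bool clk_on, rpm_ref;

static int fake_resume(bool clk_fails, bool init_fails)
{
	int ret = 0;

	rpm_ref = true;			/* stands in for pm_runtime_get() */

	if (clk_fails) {
		ret = -1;
		goto rpm_put;
	}
	clk_on = true;			/* stands in for clk_bulk_prepare_enable() */

	if (init_fails) {
		ret = -1;
		goto disable_clk;
	}

	return 0;

disable_clk:
	clk_on = false;			/* clk_bulk_disable_unprepare() */
rpm_put:
	rpm_ref = false;		/* pm_runtime_put() */
	return ret;
}
```

The point of the pattern is that every error path releases exactly the
resources acquired before the failure, without nested if/else cleanup.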

Fixes: 14b27d5df3ea ("drm/msm/a7xx: Initialize a750 "software fuse"")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 93 +++++++++++++++++++++++++++++------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 59 ----------------------
 3 files changed, 80 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index b41dbca1ebc6..1b44b9e21ad8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -3,6 +3,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/clk.h>
+#include <linux/firmware/qcom/qcom_scm.h>
 #include <linux/interconnect.h>
 #include <linux/of_platform.h>
 #include <linux/platform_device.h>
@@ -1191,6 +1192,65 @@ static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
 	dev_pm_opp_put(gpu_opp);
 }
 
+static int a6xx_gmu_secure_init(struct a6xx_gpu *a6xx_gpu)
+{
+	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+	struct msm_gpu *gpu = &adreno_gpu->base;
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+	u32 fuse_val;
+	int ret;
+
+	if (test_bit(GMU_STATUS_SECURE_INIT, &gmu->status))
+		return 0;
+
+	if (adreno_is_a750(adreno_gpu) || adreno_is_a8xx(adreno_gpu)) {
+		/*
+		 * Assume that if qcom scm isn't available, that whatever
+		 * replacement allows writing the fuse register ourselves.
+		 * Users of alternative firmware need to make sure this
+		 * register is writeable or indicate that it's not somehow.
+		 * Print a warning because if you mess this up you're about to
+		 * crash horribly.
+		 */
+		if (!qcom_scm_is_available()) {
+			dev_warn_once(gpu->dev->dev,
+				"SCM is not available, poking fuse register\n");
+			a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
+				A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
+				A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
+				A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
+			adreno_gpu->has_ray_tracing = true;
+			goto done;
+		}
+
+		ret = qcom_scm_gpu_init_regs(QCOM_SCM_GPU_ALWAYS_EN_REQ |
+					     QCOM_SCM_GPU_TSENSE_EN_REQ);
+		if (ret) {
+			dev_warn_once(gpu->dev->dev,
+				"SCM call failed\n");
+			return ret;
+		}
+
+		/*
+		 * On A7XX_GEN3 and newer, raytracing may be disabled by the
+		 * firmware, find out whether that's the case. The scm call
+		 * above sets the fuse register.
+		 */
+		fuse_val = a6xx_llc_read(a6xx_gpu,
+					 REG_A7XX_CX_MISC_SW_FUSE_VALUE);
+		adreno_gpu->has_ray_tracing =
+			!!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
+	} else if (adreno_is_a740(adreno_gpu)) {
+		/* Raytracing is always enabled on a740 */
+		adreno_gpu->has_ray_tracing = true;
+	}
+
+done:
+	set_bit(GMU_STATUS_SECURE_INIT, &gmu->status);
+	return 0;
+}
+
+
 int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 {
 	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
@@ -1219,11 +1279,12 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	clk_set_rate(gmu->hub_clk, adreno_is_a740_family(adreno_gpu) ?
 		     200000000 : 150000000);
 	ret = clk_bulk_prepare_enable(gmu->nr_clocks, gmu->clocks);
-	if (ret) {
-		pm_runtime_put(gmu->gxpd);
-		pm_runtime_put(gmu->dev);
-		return ret;
-	}
+	if (ret)
+		goto rpm_put;
+
+	ret = a6xx_gmu_secure_init(a6xx_gpu);
+	if (ret)
+		goto disable_clk;
 
 	/* Read the slice info on A8x GPUs */
 	a8xx_gpu_get_slice_info(gpu);
@@ -1253,11 +1314,11 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 
 	ret = a6xx_gmu_fw_start(gmu, status);
 	if (ret)
-		goto out;
+		goto disable_irq;
 
 	ret = a6xx_hfi_start(gmu, status);
 	if (ret)
-		goto out;
+		goto disable_irq;
 
 	/*
 	 * Turn on the GMU firmware fault interrupt after we know the boot
@@ -1270,14 +1331,16 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 	/* Set the GPU to the current freq */
 	a6xx_gmu_set_initial_freq(gpu, gmu);
 
-out:
-	/* On failure, shut down the GMU to leave it in a good state */
-	if (ret) {
-		disable_irq(gmu->gmu_irq);
-		a6xx_rpmh_stop(gmu);
-		pm_runtime_put(gmu->gxpd);
-		pm_runtime_put(gmu->dev);
-	}
+	return 0;
+
+disable_irq:
+	disable_irq(gmu->gmu_irq);
+	a6xx_rpmh_stop(gmu);
+disable_clk:
+	clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
+rpm_put:
+	pm_runtime_put(gmu->gxpd);
+	pm_runtime_put(gmu->dev);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 9f09daf45ab2..0cd8ae1b4f5c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -130,6 +130,8 @@ struct a6xx_gmu {
 #define GMU_STATUS_PDC_SLEEP	1
 /* To track Perfcounter OOB set status */
 #define GMU_STATUS_OOB_PERF_SET 2
+/* To track whether secure world init was done */
+#define GMU_STATUS_SECURE_INIT	3
 	unsigned long status;
 };
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 8718919c7e19..5cddfc03828f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,7 +10,6 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
-#include <linux/firmware/qcom/qcom_scm.h>
 #include <linux/pm_domain.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
@@ -2160,56 +2159,6 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
 		a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
 }
 
-static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
-{
-	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
-	struct msm_gpu *gpu = &adreno_gpu->base;
-	u32 fuse_val;
-	int ret;
-
-	if (adreno_is_a750(adreno_gpu) || adreno_is_a8xx(adreno_gpu)) {
-		/*
-		 * Assume that if qcom scm isn't available, that whatever
-		 * replacement allows writing the fuse register ourselves.
-		 * Users of alternative firmware need to make sure this
-		 * register is writeable or indicate that it's not somehow.
-		 * Print a warning because if you mess this up you're about to
-		 * crash horribly.
-		 */
-		if (!qcom_scm_is_available()) {
-			dev_warn_once(gpu->dev->dev,
-				"SCM is not available, poking fuse register\n");
-			a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
-				A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
-				A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
-				A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
-			adreno_gpu->has_ray_tracing = true;
-			return 0;
-		}
-
-		ret = qcom_scm_gpu_init_regs(QCOM_SCM_GPU_ALWAYS_EN_REQ |
-					     QCOM_SCM_GPU_TSENSE_EN_REQ);
-		if (ret)
-			return ret;
-
-		/*
-		 * On A7XX_GEN3 and newer, raytracing may be disabled by the
-		 * firmware, find out whether that's the case. The scm call
-		 * above sets the fuse register.
-		 */
-		fuse_val = a6xx_llc_read(a6xx_gpu,
-					 REG_A7XX_CX_MISC_SW_FUSE_VALUE);
-		adreno_gpu->has_ray_tracing =
-			!!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
-	} else if (adreno_is_a740(adreno_gpu)) {
-		/* Raytracing is always enabled on a740 */
-		adreno_gpu->has_ray_tracing = true;
-	}
-
-	return 0;
-}
-
-
 #define GBIF_CLIENT_HALT_MASK		BIT(0)
 #define GBIF_ARB_HALT_MASK		BIT(1)
 #define VBIF_XIN_HALT_CTRL0_MASK	GENMASK(3, 0)
@@ -2705,14 +2654,6 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 		return ERR_PTR(ret);
 	}
 
-	if (adreno_is_a7xx(adreno_gpu) || adreno_is_a8xx(adreno_gpu)) {
-		ret = a7xx_cx_mem_init(a6xx_gpu);
-		if (ret) {
-			a6xx_destroy(&(a6xx_gpu->base.base));
-			return ERR_PTR(ret);
-		}
-	}
-
 	adreno_gpu->uche_trap_base = 0x1fffffffff000ull;
 
 	msm_mmu_set_fault_handler(to_msm_vm(gpu->vm)->mmu, gpu,

-- 
2.51.0



* [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (5 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 06/17] drm/msm/a6xx: Fix gpu init from secure world Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27 10:34   ` Konrad Dybcio
  2026-03-27  0:13 ` [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures Akhil P Oommen
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Add the Debug HFI queue, which contains the F2H messages posted by the
GMU firmware. Having this data in the coredump is useful for debugging
firmware issues.
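
As background, an HFI queue is a simple ring of dwords shared with the
GMU: the firmware advances write_index as it posts messages and the
host consumes from read_index, both wrapping at the queue size. A
minimal sketch of that producer/consumer scheme (illustrative names and
a tiny queue size, not the driver's actual a6xx_hfi_queue_read()):

```c
#include <assert.h>
#include <stdint.h>

#define Q_SIZE 8 /* dwords; the real queues are 4 KiB each */

struct fake_queue {
	uint32_t data[Q_SIZE];
	uint32_t read_index, write_index;
};

/* GMU side: post one dword and advance the producer index */
static void queue_push(struct fake_queue *q, uint32_t val)
{
	q->data[q->write_index] = val;
	q->write_index = (q->write_index + 1) % Q_SIZE;
}

/* Host side: consume one dword, or report the queue empty */
static int queue_pop(struct fake_queue *q, uint32_t *out)
{
	if (q->read_index == q->write_index)
		return 0; /* empty */
	*out = q->data[q->read_index];
	q->read_index = (q->read_index + 1) % Q_SIZE;
	return 1;
}

/* Push two messages, then verify FIFO order and final emptiness */
static int demo(void)
{
	struct fake_queue q = {0};
	uint32_t v;

	queue_push(&q, 0xf2);
	queue_push(&q, 0xab);
	if (!(queue_pop(&q, &v) && v == 0xf2))
		return 0;
	if (!(queue_pop(&q, &v) && v == 0xab))
		return 0;
	return queue_pop(&q, &v) == 0;
}
```

The debug queue added here is read-only from the host's perspective;
the coredump simply snapshots its contents and indices.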

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h       |  2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  2 +-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c       | 16 +++++++++++++---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.h       |  2 ++
 4 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 0cd8ae1b4f5c..3f96b10b5f61 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -111,7 +111,7 @@ struct a6xx_gmu {
 
 	unsigned long freq;
 
-	struct a6xx_hfi_queue queues[2];
+	struct a6xx_hfi_queue queues[HFI_MAX_QUEUES];
 
 	bool initialized;
 	bool hung;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 7bec4e509d2c..c0b9661131e8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -57,7 +57,7 @@ struct a6xx_gpu_state {
 	struct msm_gpu_state_bo *gmu_hfi;
 	struct msm_gpu_state_bo *gmu_debug;
 
-	s32 hfi_queue_history[2][HFI_HISTORY_SZ];
+	s32 hfi_queue_history[HFI_MAX_QUEUES][HFI_HISTORY_SZ];
 
 	struct list_head objs;
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index 4f5dbf46132b..09b6bc464b47 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -1062,8 +1062,8 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu)
 	struct a6xx_gmu_bo *hfi = &gmu->hfi;
 	struct a6xx_hfi_queue_table_header *table = hfi->virt;
 	struct a6xx_hfi_queue_header *headers = hfi->virt + sizeof(*table);
+	int table_size, idx;
 	u64 offset;
-	int table_size;
 
 	/*
 	 * The table size is the size of the table header plus all of the queue
@@ -1082,12 +1082,22 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu)
 	table->active_queues = ARRAY_SIZE(gmu->queues);
 
 	/* Command queue */
+	idx = 0;
 	offset = SZ_4K;
-	a6xx_hfi_queue_init(&gmu->queues[0], &headers[0], hfi->virt + offset,
+	a6xx_hfi_queue_init(&gmu->queues[idx], &headers[idx], hfi->virt + offset,
 		hfi->iova + offset, 0);
 
 	/* GMU response queue */
+	idx++;
 	offset += SZ_4K;
-	a6xx_hfi_queue_init(&gmu->queues[1], &headers[1], hfi->virt + offset,
+	a6xx_hfi_queue_init(&gmu->queues[idx], &headers[idx], hfi->virt + offset,
 		hfi->iova + offset, gmu->legacy ? 4 : 1);
+
+	/* GMU Debug queue */
+	idx++;
+	offset += SZ_4K;
+	a6xx_hfi_queue_init(&gmu->queues[idx], &headers[idx], hfi->virt + offset,
+		hfi->iova + offset, gmu->legacy ? 5 : 2);
+
+	WARN_ON(idx >= HFI_MAX_QUEUES);
 }
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
index 6f9f74a0bc85..19f6eca2c8c9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
@@ -4,6 +4,8 @@
 #ifndef _A6XX_HFI_H_
 #define _A6XX_HFI_H_
 
+#define HFI_MAX_QUEUES 3
+
 struct a6xx_hfi_queue_table_header {
 	u32 version;
 	u32 size;		/* Size of the queue table in dwords */

-- 
2.51.0



* [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (6 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27 11:35   ` Konrad Dybcio
  2026-03-30 19:51   ` Rob Clark
  2026-03-27  0:13 ` [PATCH v2 09/17] drm/msm/a6xx: Use packed structs for HFI Akhil P Oommen
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Capture a coredump on GPU or GMU errors during initialization to help
debug such issues. To be consistent with the locking around
msm_gpu_crashstate_capture(), always call pm_runtime_get() with
msm_gpu->lock held.
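
The guard(mutex)(&gpu->lock) used in the adreno_load_gpu() hunk below
is the kernel's scope-based locking, built on __attribute__((cleanup));
a rough userspace analogue, with a plain flag standing in for the mutex
(purely illustrative, not the kernel's implementation):

```c
#include <assert.h>

static int locked;

/* cleanup handler: receives a pointer to the guarded variable */
static void unlock_cleanup(int **l)
{
	**l = 0;
}

/* "acquire" on declaration; release runs automatically at scope exit */
#define guard_lock(l) \
	int *_g __attribute__((cleanup(unlock_cleanup))) = (*(l) = 1, (l))

static int critical_section(void)
{
	guard_lock(&locked);
	return locked;	/* still held here; released after return */
}
```

The upshot for this patch is that every exit path from
adreno_load_gpu() drops gpu->lock without explicit mutex_unlock()
calls, which keeps the new error paths simple.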

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 1 +
 drivers/gpu/drm/msm/adreno/adreno_device.c | 5 +++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 5 ++++-
 drivers/gpu/drm/msm/msm_gpu.c              | 5 +++--
 drivers/gpu/drm/msm/msm_gpu.h              | 2 ++
 5 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 1b44b9e21ad8..916c5d99c4d1 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1335,6 +1335,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 
 disable_irq:
 	disable_irq(gmu->gmu_irq);
+	msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
 	a6xx_rpmh_stop(gmu);
 disable_clk:
 	clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 4edfe80c5be7..ca5f96e16870 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -105,6 +105,8 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
 	 */
 	pm_runtime_enable(&pdev->dev);
 
+	guard(mutex)(&gpu->lock);
+
 	ret = pm_runtime_get_sync(&pdev->dev);
 	if (ret < 0) {
 		pm_runtime_put_noidle(&pdev->dev);
@@ -112,10 +114,9 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
 		goto err_disable_rpm;
 	}
 
-	mutex_lock(&gpu->lock);
 	ret = msm_gpu_hw_init(gpu);
-	mutex_unlock(&gpu->lock);
 	if (ret) {
+		msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
 		DRM_DEV_ERROR(dev->dev, "gpu hw init failed: %d\n", ret);
 		goto err_put_rpm;
 	}
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 785e99fb5bd5..8475802fdde2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -391,10 +391,13 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
 		return 0;
 	case MSM_PARAM_TIMESTAMP:
 		if (adreno_gpu->funcs->get_timestamp) {
+			mutex_lock(&gpu->lock);
 			pm_runtime_get_sync(&gpu->pdev->dev);
+
 			*value = adreno_gpu->funcs->get_timestamp(gpu);
-			pm_runtime_put_autosuspend(&gpu->pdev->dev);
 
+			pm_runtime_put_autosuspend(&gpu->pdev->dev);
+			mutex_unlock(&gpu->lock);
 			return 0;
 		}
 		return -EINVAL;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 995549d0bbbc..472db2c916f9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -361,7 +361,7 @@ static void crashstate_get_vm_logs(struct msm_gpu_state *state, struct msm_gem_v
 	mutex_unlock(&vm->mmu_lock);
 }
 
-static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
+void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
 		struct msm_gem_submit *submit, struct msm_gpu_fault_info *fault_info,
 		char *comm, char *cmd)
 {
@@ -886,7 +886,8 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 
 	pm_runtime_get_sync(&gpu->pdev->dev);
 
-	msm_gpu_hw_init(gpu);
+	if (msm_gpu_hw_init(gpu))
+		msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
 
 	submit->seqno = submit->hw_fence->seqno;
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 666cf499b7ec..eb5b3a7b81f9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -747,6 +747,8 @@ static inline void msm_gpu_crashstate_put(struct msm_gpu *gpu)
 }
 
 void msm_gpu_fault_crashstate_capture(struct msm_gpu *gpu, struct msm_gpu_fault_info *fault_info);
+void msm_gpu_crashstate_capture(struct msm_gpu *gpu, struct msm_gem_submit *submit,
+		struct msm_gpu_fault_info *fault_info, char *comm, char *cmd);
 
 /*
  * Simple macro to semi-cleanly add the MAP_PRIV flag for targets that can

-- 
2.51.0



* [PATCH v2 09/17] drm/msm/a6xx: Use packed structs for HFI
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (7 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-27  0:13 ` [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions Akhil P Oommen
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen,
	Konrad Dybcio

The HFI-related structs define the ABI between the KMD and the GMU
firmware, so use packed structures to avoid unintended compiler-inserted
padding.
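
To illustrate the failure mode being guarded against: a mixed-width
struct normally gets compiler-inserted padding, so the layout the
firmware expects and the one the compiler emits can differ. The HFI
structs here are mostly u32-only, so this is largely future-proofing,
but the effect is easy to demonstrate (illustrative structs, using the
attribute the kernel's __packed macro expands to):

```c
#include <assert.h>
#include <stdint.h>

/* Without packing, 3 bytes of padding are inserted after 'tag' so that
 * 'payload' lands on its natural 4-byte alignment. */
struct hfi_plain  { uint8_t tag; uint32_t payload; };

/* With packing, the fields are laid out back to back. */
struct hfi_packed { uint8_t tag; uint32_t payload; } __attribute__((__packed__));
```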

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.h | 40 +++++++++++++++++------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
index 19f6eca2c8c9..217708b03f6f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
@@ -13,7 +13,7 @@ struct a6xx_hfi_queue_table_header {
 	u32 qhdr_size;		/* Size of the queue headers */
 	u32 num_queues;		/* Number of total queues */
 	u32 active_queues;	/* Number of active queues */
-};
+} __packed;
 
 struct a6xx_hfi_queue_header {
 	u32 status;
@@ -28,7 +28,7 @@ struct a6xx_hfi_queue_header {
 	u32 tx_request;
 	u32 read_index;
 	u32 write_index;
-};
+} __packed;
 
 struct a6xx_hfi_queue {
 	struct a6xx_hfi_queue_header *header;
@@ -74,7 +74,7 @@ struct a6xx_hfi_msg_response {
 	u32 ret_header;
 	u32 error;
 	u32 payload[HFI_RESPONSE_PAYLOAD_SIZE];
-};
+} __packed;
 
 #define HFI_F2H_MSG_ERROR 100
 
@@ -82,7 +82,7 @@ struct a6xx_hfi_msg_error {
 	u32 header;
 	u32 code;
 	u32 payload[2];
-};
+} __packed;
 
 #define HFI_H2F_MSG_INIT 0
 
@@ -92,27 +92,27 @@ struct a6xx_hfi_msg_gmu_init_cmd {
 	u32 dbg_buffer_addr;
 	u32 dbg_buffer_size;
 	u32 boot_state;
-};
+} __packed;
 
 #define HFI_H2F_MSG_FW_VERSION 1
 
 struct a6xx_hfi_msg_fw_version {
 	u32 header;
 	u32 supported_version;
-};
+} __packed;
 
 #define HFI_H2F_MSG_PERF_TABLE 4
 
 struct perf_level {
 	u32 vote;
 	u32 freq;
-};
+} __packed;
 
 struct perf_gx_level {
 	u32 vote;
 	u32 acd;
 	u32 freq;
-};
+} __packed;
 
 struct a6xx_hfi_msg_perf_table_v1 {
 	u32 header;
@@ -121,7 +121,7 @@ struct a6xx_hfi_msg_perf_table_v1 {
 
 	struct perf_level gx_votes[16];
 	struct perf_level cx_votes[4];
-};
+} __packed;
 
 struct a6xx_hfi_msg_perf_table {
 	u32 header;
@@ -130,7 +130,7 @@ struct a6xx_hfi_msg_perf_table {
 
 	struct perf_gx_level gx_votes[16];
 	struct perf_level cx_votes[4];
-};
+} __packed;
 
 #define HFI_H2F_MSG_BW_TABLE 3
 
@@ -145,13 +145,13 @@ struct a6xx_hfi_msg_bw_table {
 	u32 cnoc_cmds_data[2][6];
 	u32 ddr_cmds_addrs[8];
 	u32 ddr_cmds_data[16][8];
-};
+} __packed;
 
 #define HFI_H2F_MSG_TEST 5
 
 struct a6xx_hfi_msg_test {
 	u32 header;
-};
+} __packed;
 
 #define HFI_H2F_MSG_ACD 7
 #define MAX_ACD_STRIDE 2
@@ -163,13 +163,13 @@ struct a6xx_hfi_acd_table {
 	u32 stride;
 	u32 num_levels;
 	u32 data[16 * MAX_ACD_STRIDE];
-};
+} __packed;
 
 #define HFI_H2F_MSG_START 10
 
 struct a6xx_hfi_msg_start {
 	u32 header;
-};
+} __packed;
 
 #define HFI_H2F_FEATURE_CTRL 11
 
@@ -178,14 +178,14 @@ struct a6xx_hfi_msg_feature_ctrl {
 	u32 feature;
 	u32 enable;
 	u32 data;
-};
+} __packed;
 
 #define HFI_H2F_MSG_CORE_FW_START 14
 
 struct a6xx_hfi_msg_core_fw_start {
 	u32 header;
 	u32 handle;
-};
+} __packed;
 
 #define HFI_H2F_MSG_TABLE 15
 
@@ -193,7 +193,7 @@ struct a6xx_hfi_table_entry {
 	u32 count;
 	u32 stride;
 	u32 data[];
-};
+} __packed;
 
 struct a6xx_hfi_table {
 	u32 header;
@@ -202,7 +202,7 @@ struct a6xx_hfi_table {
 #define HFI_TABLE_BW_VOTE 0
 #define HFI_TABLE_GPU_PERF 1
 	struct a6xx_hfi_table_entry entry[];
-};
+} __packed;
 
 #define HFI_H2F_MSG_GX_BW_PERF_VOTE 30
 
@@ -211,7 +211,7 @@ struct a6xx_hfi_gx_bw_perf_vote_cmd {
 	u32 ack_type;
 	u32 freq;
 	u32 bw;
-};
+} __packed;
 
 #define AB_VOTE_MASK		GENMASK(31, 16)
 #define MAX_AB_VOTE		(FIELD_MAX(AB_VOTE_MASK) - 1)
@@ -224,6 +224,6 @@ struct a6xx_hfi_prep_slumber_cmd {
 	u32 header;
 	u32 bw;
 	u32 freq;
-};
+} __packed;
 
 #endif

-- 
2.51.0



* [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (8 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 09/17] drm/msm/a6xx: Use packed structs for HFI Akhil P Oommen
@ 2026-03-27  0:13 ` Akhil P Oommen
  2026-03-30 11:15   ` Konrad Dybcio
  2026-03-27  0:14 ` [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840 Akhil P Oommen
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:13 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Update the HFI definitions to support additional GMU-based power
features.
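
For reference, the CLX_DATA() macro introduced by this patch packs its
four parameters into the upper bits of a6xx_hfi_clx_domain_v2::data
(bits 0:15 carry the migration time and are filled in separately);
reproduced standalone here with explicit casts for a quick sanity
check:

```c
#include <assert.h>
#include <stdint.h>

/* BITS[16:21] current rating, BITS[22:27] phases,
 * BIT[28] path notification, BITS[29:31] extra features */
#define CLX_DATA(irated, num_phases, clx_path, extd_intf) \
	(((uint32_t)(extd_intf) << 29) |	\
	 ((uint32_t)(clx_path) << 28) |		\
	 ((uint32_t)(num_phases) << 22) |	\
	 ((uint32_t)(irated) << 16))
```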

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c |   3 -
 drivers/gpu/drm/msm/adreno/a6xx_hfi.h | 113 +++++++++++++++++++++++++++++++++-
 2 files changed, 111 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index 09b6bc464b47..487c2736f2b3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -851,7 +851,6 @@ static int a6xx_hfi_feature_ctrl_msg(struct a6xx_gmu *gmu, u32 feature, u32 enab
 	return a6xx_hfi_send_msg(gmu, HFI_H2F_FEATURE_CTRL, &msg, sizeof(msg), NULL, 0);
 }
 
-#define HFI_FEATURE_IFPC 9
 #define IFPC_LONG_HYST 0x1680
 
 static int a6xx_hfi_enable_ifpc(struct a6xx_gmu *gmu)
@@ -862,8 +861,6 @@ static int a6xx_hfi_enable_ifpc(struct a6xx_gmu *gmu)
 	return a6xx_hfi_feature_ctrl_msg(gmu, HFI_FEATURE_IFPC, 1, IFPC_LONG_HYST);
 }
 
-#define HFI_FEATURE_ACD 12
-
 static int a6xx_hfi_enable_acd(struct a6xx_gmu *gmu)
 {
 	struct a6xx_hfi_acd_table *acd_table = &gmu->acd_table;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
index 217708b03f6f..e10d32ce93e0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.h
@@ -165,6 +165,42 @@ struct a6xx_hfi_acd_table {
 	u32 data[16 * MAX_ACD_STRIDE];
 } __packed;
 
+#define CLX_DATA(irated, num_phases, clx_path, extd_intf) \
+	((extd_intf << 29) |				  \
+	 (clx_path << 28) |				  \
+	 (num_phases << 22) |				  \
+	 (irated << 16))
+
+struct a6xx_hfi_clx_domain_v2 {
+	/**
+	 * @data: BITS[0:15]  Migration time
+	 *        BITS[16:21] Current rating
+	 *        BITS[22:27] Phases for domain
+	 *        BITS[28:28] Path notification
+	 *        BITS[29:31] Extra features
+	 */
+	u32 data;
+	/** @clxt: CLX time in microseconds */
+	u32 clxt;
+	/** @clxh: CLH time in microseconds */
+	u32 clxh;
+	/** @urg_mode: Urgent HW throttle mode of operation */
+	u32 urg_mode;
+	/** @lkg_en: Enable leakage current estimate */
+	u32 lkg_en;
+	/** curr_budget: Current Budget */
+	u32 curr_budget;
+} __packed;
+
+#define HFI_H2F_MSG_CLX_TBL 8
+
+#define MAX_CLX_DOMAINS 2
+struct a6xx_hfi_clx_table_v2_cmd {
+	u32 hdr;
+	u32 version;
+	struct a6xx_hfi_clx_domain_v2 domain[MAX_CLX_DOMAINS];
+} __packed;
+
 #define HFI_H2F_MSG_START 10
 
 struct a6xx_hfi_msg_start {
@@ -176,6 +212,41 @@ struct a6xx_hfi_msg_start {
 struct a6xx_hfi_msg_feature_ctrl {
 	u32 header;
 	u32 feature;
+#define HFI_FEATURE_DCVS		0
+#define HFI_FEATURE_HWSCHED		1
+#define HFI_FEATURE_PREEMPTION		2
+#define HFI_FEATURE_CLOCKS_ON		3
+#define HFI_FEATURE_BUS_ON		4
+#define HFI_FEATURE_RAIL_ON		5
+#define HFI_FEATURE_HWCG		6
+#define HFI_FEATURE_LM			7
+#define HFI_FEATURE_THROTTLE		8
+#define HFI_FEATURE_IFPC		9
+#define HFI_FEATURE_NAP			10
+#define HFI_FEATURE_BCL			11
+#define HFI_FEATURE_ACD			12
+#define HFI_FEATURE_DIDT		13
+#define HFI_FEATURE_DEPRECATED		14
+#define HFI_FEATURE_CB			15
+#define HFI_FEATURE_KPROF		16
+#define HFI_FEATURE_BAIL_OUT_TIMER	17
+#define HFI_FEATURE_GMU_STATS		18
+#define HFI_FEATURE_DBQ			19
+#define HFI_FEATURE_MINBW		20
+#define HFI_FEATURE_CLX			21
+#define HFI_FEATURE_LSR			23
+#define HFI_FEATURE_LPAC		24
+#define HFI_FEATURE_HW_FENCE		25
+#define HFI_FEATURE_PERF_NORETAIN	26
+#define HFI_FEATURE_DMS			27
+#define HFI_FEATURE_THERMAL		28
+#define HFI_FEATURE_AQE			29
+#define HFI_FEATURE_TDCVS		30
+#define HFI_FEATURE_DCE			31
+#define HFI_FEATURE_IFF_PCLX		32
+#define HFI_FEATURE_SOFT_RESET		0x10000001
+#define HFI_FEATURE_DCVS_PROFILE	0x10000002
+#define HFI_FEATURE_FAST_CTX_DESTROY	0x10000003
 	u32 enable;
 	u32 data;
 } __packed;
@@ -199,8 +270,17 @@ struct a6xx_hfi_table {
 	u32 header;
 	u32 version;
 	u32 type;
-#define HFI_TABLE_BW_VOTE 0
-#define HFI_TABLE_GPU_PERF 1
+#define HFI_TABLE_BW_VOTE	0
+#define HFI_TABLE_GPU_PERF	1
+#define HFI_TABLE_DIDT		2
+#define HFI_TABLE_ACD		3
+#define HFI_TABLE_CLX_V1	4 /* Unused */
+#define HFI_TABLE_CLX_V2	5
+#define HFI_TABLE_THERM		6
+#define HFI_TABLE_DCVS		7
+#define HFI_TABLE_SYS_TIME	8
+#define HFI_TABLE_GMU_DCVS	9
+#define HFI_TABLE_LIMITS_MIT	10
 	struct a6xx_hfi_table_entry entry[];
 } __packed;
 
@@ -226,4 +306,33 @@ struct a6xx_hfi_prep_slumber_cmd {
 	u32 freq;
 } __packed;
 
+struct a6xx_hfi_limits_cfg {
+	u32 enable;
+	u32 msg_path;
+	u32 lkg_en;
+	/*
+	 * BIT[0]: 0 = (static) throttle to fixed sid level
+	 *         1 = (dynamic) throttle to sid level calculated by HW
+	 * BIT[1]: 0 = Mx
+	 *         1 = Bx
+	 */
+	u32 mode;
+	u32 sid;
+	/* Mitigation time in microseconds */
+	u32 mit_time;
+	/* Max current in mA during mitigation */
+	u32 curr_limit;
+} __packed;
+
+struct a6xx_hfi_limits_tbl {
+	u8 feature_id;
+#define GMU_MIT_IFF  0
+#define GMU_MIT_PCLX 1
+	u8 domain;
+#define GMU_GX_DOMAIN 0
+#define GMU_MX_DOMAIN 1
+	u16 feature_rev;
+	struct a6xx_hfi_limits_cfg cfg;
+} __packed;
+
 #endif

-- 
2.51.0



* [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (9 preceding siblings ...)
  2026-03-27  0:13 ` [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27 11:14   ` Konrad Dybcio
  2026-03-27  0:14 ` [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support Akhil P Oommen
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Add the SKU table to the catalog for the A840 GPU. This data helps pick
the correct bin from the OPP table based on the speed_bin fuse value.
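
The lookup this table feeds can be sketched as follows: each entry maps
a fuse value to a bin index, and the supported-hardware mask used to
filter the OPP table is BIT(bin), with UINT_MAX returned for an unknown
fuse (mirroring the driver's fuse_to_supp_hw() in spirit; the code
below is an illustrative userspace rendering, not the driver function):

```c
#include <assert.h>
#include <stdint.h>

struct speedbin { uint16_t fuse; uint16_t bin; };

/* The A840 table from this patch */
static const struct speedbin a840_bins[] = {
	{ 0,   0 }, { 273, 1 }, { 252, 2 }, { 221, 3 },
};

static uint32_t fuse_to_supp_hw(uint32_t fuse)
{
	for (unsigned int i = 0; i < sizeof(a840_bins) / sizeof(a840_bins[0]); i++)
		if (a840_bins[i].fuse == fuse)
			return 1u << a840_bins[i].bin;	/* BIT(bin) */

	return 0xffffffffu;	/* unknown fuse: no bin-specific OPPs match */
}
```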

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 38561f26837e..f6b9792531a6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1954,6 +1954,12 @@ static const struct adreno_info a8xx_gpus[] = {
 			},
 		},
 		.preempt_record_size = 19708 * SZ_1K,
+		.speedbins = ADRENO_SPEEDBINS(
+			{ 0,   0 },
+			{ 273, 1 },
+			{ 252, 2 },
+			{ 221, 3 },
+		),
 	}
 };
 

-- 
2.51.0



* [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (10 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840 Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27 11:16   ` Konrad Dybcio
  2026-04-11  2:19   ` Jie Gan
  2026-03-27  0:14 ` [PATCH v2 13/17] drm/msm/a6xx: Add SKU detection support for X2-85 Akhil P Oommen
                   ` (4 subsequent siblings)
  16 siblings, 2 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Recent chipsets like Glymur supports a new mechanism for SKU detection.
A new CX_MISC register exposes the combined (or final) speedbin value
from both HW fuse register and the Soft Fuse register. Implement this new
SKU detection along with a new quirk to identify the GPUs that has soft
fuse support.

This patch has a side effect on A4x and older series: the speedbin
field in MSM_PARAM_CHIPID will be 0 instead of 0xffff. This should be
okay as Mesa handles it correctly; speedbin was not even a thing when
support for those GPUs was added.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |  6 ++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         | 41 ++++++++++++++++++++++-----
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |  5 ----
 drivers/gpu/drm/msm/adreno/adreno_gpu.h       |  1 +
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml |  4 +++
 5 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 79a441e91fa1..d7ed3225f635 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1731,6 +1731,7 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 	struct adreno_gpu *adreno_gpu;
 	struct msm_gpu *gpu;
 	unsigned int nr_rings;
+	u32 speedbin;
 	int ret;
 
 	a5xx_gpu = kzalloc(sizeof(*a5xx_gpu), GFP_KERNEL);
@@ -1757,6 +1758,11 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 		return ERR_PTR(ret);
 	}
 
+	/* Set the speedbin value that is passed to userspace */
+	if (adreno_read_speedbin(&pdev->dev, &speedbin) || !speedbin)
+		speedbin = 0xffff;
+	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
+
 	msm_mmu_set_fault_handler(to_msm_vm(gpu->vm)->mmu, gpu,
 				  a5xx_fault_handler);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 5cddfc03828f..fb9662b946d0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2546,13 +2546,33 @@ static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse)
 	return UINT_MAX;
 }
 
-static int a6xx_set_supported_hw(struct device *dev, const struct adreno_info *info)
+static int a6xx_read_speedbin(struct device *dev, struct a6xx_gpu *a6xx_gpu,
+		const struct adreno_info *info, u32 *speedbin)
+{
+	int ret;
+
+	/* Use speedbin fuse if present. Otherwise, fallback to softfuse */
+	ret = adreno_read_speedbin(dev, speedbin);
+	if (ret != -ENOENT)
+		return ret;
+
+	if (info->quirks & ADRENO_QUIRK_SOFTFUSE) {
+		*speedbin = a6xx_llc_read(a6xx_gpu, REG_A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS);
+		*speedbin = A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS_FINALFREQLIMIT(*speedbin);
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu,
+		const struct adreno_info *info)
 {
 	u32 supp_hw;
 	u32 speedbin;
 	int ret;
 
-	ret = adreno_read_speedbin(dev, &speedbin);
+	ret = a6xx_read_speedbin(dev, a6xx_gpu, info, &speedbin);
 	/*
 	 * -ENOENT means that the platform doesn't support speedbin which is
 	 * fine
@@ -2586,11 +2606,13 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 	struct msm_drm_private *priv = dev->dev_private;
 	struct platform_device *pdev = priv->gpu_pdev;
 	struct adreno_platform_config *config = pdev->dev.platform_data;
+	const struct adreno_info *info = config->info;
 	struct device_node *node;
 	struct a6xx_gpu *a6xx_gpu;
 	struct adreno_gpu *adreno_gpu;
 	struct msm_gpu *gpu;
 	extern int enable_preemption;
+	u32 speedbin;
 	bool is_a7xx;
 	int ret, nr_rings = 1;
 
@@ -2613,14 +2635,14 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 	adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper");
 
 	adreno_gpu->base.hw_apriv =
-		!!(config->info->quirks & ADRENO_QUIRK_HAS_HW_APRIV);
+		!!(info->quirks & ADRENO_QUIRK_HAS_HW_APRIV);
 
 	/* gpu->info only gets assigned in adreno_gpu_init(). A8x is included intentionally */
-	is_a7xx = config->info->family >= ADRENO_7XX_GEN1;
+	is_a7xx = info->family >= ADRENO_7XX_GEN1;
 
 	a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx);
 
-	ret = a6xx_set_supported_hw(&pdev->dev, config->info);
+	ret = a6xx_set_supported_hw(&pdev->dev, a6xx_gpu, info);
 	if (ret) {
 		a6xx_llc_slices_destroy(a6xx_gpu);
 		kfree(a6xx_gpu);
@@ -2628,15 +2650,20 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 	}
 
 	if ((enable_preemption == 1) || (enable_preemption == -1 &&
-	    (config->info->quirks & ADRENO_QUIRK_PREEMPTION)))
+	    (info->quirks & ADRENO_QUIRK_PREEMPTION)))
 		nr_rings = 4;
 
-	ret = adreno_gpu_init(dev, pdev, adreno_gpu, config->info->funcs, nr_rings);
+	ret = adreno_gpu_init(dev, pdev, adreno_gpu, info->funcs, nr_rings);
 	if (ret) {
 		a6xx_destroy(&(a6xx_gpu->base.base));
 		return ERR_PTR(ret);
 	}
 
+	/* Set the speedbin value that is passed to userspace */
+	if (a6xx_read_speedbin(&pdev->dev, a6xx_gpu, info, &speedbin) || !speedbin)
+		speedbin = 0xffff;
+	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
+
 	/*
 	 * For now only clamp to idle freq for devices where this is known not
 	 * to cause power supply issues:
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 8475802fdde2..f6c39aed50f2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -1185,7 +1185,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	struct msm_gpu_config adreno_gpu_config  = { 0 };
 	struct msm_gpu *gpu = &adreno_gpu->base;
 	const char *gpu_name;
-	u32 speedbin;
 	int ret;
 
 	adreno_gpu->funcs = funcs;
@@ -1214,10 +1213,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 			devm_pm_opp_set_clkname(dev, "core");
 	}
 
-	if (adreno_read_speedbin(dev, &speedbin) || !speedbin)
-		speedbin = 0xffff;
-	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
-
 	gpu_name = devm_kasprintf(dev, GFP_KERNEL, "%"ADRENO_CHIPID_FMT,
 			ADRENO_CHIPID_ARGS(config->chip_id));
 	if (!gpu_name)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 29097e6b4253..044ed4d49aa7 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -63,6 +63,7 @@ enum adreno_family {
 #define ADRENO_QUIRK_PREEMPTION			BIT(5)
 #define ADRENO_QUIRK_4GB_VA			BIT(6)
 #define ADRENO_QUIRK_IFPC			BIT(7)
+#define ADRENO_QUIRK_SOFTFUSE			BIT(8)
 
 /* Helper for formating the chip_id in the way that userspace tools like
  * crashdec expect.
diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
index 3941e7510754..2309870f5031 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
@@ -5016,6 +5016,10 @@ by a particular renderpass/blit.
 		<bitfield pos="1" name="LPAC" type="boolean"/>
 		<bitfield pos="2" name="RAYTRACING" type="boolean"/>
 	</reg32>
+	<reg32 offset="0x0405" name="CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS" variants="A8XX-">
+		<bitfield high="8" low="0" name="FINALFREQLIMIT"/>
+		<bitfield pos="24" name="SOFTSKUDISABLED" type="boolean"/>
+	</reg32>
 </domain>
 
 </database>

-- 
2.51.0



* [PATCH v2 13/17] drm/msm/a6xx: Add SKU detection support for X2-85
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (11 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27  0:14 ` [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840 Akhil P Oommen
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Add the speedbin table to the catalog to enable SKU detection support
for the X2-85 GPU found in the Glymur chipset. As this chipset supports
the soft fuse mechanism, enable the ADRENO_QUIRK_SOFTFUSE quirk too.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index f6b9792531a6..758bc7bd31f6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1902,7 +1902,8 @@ static const struct adreno_info a8xx_gpus[] = {
 		.gmem = 21 * SZ_1M,
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
-			  ADRENO_QUIRK_HAS_HW_APRIV,
+			  ADRENO_QUIRK_HAS_HW_APRIV |
+			  ADRENO_QUIRK_SOFTFUSE,
 		.funcs = &a8xx_gpu_funcs,
 		.a6xx = &(const struct a6xx_info) {
 			.protect = &x285_protect,
@@ -1922,6 +1923,12 @@ static const struct adreno_info a8xx_gpus[] = {
 				{ /* sentinel */ },
 			},
 		},
+		.speedbins = ADRENO_SPEEDBINS(
+			{ 0,   0 },
+			{ 388, 1 },
+			{ 357, 2 },
+			{ 284, 3 },
+		),
 	}, {
 		.chip_ids = ADRENO_CHIP_IDS(0x44050a01),
 		.family = ADRENO_8XX_GEN2,

-- 
2.51.0



* [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (12 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 13/17] drm/msm/a6xx: Add SKU detection support for X2-85 Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27 11:07   ` Konrad Dybcio
  2026-03-27  0:14 ` [PATCH v2 15/17] drm/msm/a8xx: Preemption " Akhil P Oommen
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Implement pwrup reglist support and add the necessary register
configurations to enable IFPC support on A840.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 185 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c     |  99 +++++++++++++++-
 2 files changed, 281 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 758bc7bd31f6..53548f6e891b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1891,6 +1891,185 @@ static const struct adreno_reglist a840_gbif[] = {
 	{ },
 };
 
+static const uint32_t a840_pwrup_reglist_regs[] = {
+	REG_A7XX_SP_HLSQ_TIMEOUT_THRESHOLD_DP,
+	REG_A7XX_SP_READ_SEL,
+	REG_A6XX_UCHE_MODE_CNTL,
+	REG_A8XX_UCHE_VARB_IDLE_TIMEOUT,
+	REG_A8XX_UCHE_GBIF_GX_CONFIG,
+	REG_A8XX_UCHE_CCHE_MODE_CNTL,
+	REG_A8XX_UCHE_CCHE_CACHE_WAYS,
+	REG_A8XX_UCHE_CACHE_WAYS,
+	REG_A8XX_UCHE_CCHE_GC_GMEM_RANGE_MIN,
+	REG_A8XX_UCHE_CCHE_GC_GMEM_RANGE_MIN + 1,
+	REG_A8XX_UCHE_CCHE_LPAC_GMEM_RANGE_MIN,
+	REG_A8XX_UCHE_CCHE_LPAC_GMEM_RANGE_MIN + 1,
+	REG_A8XX_UCHE_CCHE_TRAP_BASE,
+	REG_A8XX_UCHE_CCHE_TRAP_BASE + 1,
+	REG_A8XX_UCHE_CCHE_WRITE_THRU_BASE,
+	REG_A8XX_UCHE_CCHE_WRITE_THRU_BASE + 1,
+	REG_A8XX_UCHE_HW_DBG_CNTL,
+	REG_A8XX_UCHE_WRITE_THRU_BASE,
+	REG_A8XX_UCHE_WRITE_THRU_BASE + 1,
+	REG_A8XX_UCHE_TRAP_BASE,
+	REG_A8XX_UCHE_TRAP_BASE + 1,
+	REG_A8XX_UCHE_CLIENT_PF,
+	REG_A8XX_RB_CMP_NC_MODE_CNTL,
+	REG_A8XX_SP_HLSQ_GC_GMEM_RANGE_MIN,
+	REG_A8XX_SP_HLSQ_GC_GMEM_RANGE_MIN + 1,
+	REG_A6XX_TPL1_NC_MODE_CNTL,
+	REG_A6XX_TPL1_DBG_ECO_CNTL,
+	REG_A6XX_TPL1_DBG_ECO_CNTL1,
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(0),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(1),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(2),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(3),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(4),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(5),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(6),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(7),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(8),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(9),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(10),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(11),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(12),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(13),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(14),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(15),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(16),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(17),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(18),
+	REG_A8XX_TPL1_BICUBIC_WEIGHTS_TABLE(19),
+};
+DECLARE_ADRENO_REGLIST_LIST(a840_pwrup_reglist);
+
+static const u32 a840_ifpc_reglist_regs[] = {
+	REG_A8XX_RBBM_NC_MODE_CNTL,
+	REG_A8XX_RBBM_SLICE_NC_MODE_CNTL,
+	REG_A6XX_SP_NC_MODE_CNTL,
+	REG_A6XX_SP_CHICKEN_BITS,
+	REG_A8XX_SP_SS_CHICKEN_BITS_0,
+	REG_A7XX_SP_CHICKEN_BITS_1,
+	REG_A7XX_SP_CHICKEN_BITS_2,
+	REG_A7XX_SP_CHICKEN_BITS_3,
+	REG_A8XX_SP_CHICKEN_BITS_4,
+	REG_A6XX_SP_PERFCTR_SHADER_MASK,
+	REG_A8XX_RBBM_SLICE_PERFCTR_CNTL,
+	REG_A8XX_RBBM_SLICE_INTERFACE_HANG_INT_CNTL,
+	REG_A7XX_SP_HLSQ_DBG_ECO_CNTL,
+	REG_A7XX_SP_HLSQ_DBG_ECO_CNTL_1,
+	REG_A7XX_SP_HLSQ_DBG_ECO_CNTL_2,
+	REG_A8XX_SP_HLSQ_DBG_ECO_CNTL_3,
+	REG_A8XX_SP_HLSQ_LPAC_GMEM_RANGE_MIN,
+	REG_A8XX_SP_HLSQ_LPAC_GMEM_RANGE_MIN + 1,
+	REG_A8XX_CP_INTERRUPT_STATUS_MASK_GLOBAL,
+	REG_A8XX_RBBM_PERFCTR_CNTL,
+	REG_A8XX_CP_PROTECT_GLOBAL(0),
+	REG_A8XX_CP_PROTECT_GLOBAL(1),
+	REG_A8XX_CP_PROTECT_GLOBAL(2),
+	REG_A8XX_CP_PROTECT_GLOBAL(3),
+	REG_A8XX_CP_PROTECT_GLOBAL(4),
+	REG_A8XX_CP_PROTECT_GLOBAL(5),
+	REG_A8XX_CP_PROTECT_GLOBAL(6),
+	REG_A8XX_CP_PROTECT_GLOBAL(7),
+	REG_A8XX_CP_PROTECT_GLOBAL(8),
+	REG_A8XX_CP_PROTECT_GLOBAL(9),
+	REG_A8XX_CP_PROTECT_GLOBAL(10),
+	REG_A8XX_CP_PROTECT_GLOBAL(11),
+	REG_A8XX_CP_PROTECT_GLOBAL(12),
+	REG_A8XX_CP_PROTECT_GLOBAL(13),
+	REG_A8XX_CP_PROTECT_GLOBAL(14),
+	REG_A8XX_CP_PROTECT_GLOBAL(15),
+	REG_A8XX_CP_PROTECT_GLOBAL(16),
+	REG_A8XX_CP_PROTECT_GLOBAL(17),
+	REG_A8XX_CP_PROTECT_GLOBAL(18),
+	REG_A8XX_CP_PROTECT_GLOBAL(19),
+	REG_A8XX_CP_PROTECT_GLOBAL(20),
+	REG_A8XX_CP_PROTECT_GLOBAL(21),
+	REG_A8XX_CP_PROTECT_GLOBAL(22),
+	REG_A8XX_CP_PROTECT_GLOBAL(23),
+	REG_A8XX_CP_PROTECT_GLOBAL(24),
+	REG_A8XX_CP_PROTECT_GLOBAL(25),
+	REG_A8XX_CP_PROTECT_GLOBAL(26),
+	REG_A8XX_CP_PROTECT_GLOBAL(27),
+	REG_A8XX_CP_PROTECT_GLOBAL(28),
+	REG_A8XX_CP_PROTECT_GLOBAL(29),
+	REG_A8XX_CP_PROTECT_GLOBAL(30),
+	REG_A8XX_CP_PROTECT_GLOBAL(31),
+	REG_A8XX_CP_PROTECT_GLOBAL(32),
+	REG_A8XX_CP_PROTECT_GLOBAL(33),
+	REG_A8XX_CP_PROTECT_GLOBAL(34),
+	REG_A8XX_CP_PROTECT_GLOBAL(35),
+	REG_A8XX_CP_PROTECT_GLOBAL(36),
+	REG_A8XX_CP_PROTECT_GLOBAL(37),
+	REG_A8XX_CP_PROTECT_GLOBAL(38),
+	REG_A8XX_CP_PROTECT_GLOBAL(39),
+	REG_A8XX_CP_PROTECT_GLOBAL(40),
+	REG_A8XX_CP_PROTECT_GLOBAL(41),
+	REG_A8XX_CP_PROTECT_GLOBAL(42),
+	REG_A8XX_CP_PROTECT_GLOBAL(43),
+	REG_A8XX_CP_PROTECT_GLOBAL(44),
+	REG_A8XX_CP_PROTECT_GLOBAL(45),
+	REG_A8XX_CP_PROTECT_GLOBAL(46),
+	REG_A8XX_CP_PROTECT_GLOBAL(47),
+	REG_A8XX_CP_PROTECT_GLOBAL(48),
+	REG_A8XX_CP_PROTECT_GLOBAL(49),
+	REG_A8XX_CP_PROTECT_GLOBAL(50),
+	REG_A8XX_CP_PROTECT_GLOBAL(51),
+	REG_A8XX_CP_PROTECT_GLOBAL(52),
+	REG_A8XX_CP_PROTECT_GLOBAL(53),
+	REG_A8XX_CP_PROTECT_GLOBAL(54),
+	REG_A8XX_CP_PROTECT_GLOBAL(55),
+	REG_A8XX_CP_PROTECT_GLOBAL(56),
+	REG_A8XX_CP_PROTECT_GLOBAL(57),
+	REG_A8XX_CP_PROTECT_GLOBAL(58),
+	REG_A8XX_CP_PROTECT_GLOBAL(59),
+	REG_A8XX_CP_PROTECT_GLOBAL(60),
+	REG_A8XX_CP_PROTECT_GLOBAL(61),
+	REG_A8XX_CP_PROTECT_GLOBAL(62),
+	REG_A8XX_CP_PROTECT_GLOBAL(63),
+};
+DECLARE_ADRENO_REGLIST_LIST(a840_ifpc_reglist);
+
+static const struct adreno_reglist_pipe a840_dyn_pwrup_reglist_regs[] = {
+	{ REG_A8XX_GRAS_TSEFE_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_GRAS_NC_MODE_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_GRAS_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A6XX_PC_AUTO_VERTEX_STRIDE, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_1, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_2, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_3, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_4, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CONTEXT_SWITCH_STABILIZE_CNTL_1, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_VIS_STREAM_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A7XX_RB_CCU_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A7XX_RB_CCU_DBG_ECO_CNTL, 0, BIT(PIPE_BR)},
+	{ REG_A8XX_RB_CCU_NC_MODE_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_CMP_NC_MODE_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A6XX_RB_RBP_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_RB_RESOLVE_PREFETCH_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A6XX_RB_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_RB_CMP_DBG_ECO_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A7XX_VFD_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BV_THRESHOLD, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BR_THRESHOLD, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BUSY_REQ_CNT, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_LP_REQ_CNT, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VPC_FLATSHADE_MODE_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_CP_HW_FAULT_STATUS_MASK_PIPE, 0, BIT(PIPE_BR) |
+		BIT(PIPE_BV) | BIT(PIPE_LPAC) | BIT(PIPE_AQE0) |
+		BIT(PIPE_AQE1) | BIT(PIPE_DDE_BR) | BIT(PIPE_DDE_BV) },
+	{ REG_A8XX_CP_INTERRUPT_STATUS_MASK_PIPE, 0, BIT(PIPE_BR) |
+		BIT(PIPE_BV) | BIT(PIPE_LPAC) | BIT(PIPE_AQE0) |
+		BIT(PIPE_AQE1) | BIT(PIPE_DDE_BR) | BIT(PIPE_DDE_BV) },
+	{ REG_A8XX_CP_PROTECT_CNTL_PIPE, 0, BIT(PIPE_BR) | BIT(PIPE_BV) | BIT(PIPE_LPAC)},
+	{ REG_A8XX_CP_PROTECT_PIPE(15), 0, BIT(PIPE_BR) | BIT(PIPE_BV) | BIT(PIPE_LPAC) },
+	{ REG_A8XX_RB_GC_GMEM_PROTECT, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_LPAC_GMEM_PROTECT, 0, BIT(PIPE_BR) },
+	{ REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, 0, BIT(PIPE_BR) },
+};
+DECLARE_ADRENO_REGLIST_PIPE_LIST(a840_dyn_pwrup_reglist);
+
 static const struct adreno_info a8xx_gpus[] = {
 	{
 		.chip_ids = ADRENO_CHIP_IDS(0x44070001),
@@ -1940,11 +2119,15 @@ static const struct adreno_info a8xx_gpus[] = {
 		.gmem = 18 * SZ_1M,
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
-			  ADRENO_QUIRK_HAS_HW_APRIV,
+			  ADRENO_QUIRK_HAS_HW_APRIV |
+			  ADRENO_QUIRK_IFPC,
 		.funcs = &a8xx_gpu_funcs,
 		.a6xx = &(const struct a6xx_info) {
 			.protect = &a840_protect,
 			.nonctxt_reglist = a840_nonctxt_regs,
+			.pwrup_reglist = &a840_pwrup_reglist,
+			.dyn_pwrup_reglist = &a840_dyn_pwrup_reglist,
+			.ifpc_reglist = &a840_ifpc_reglist,
 			.gbif_cx = a840_gbif,
 			.max_slices = 3,
 			.gmu_chipid = 0x8020100,
diff --git a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
index fafeac62aebf..e406681b8c80 100644
--- a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
@@ -173,7 +173,7 @@ void a8xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 	/* Update HW if this is the current ring and we are not in preempt*/
 	if (!a6xx_in_preempt(a6xx_gpu)) {
 		if (a6xx_gpu->cur_ring == ring)
-			gpu_write(gpu, REG_A6XX_CP_RB_WPTR, wptr);
+			a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
 		else
 			ring->restore_wptr = true;
 	} else {
@@ -386,8 +386,87 @@ static void a8xx_nonctxt_config(struct msm_gpu *gpu, u32 *gmem_protect)
 	a8xx_aperture_clear(gpu);
 }
 
+static void a8xx_patch_pwrup_reglist(struct msm_gpu *gpu)
+{
+	const struct adreno_reglist_pipe_list *dyn_pwrup_reglist;
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	const struct adreno_reglist_list *reglist;
+	void *ptr = a6xx_gpu->pwrup_reglist_ptr;
+	struct cpu_gpu_lock *lock = ptr;
+	u32 *dest = (u32 *)&lock->regs[0];
+	u32 dyn_pwrup_reglist_count = 0;
+	int i;
+
+	lock->gpu_req = lock->cpu_req = lock->turn = 0;
+
+	reglist = adreno_gpu->info->a6xx->ifpc_reglist;
+	if (reglist) {
+		lock->ifpc_list_len = reglist->count;
+
+		/*
+		 * For each entry in each of the lists, write the offset and the current
+		 * register value into the GPU buffer
+		 */
+		for (i = 0; i < reglist->count; i++) {
+			*dest++ = reglist->regs[i];
+			*dest++ = gpu_read(gpu, reglist->regs[i]);
+		}
+	}
+
+	reglist = adreno_gpu->info->a6xx->pwrup_reglist;
+	if (reglist) {
+		lock->preemption_list_len = reglist->count;
+
+		for (i = 0; i < reglist->count; i++) {
+			*dest++ = reglist->regs[i];
+			*dest++ = gpu_read(gpu, reglist->regs[i]);
+		}
+	}
+
+	/*
+	 * The overall register list is composed of:
+	 * 1. Static IFPC-only registers
+	 * 2. Static IFPC + preemption registers
+	 * 3. Dynamic IFPC + preemption registers (ex: perfcounter selects)
+	 *
+	 * The first two lists are static. Their sizes are stored as the
+	 * number of pairs in ifpc_list_len and preemption_list_len
+	 * respectively. With concurrent binning, some of the perfcounter
+	 * registers are virtualized, so CP needs to know the pipe id to
+	 * program the aperture in order to restore them. Thus, the third
+	 * list is a dynamic list of triplets,
+	 * (<aperture, shifted 12 bits> <address> <data>), and its length is
+	 * stored as the number of triplets in dynamic_list_len.
+	 */
+	dyn_pwrup_reglist = adreno_gpu->info->a6xx->dyn_pwrup_reglist;
+	if (!dyn_pwrup_reglist)
+		goto done;
+
+	for (u32 pipe_id = PIPE_BR; pipe_id <= PIPE_DDE_BV; pipe_id++) {
+		for (i = 0; i < dyn_pwrup_reglist->count; i++) {
+			if (!(dyn_pwrup_reglist->regs[i].pipe & BIT(pipe_id)))
+				continue;
+			*dest++ = A8XX_CP_APERTURE_CNTL_HOST_PIPEID(pipe_id);
+			*dest++ = dyn_pwrup_reglist->regs[i].offset;
+			*dest++ = a8xx_read_pipe_slice(gpu,
+						       pipe_id,
+						       a8xx_get_first_slice(a6xx_gpu),
+						       dyn_pwrup_reglist->regs[i].offset);
+			dyn_pwrup_reglist_count++;
+		}
+	}
+
+	lock->dynamic_list_len = dyn_pwrup_reglist_count;
+
+done:
+	a8xx_aperture_clear(gpu);
+}
+
 static int a8xx_cp_init(struct msm_gpu *gpu)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct msm_ringbuffer *ring = gpu->rb[0];
 	u32 mask;
 
@@ -395,7 +474,7 @@ static int a8xx_cp_init(struct msm_gpu *gpu)
 	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
 	OUT_RING(ring, BIT(27));
 
-	OUT_PKT7(ring, CP_ME_INIT, 4);
+	OUT_PKT7(ring, CP_ME_INIT, 7);
 
 	/* Use multiple HW contexts */
 	mask = BIT(0);
@@ -409,6 +488,9 @@ static int a8xx_cp_init(struct msm_gpu *gpu)
 	/* Disable save/restore of performance counters across preemption */
 	mask |= BIT(6);
 
+	/* Enable the register init list with the spinlock */
+	mask |= BIT(8);
+
 	OUT_RING(ring, mask);
 
 	/* Enable multiple hardware contexts */
@@ -420,6 +502,14 @@ static int a8xx_cp_init(struct msm_gpu *gpu)
 	/* Operation mode mask */
 	OUT_RING(ring, 0x00000002);
 
+	/* Lo address */
+	OUT_RING(ring, lower_32_bits(a6xx_gpu->pwrup_reglist_iova));
+	/* Hi address */
+	OUT_RING(ring, upper_32_bits(a6xx_gpu->pwrup_reglist_iova));
+
+	/* Enable dyn pwrup list with triplets (offset, value, pipe) */
+	OUT_RING(ring, BIT(31));
+
 	a6xx_flush(gpu, ring);
 	return a8xx_idle(gpu, ring) ? 0 : -EINVAL;
 }
@@ -702,6 +792,11 @@ static int hw_init(struct msm_gpu *gpu)
 	WARN_ON(!gmem_protect);
 	a8xx_aperture_clear(gpu);
 
+	if (!a6xx_gpu->pwrup_reglist_emitted) {
+		a8xx_patch_pwrup_reglist(gpu);
+		a6xx_gpu->pwrup_reglist_emitted = true;
+	}
+
 	/* Enable hardware clockgating */
 	a8xx_set_hwcg(gpu, true);
 out:

-- 
2.51.0



* [PATCH v2 15/17] drm/msm/a8xx: Preemption support for A840
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (13 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840 Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-04-11  2:22   ` Jie Gan
  2026-03-27  0:14 ` [PATCH v2 16/17] drm/msm/a6xx: Enable Preemption on X2-85 Akhil P Oommen
  2026-03-27  0:14 ` [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support Akhil P Oommen
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

The programming sequence related to preemption is unchanged from A7x,
but there is some code churn due to register shuffling on A8x. So,
split out the common code into a header file for sharing, and add or
update the additional changes required to support the preemption
feature on A8x GPUs.

Finally, set the preemption quirk in A840's catalog entry to enable
this feature.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/Makefile              |   1 +
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c |   1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     |   7 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   5 +
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c |  77 +--------
 drivers/gpu/drm/msm/adreno/a6xx_preempt.h |  82 ++++++++++
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c     |  37 ++++-
 drivers/gpu/drm/msm/adreno/a8xx_preempt.c | 259 ++++++++++++++++++++++++++++++
 8 files changed, 389 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 8b94c5f1cb68..ba45e99be05b 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -25,6 +25,7 @@ adreno-y := \
 	adreno/a6xx_hfi.o \
 	adreno/a6xx_preempt.o \
 	adreno/a8xx_gpu.o \
+	adreno/a8xx_preempt.o \
 
 adreno-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 53548f6e891b..21f5a685196b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -2120,6 +2120,7 @@ static const struct adreno_info a8xx_gpus[] = {
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
 			  ADRENO_QUIRK_HAS_HW_APRIV |
+			  ADRENO_QUIRK_PREEMPTION |
 			  ADRENO_QUIRK_IFPC,
 		.funcs = &a8xx_gpu_funcs,
 		.a6xx = &(const struct a6xx_info) {
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index fb9662b946d0..44ce02b412ca 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -410,7 +410,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	a6xx_flush(gpu, ring);
 }
 
-static void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
+void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
 		struct a6xx_gpu *a6xx_gpu, struct msm_gpu_submitqueue *queue)
 {
 	u64 preempt_postamble;
@@ -620,7 +620,10 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	a6xx_flush(gpu, ring);
 
 	/* Check to see if we need to start preemption */
-	a6xx_preempt_trigger(gpu);
+	if (adreno_is_a8xx(adreno_gpu))
+		a8xx_preempt_trigger(gpu);
+	else
+		a6xx_preempt_trigger(gpu);
 }
 
 static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index a4434a6a56dd..eb431e5e00b1 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -278,6 +278,8 @@ void a6xx_preempt_hw_init(struct msm_gpu *gpu);
 void a6xx_preempt_trigger(struct msm_gpu *gpu);
 void a6xx_preempt_irq(struct msm_gpu *gpu);
 void a6xx_preempt_fini(struct msm_gpu *gpu);
+void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
+		struct a6xx_gpu *a6xx_gpu, struct msm_gpu_submitqueue *queue);
 int a6xx_preempt_submitqueue_setup(struct msm_gpu *gpu,
 		struct msm_gpu_submitqueue *queue);
 void a6xx_preempt_submitqueue_close(struct msm_gpu *gpu,
@@ -327,6 +329,9 @@ void a8xx_gpu_get_slice_info(struct msm_gpu *gpu);
 int a8xx_hw_init(struct msm_gpu *gpu);
 irqreturn_t a8xx_irq(struct msm_gpu *gpu);
 void a8xx_llc_activate(struct a6xx_gpu *a6xx_gpu);
+void a8xx_preempt_hw_init(struct msm_gpu *gpu);
+void a8xx_preempt_trigger(struct msm_gpu *gpu);
+void a8xx_preempt_irq(struct msm_gpu *gpu);
 bool a8xx_progress(struct msm_gpu *gpu, struct msm_ringbuffer *ring);
 void a8xx_recover(struct msm_gpu *gpu);
 #endif /* __A6XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
index 747a22afad9f..df4cbf42e9a4 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
@@ -6,85 +6,10 @@
 #include "msm_gem.h"
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"
+#include "a6xx_preempt.h"
 #include "msm_mmu.h"
 #include "msm_gpu_trace.h"
 
-/*
- * Try to transition the preemption state from old to new. Return
- * true on success or false if the original state wasn't 'old'
- */
-static inline bool try_preempt_state(struct a6xx_gpu *a6xx_gpu,
-		enum a6xx_preempt_state old, enum a6xx_preempt_state new)
-{
-	enum a6xx_preempt_state cur = atomic_cmpxchg(&a6xx_gpu->preempt_state,
-		old, new);
-
-	return (cur == old);
-}
-
-/*
- * Force the preemption state to the specified state.  This is used in cases
- * where the current state is known and won't change
- */
-static inline void set_preempt_state(struct a6xx_gpu *gpu,
-		enum a6xx_preempt_state new)
-{
-	/*
-	 * preempt_state may be read by other cores trying to trigger a
-	 * preemption or in the interrupt handler so barriers are needed
-	 * before...
-	 */
-	smp_mb__before_atomic();
-	atomic_set(&gpu->preempt_state, new);
-	/* ... and after*/
-	smp_mb__after_atomic();
-}
-
-/* Write the most recent wptr for the given ring into the hardware */
-static inline void update_wptr(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring)
-{
-	unsigned long flags;
-	uint32_t wptr;
-
-	spin_lock_irqsave(&ring->preempt_lock, flags);
-
-	if (ring->restore_wptr) {
-		wptr = get_wptr(ring);
-
-		a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
-
-		ring->restore_wptr = false;
-	}
-
-	spin_unlock_irqrestore(&ring->preempt_lock, flags);
-}
-
-/* Return the highest priority ringbuffer with something in it */
-static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
-{
-	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
-	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-
-	unsigned long flags;
-	int i;
-
-	for (i = 0; i < gpu->nr_rings; i++) {
-		bool empty;
-		struct msm_ringbuffer *ring = gpu->rb[i];
-
-		spin_lock_irqsave(&ring->preempt_lock, flags);
-		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
-		if (!empty && ring == a6xx_gpu->cur_ring)
-			empty = ring->memptrs->fence == a6xx_gpu->last_seqno[i];
-		spin_unlock_irqrestore(&ring->preempt_lock, flags);
-
-		if (!empty)
-			return ring;
-	}
-
-	return NULL;
-}
-
 static void a6xx_preempt_timer(struct timer_list *t)
 {
 	struct a6xx_gpu *a6xx_gpu = timer_container_of(a6xx_gpu, t,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.h b/drivers/gpu/drm/msm/adreno/a6xx_preempt.h
new file mode 100644
index 000000000000..df36c945b836
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+/* Copyright (c) 2023 Collabora, Ltd. */
+/* Copyright (c) 2024 Valve Corporation */
+/* Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
+
+/*
+ * Try to transition the preemption state from old to new. Return
+ * true on success or false if the original state wasn't 'old'
+ */
+static inline bool try_preempt_state(struct a6xx_gpu *a6xx_gpu,
+		enum a6xx_preempt_state old, enum a6xx_preempt_state new)
+{
+	enum a6xx_preempt_state cur = atomic_cmpxchg(&a6xx_gpu->preempt_state,
+		old, new);
+
+	return (cur == old);
+}
+
+/*
+ * Force the preemption state to the specified state.  This is used in cases
+ * where the current state is known and won't change
+ */
+static inline void set_preempt_state(struct a6xx_gpu *gpu,
+		enum a6xx_preempt_state new)
+{
+	/*
+	 * preempt_state may be read by other cores trying to trigger a
+	 * preemption or in the interrupt handler so barriers are needed
+	 * before...
+	 */
+	smp_mb__before_atomic();
+	atomic_set(&gpu->preempt_state, new);
+	/* ... and after */
+	smp_mb__after_atomic();
+}
+
+/* Write the most recent wptr for the given ring into the hardware */
+static inline void update_wptr(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring)
+{
+	unsigned long flags;
+	uint32_t wptr;
+
+	spin_lock_irqsave(&ring->preempt_lock, flags);
+
+	if (ring->restore_wptr) {
+		wptr = get_wptr(ring);
+
+		a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
+
+		ring->restore_wptr = false;
+	}
+
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);
+}
+
+/* Return the highest priority ringbuffer with something in it */
+static inline struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+
+	unsigned long flags;
+	int i;
+
+	for (i = 0; i < gpu->nr_rings; i++) {
+		bool empty;
+		struct msm_ringbuffer *ring = gpu->rb[i];
+
+		spin_lock_irqsave(&ring->preempt_lock, flags);
+		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
+		if (!empty && ring == a6xx_gpu->cur_ring)
+			empty = ring->memptrs->fence == a6xx_gpu->last_seqno[i];
+		spin_unlock_irqrestore(&ring->preempt_lock, flags);
+
+		if (!empty)
+			return ring;
+	}
+
+	return NULL;
+}
+
diff --git a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
index e406681b8c80..9e6f2ed69247 100644
--- a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
@@ -463,6 +463,34 @@ static void a8xx_patch_pwrup_reglist(struct msm_gpu *gpu)
 	a8xx_aperture_clear(gpu);
 }
 
+static int a8xx_preempt_start(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct msm_ringbuffer *ring = gpu->rb[0];
+
+	if (gpu->nr_rings <= 1)
+		return 0;
+
+	/* Turn CP protection off */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	a6xx_emit_set_pseudo_reg(ring, a6xx_gpu, NULL);
+
+	/* Yield the floor on command completion */
+	OUT_PKT7(ring, CP_CONTEXT_SWITCH_YIELD, 4);
+	OUT_RING(ring, 0x00);
+	OUT_RING(ring, 0x00);
+	OUT_RING(ring, 0x00);
+	/* Generate interrupt on preemption completion */
+	OUT_RING(ring, 0x00);
+
+	a6xx_flush(gpu, ring);
+
+	return a8xx_idle(gpu, ring) ? 0 : -EINVAL;
+}
+
 static int a8xx_cp_init(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -738,6 +766,8 @@ static int hw_init(struct msm_gpu *gpu)
 	gpu_write64(gpu, REG_A6XX_CP_RB_RPTR_ADDR, shadowptr(a6xx_gpu, gpu->rb[0]));
 	gpu_write64(gpu, REG_A8XX_CP_RB_RPTR_ADDR_BV, rbmemptr(gpu->rb[0], bv_rptr));
 
+	a8xx_preempt_hw_init(gpu);
+
 	for (i = 0; i < gpu->nr_rings; i++)
 		a6xx_gpu->shadow[i] = 0;
 
@@ -800,6 +830,9 @@ static int hw_init(struct msm_gpu *gpu)
 	/* Enable hardware clockgating */
 	a8xx_set_hwcg(gpu, true);
 out:
+	/* Last step - yield the ringbuffer */
+	a8xx_preempt_start(gpu);
+
 	/*
 	 * Tell the GMU that we are done touching the GPU and it can start power
 	 * management
@@ -1209,11 +1242,11 @@ irqreturn_t a8xx_irq(struct msm_gpu *gpu)
 
 	if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
 		msm_gpu_retire(gpu);
-		a6xx_preempt_trigger(gpu);
+		a8xx_preempt_trigger(gpu);
 	}
 
 	if (status & A6XX_RBBM_INT_0_MASK_CP_SW)
-		a6xx_preempt_irq(gpu);
+		a8xx_preempt_irq(gpu);
 
 	return IRQ_HANDLED;
 }
diff --git a/drivers/gpu/drm/msm/adreno/a8xx_preempt.c b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
new file mode 100644
index 000000000000..3d8c33ba722e
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
@@ -0,0 +1,259 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
+
+#include "msm_gem.h"
+#include "a6xx_gpu.h"
+#include "a6xx_gmu.xml.h"
+#include "a6xx_preempt.h"
+#include "msm_mmu.h"
+#include "msm_gpu_trace.h"
+
+static void preempt_prepare_postamble(struct a6xx_gpu *a6xx_gpu)
+{
+	u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
+	u32 count = 0;
+
+	postamble[count++] = PKT7(CP_REG_RMW, 3);
+	postamble[count++] = REG_A8XX_RBBM_PERFCTR_SRAM_INIT_CMD;
+	postamble[count++] = 0;
+	postamble[count++] = 1;
+
+	postamble[count++] = PKT7(CP_WAIT_REG_MEM, 6);
+	postamble[count++] = CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ);
+	postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_LO(
+				REG_A8XX_RBBM_PERFCTR_SRAM_INIT_STATUS);
+	postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_HI(0);
+	postamble[count++] = CP_WAIT_REG_MEM_3_REF(0x1);
+	postamble[count++] = CP_WAIT_REG_MEM_4_MASK(0x1);
+	postamble[count++] = CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0);
+
+	a6xx_gpu->preempt_postamble_len = count;
+
+	a6xx_gpu->postamble_enabled = true;
+}
+
+static void preempt_disable_postamble(struct a6xx_gpu *a6xx_gpu)
+{
+	u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
+
+	/*
+	 * Disable the postamble by replacing the first packet header with a NOP
+	 * that covers the whole buffer.
+	 */
+	*postamble = PKT7(CP_NOP, (a6xx_gpu->preempt_postamble_len - 1));
+
+	a6xx_gpu->postamble_enabled = false;
+}
+
+/*
+ * Set preemption keepalive vote. Please note that this vote is different from the one used in
+ * a8xx_irq()
+ */
+static void a8xx_preempt_keepalive_vote(struct msm_gpu *gpu, bool on)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+
+	gmu_write(&a6xx_gpu->gmu, REG_A8XX_GMU_PWR_COL_PREEMPT_KEEPALIVE, on);
+}
+
+void a8xx_preempt_irq(struct msm_gpu *gpu)
+{
+	uint32_t status;
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct drm_device *dev = gpu->dev;
+
+	if (!try_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED, PREEMPT_PENDING))
+		return;
+
+	/* Delete the preemption watchdog timer */
+	timer_delete(&a6xx_gpu->preempt_timer);
+
+	/*
+	 * The hardware should be setting the stop bit of CP_CONTEXT_SWITCH_CNTL
+	 * to zero before firing the interrupt, but there is a non zero chance
+	 * of a hardware condition or a software race that could set it again
+	 * before we have a chance to finish. If that happens, log and go for
+	 * recovery
+	 */
+	status = gpu_read(gpu, REG_A8XX_CP_CONTEXT_SWITCH_CNTL);
+	if (unlikely(status & A8XX_CP_CONTEXT_SWITCH_CNTL_STOP)) {
+		DRM_DEV_ERROR(&gpu->pdev->dev,
+					  "!!!!!!!!!!!!!!!! preemption faulted !!!!!!!!!!!!!! irq\n");
+		set_preempt_state(a6xx_gpu, PREEMPT_FAULTED);
+		dev_err(dev->dev, "%s: Preemption failed to complete\n",
+			gpu->name);
+		kthread_queue_work(gpu->worker, &gpu->recover_work);
+		return;
+	}
+
+	a6xx_gpu->cur_ring = a6xx_gpu->next_ring;
+	a6xx_gpu->next_ring = NULL;
+
+	set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
+
+	update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
+
+	set_preempt_state(a6xx_gpu, PREEMPT_NONE);
+
+	a8xx_preempt_keepalive_vote(gpu, false);
+
+	trace_msm_gpu_preemption_irq(a6xx_gpu->cur_ring->id);
+
+	/*
+	 * Retrigger preemption to avoid a deadlock that might occur when preemption
+	 * is skipped due to it being already in flight when requested.
+	 */
+	a8xx_preempt_trigger(gpu);
+}
+
+void a8xx_preempt_hw_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	int i;
+
+	/* No preemption if we only have one ring */
+	if (gpu->nr_rings == 1)
+		return;
+
+	for (i = 0; i < gpu->nr_rings; i++) {
+		struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[i];
+
+		record_ptr->wptr = 0;
+		record_ptr->rptr = 0;
+		record_ptr->rptr_addr = shadowptr(a6xx_gpu, gpu->rb[i]);
+		record_ptr->info = 0;
+		record_ptr->data = 0;
+		record_ptr->rbase = gpu->rb[i]->iova;
+	}
+
+	/* Write a 0 to signal that we aren't switching pagetables */
+	gpu_write64(gpu, REG_A8XX_CP_CONTEXT_SWITCH_SMMU_INFO, 0);
+
+	/* Enable the GMEM save/restore feature for preemption */
+	gpu_write(gpu, REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, 0x1);
+
+	/* Reset the preemption state */
+	set_preempt_state(a6xx_gpu, PREEMPT_NONE);
+
+	spin_lock_init(&a6xx_gpu->eval_lock);
+
+	/* Always come up on rb 0 */
+	a6xx_gpu->cur_ring = gpu->rb[0];
+}
+
+void a8xx_preempt_trigger(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	unsigned long flags;
+	struct msm_ringbuffer *ring;
+	unsigned int cntl;
+	bool sysprof;
+
+	if (gpu->nr_rings == 1)
+		return;
+
+	/*
+	 * Lock to make sure another thread attempting preemption doesn't skip it
+	 * while we are still evaluating the next ring. This makes sure the other
+	 * thread does start preemption if we abort it and avoids a soft lock.
+	 */
+	spin_lock_irqsave(&a6xx_gpu->eval_lock, flags);
+
+	/*
+	 * Try to start preemption by moving from NONE to START. If
+	 * unsuccessful, a preemption is already in flight
+	 */
+	if (!try_preempt_state(a6xx_gpu, PREEMPT_NONE, PREEMPT_START)) {
+		spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
+		return;
+	}
+
+	cntl = A8XX_CP_CONTEXT_SWITCH_CNTL_LEVEL(a6xx_gpu->preempt_level);
+
+	if (a6xx_gpu->skip_save_restore)
+		cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_SKIP_SAVE_RESTORE;
+
+	if (a6xx_gpu->uses_gmem)
+		cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_USES_GMEM;
+
+	cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_STOP;
+
+	/* Get the next ring to preempt to */
+	ring = get_next_ring(gpu);
+
+	/*
+	 * If no ring is populated or the highest priority ring is the current
+	 * one do nothing except to update the wptr to the latest and greatest
+	 */
+	if (!ring || (a6xx_gpu->cur_ring == ring)) {
+		set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
+		update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
+		set_preempt_state(a6xx_gpu, PREEMPT_NONE);
+		spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
+		return;
+	}
+
+	spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
+
+	spin_lock_irqsave(&ring->preempt_lock, flags);
+
+	struct a7xx_cp_smmu_info *smmu_info_ptr =
+		a6xx_gpu->preempt_smmu[ring->id];
+	struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[ring->id];
+	u64 ttbr0 = ring->memptrs->ttbr0;
+	u32 context_idr = ring->memptrs->context_idr;
+
+	smmu_info_ptr->ttbr0 = ttbr0;
+	smmu_info_ptr->context_idr = context_idr;
+	record_ptr->wptr = get_wptr(ring);
+
+	/*
+	 * The GPU will write the wptr we set above when we preempt. Reset
+	 * restore_wptr to make sure that we don't write WPTR to the same
+	 * thing twice. It's still possible subsequent submissions will update
+	 * wptr again, in which case they will set the flag to true. This has
+	 * to be protected by the lock for setting the flag and updating wptr
+	 * to be atomic.
+	 */
+	ring->restore_wptr = false;
+
+	trace_msm_gpu_preemption_trigger(a6xx_gpu->cur_ring->id, ring->id);
+
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);
+
+	/* Set the keepalive bit to keep the GPU ON until preemption is complete */
+	a8xx_preempt_keepalive_vote(gpu, true);
+
+	a6xx_fenced_write(a6xx_gpu,
+		REG_A8XX_CP_CONTEXT_SWITCH_SMMU_INFO, a6xx_gpu->preempt_smmu_iova[ring->id],
+		BIT(1), true);
+
+	a6xx_fenced_write(a6xx_gpu,
+		REG_A8XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR,
+		a6xx_gpu->preempt_iova[ring->id], BIT(1), true);
+
+	a6xx_gpu->next_ring = ring;
+
+	/* Start a timer to catch a stuck preemption */
+	mod_timer(&a6xx_gpu->preempt_timer, jiffies + msecs_to_jiffies(10000));
+
+	/* Enable or disable postamble as needed */
+	sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1;
+
+	if (!sysprof && !a6xx_gpu->postamble_enabled)
+		preempt_prepare_postamble(a6xx_gpu);
+
+	if (sysprof && a6xx_gpu->postamble_enabled)
+		preempt_disable_postamble(a6xx_gpu);
+
+	/* Set the preemption state to triggered */
+	set_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED);
+
+	/* Trigger the preemption */
+	a6xx_fenced_write(a6xx_gpu, REG_A8XX_CP_CONTEXT_SWITCH_CNTL, cntl, BIT(1), false);
+}
+

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 16/17] drm/msm/a6xx: Enable Preemption on X2-85
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (14 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 15/17] drm/msm/a8xx: Preemption " Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27  0:14 ` [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support Akhil P Oommen
  16 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

Add the save-restore register lists and set the necessary quirk flags
in the catalog to enable the Preemption feature on Adreno X2-85 GPU.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 42 +++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 21f5a685196b..550ff3a9b82e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1761,6 +1761,44 @@ static const u32 x285_protect_regs[] = {
 
 DECLARE_ADRENO_PROTECT(x285_protect, 15);
 
+static const struct adreno_reglist_pipe x285_dyn_pwrup_reglist_regs[] = {
+	{ REG_A8XX_GRAS_TSEFE_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_GRAS_NC_MODE_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_GRAS_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A6XX_PC_AUTO_VERTEX_STRIDE, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_1, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_2, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_3, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CHICKEN_BITS_4, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_CONTEXT_SWITCH_STABILIZE_CNTL_1, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_PC_VIS_STREAM_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A7XX_RB_CCU_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A7XX_RB_CCU_DBG_ECO_CNTL, 0, BIT(PIPE_BR)},
+	{ REG_A8XX_RB_CCU_NC_MODE_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_CMP_NC_MODE_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A6XX_RB_RBP_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_RESOLVE_PREFETCH_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_CMP_DBG_ECO_CNTL, 0, BIT(PIPE_BR) },
+	{ REG_A7XX_VFD_DBG_ECO_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BV_THRESHOLD, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BR_THRESHOLD, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_BUSY_REQ_CNT, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VFD_CB_LP_REQ_CNT, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_VPC_FLATSHADE_MODE_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) },
+	{ REG_A8XX_CP_HW_FAULT_STATUS_MASK_PIPE, 0, BIT(PIPE_BR) |
+		BIT(PIPE_BV) | BIT(PIPE_LPAC) | BIT(PIPE_AQE0) |
+		BIT(PIPE_AQE1) | BIT(PIPE_DDE_BR) | BIT(PIPE_DDE_BV) },
+	{ REG_A8XX_CP_INTERRUPT_STATUS_MASK_PIPE, 0, BIT(PIPE_BR) |
+		BIT(PIPE_BV) | BIT(PIPE_LPAC) | BIT(PIPE_AQE0) |
+		BIT(PIPE_AQE1) | BIT(PIPE_DDE_BR) | BIT(PIPE_DDE_BV) },
+	{ REG_A8XX_CP_PROTECT_CNTL_PIPE, 0, BIT(PIPE_BR) | BIT(PIPE_BV) | BIT(PIPE_LPAC)},
+	{ REG_A8XX_CP_PROTECT_PIPE(15), 0, BIT(PIPE_BR) | BIT(PIPE_BV) | BIT(PIPE_LPAC) },
+	{ REG_A8XX_RB_GC_GMEM_PROTECT, 0, BIT(PIPE_BR) },
+	{ REG_A8XX_RB_LPAC_GMEM_PROTECT, 0, BIT(PIPE_BR) },
+	{ REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, 0, BIT(PIPE_BR) },
+};
+DECLARE_ADRENO_REGLIST_PIPE_LIST(x285_dyn_pwrup_reglist);
+
 static const struct adreno_reglist_pipe a840_nonctxt_regs[] = {
 	{ REG_A8XX_CP_SMMU_STREAM_ID_LPAC, 0x00000101, BIT(PIPE_NONE) },
 	{ REG_A8XX_GRAS_DBG_ECO_CNTL, 0x00000800, BIT(PIPE_BV) | BIT(PIPE_BR) },
@@ -2082,11 +2120,15 @@ static const struct adreno_info a8xx_gpus[] = {
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
 			  ADRENO_QUIRK_HAS_HW_APRIV |
+			  ADRENO_QUIRK_PREEMPTION |
 			  ADRENO_QUIRK_SOFTFUSE,
 		.funcs = &a8xx_gpu_funcs,
 		.a6xx = &(const struct a6xx_info) {
 			.protect = &x285_protect,
 			.nonctxt_reglist = x285_nonctxt_regs,
+			.pwrup_reglist = &a840_pwrup_reglist,
+			.dyn_pwrup_reglist = &x285_dyn_pwrup_reglist,
+			.ifpc_reglist = &a840_ifpc_reglist,
 			.gbif_cx = a840_gbif,
 			.max_slices = 4,
 			.gmu_chipid = 0x8010100,

-- 
2.51.0



* [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support
  2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
                   ` (15 preceding siblings ...)
  2026-03-27  0:14 ` [PATCH v2 16/17] drm/msm/a6xx: Enable Preemption on X2-85 Akhil P Oommen
@ 2026-03-27  0:14 ` Akhil P Oommen
  2026-03-27 11:19   ` Konrad Dybcio
  16 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-27  0:14 UTC (permalink / raw)
  To: Rob Clark, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Akhil P Oommen

AQE (Application Qrisc Engine) is required to support VK ray-pipeline. Two
conditions should be met to use this HW:
  1. AQE firmware should be loaded and programmed
  2. Preemption support

Expose a new MSM_PARAM to allow userspace to query its support.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 13 +++++++++++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  1 +
 include/uapi/drm/msm_drm.h              |  1 +
 4 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 44ce02b412ca..4d2f6c4d86e3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2604,6 +2604,17 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu,
 	return 0;
 }
 
+static bool a6xx_aqe_is_enabled(struct adreno_gpu *adreno_gpu)
+{
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+
+	/*
+	 * AQE uses preemption context record as scratch pad, so check if
+	 * preemption is enabled
+	 */
+	return (adreno_gpu->base.nr_rings > 1) && !!a6xx_gpu->aqe_bo;
+}
+
 static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 {
 	struct msm_drm_private *priv = dev->dev_private;
@@ -2802,6 +2813,7 @@ const struct adreno_gpu_funcs a7xx_gpu_funcs = {
 	.bus_halt = a6xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a6xx_fault_handler,
 	.gx_is_on = a7xx_gmu_gx_is_on,
+	.aqe_is_enabled = a6xx_aqe_is_enabled,
 };
 
 const struct adreno_gpu_funcs a8xx_gpu_funcs = {
@@ -2830,4 +2842,5 @@ const struct adreno_gpu_funcs a8xx_gpu_funcs = {
 	.bus_halt = a8xx_bus_clear_pending_transactions,
 	.mmu_fault_handler = a8xx_fault_handler,
 	.gx_is_on = a8xx_gmu_gx_is_on,
+	.aqe_is_enabled = a6xx_aqe_is_enabled,
 };
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index f6c39aed50f2..78c348852b4b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -444,6 +444,10 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
 	case MSM_PARAM_HAS_PRR:
 		*value = adreno_smmu_has_prr(gpu);
 		return 0;
+	case MSM_PARAM_AQE:
+		*value = !!(adreno_gpu->funcs->aqe_is_enabled &&
+			    adreno_gpu->funcs->aqe_is_enabled(adreno_gpu));
+		return 0;
 	default:
 		return UERR(EINVAL, drm, "%s: invalid param: %u", gpu->name, param);
 	}
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 044ed4d49aa7..c0ee544ce257 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -84,6 +84,7 @@ struct adreno_gpu_funcs {
 	void (*bus_halt)(struct adreno_gpu *adreno_gpu, bool gx_off);
 	int (*mmu_fault_handler)(void *arg, unsigned long iova, int flags, void *data);
 	bool (*gx_is_on)(struct adreno_gpu *adreno_gpu);
+	bool (*aqe_is_enabled)(struct adreno_gpu *adreno_gpu);
 };
 
 struct adreno_reglist {
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 5c67294edc95..b99098792371 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -117,6 +117,7 @@ struct drm_msm_timespec {
  * ioctl will throw -EPIPE.
  */
 #define MSM_PARAM_EN_VM_BIND 0x16  /* WO, once */
+#define MSM_PARAM_AQE	     0x17  /* RO */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #

-- 
2.51.0



* Re: [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q
  2026-03-27  0:13 ` [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q Akhil P Oommen
@ 2026-03-27 10:34   ` Konrad Dybcio
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 10:34 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> Add the Debug HFI Queue which contains the F2H messages posted from the
> GMU firmware. Having this data in coredump is useful to debug firmware
> issues.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Konrad


* Re: [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840
  2026-03-27  0:14 ` [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840 Akhil P Oommen
@ 2026-03-27 11:07   ` Konrad Dybcio
  2026-03-30 20:34     ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:07 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:14 AM, Akhil P Oommen wrote:
> Implement pwrup reglist support and add the necessary register
> configurations to enable IFPC support on A840
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

[...]

> +	REG_A8XX_CP_PROTECT_GLOBAL(51),
> +	REG_A8XX_CP_PROTECT_GLOBAL(52),
> +	REG_A8XX_CP_PROTECT_GLOBAL(53),
> +	REG_A8XX_CP_PROTECT_GLOBAL(54),
> +	REG_A8XX_CP_PROTECT_GLOBAL(55),
> +	REG_A8XX_CP_PROTECT_GLOBAL(56),
> +	REG_A8XX_CP_PROTECT_GLOBAL(57),
> +	REG_A8XX_CP_PROTECT_GLOBAL(58),
> +	REG_A8XX_CP_PROTECT_GLOBAL(59),
> +	REG_A8XX_CP_PROTECT_GLOBAL(60),
> +	REG_A8XX_CP_PROTECT_GLOBAL(61),
> +	REG_A8XX_CP_PROTECT_GLOBAL(62),

53..62 aren't listed by kgsl, but I suppose this is more robust

Similarly for:

SP_CHICKEN_BITS_4
RBBM_PERFCTR_CNTL

The other reglists look OK

[...]

> +	for (u32 pipe_id = PIPE_BR; pipe_id <= PIPE_DDE_BV; pipe_id++) {
> +		for (i = 0; i < dyn_pwrup_reglist->count; i++) {
> +			if (!(dyn_pwrup_reglist->regs[i].pipe & BIT(pipe_id)))
> +				continue;
> +			*dest++ = A8XX_CP_APERTURE_CNTL_HOST_PIPEID(pipe_id);
> +			*dest++ = dyn_pwrup_reglist->regs[i].offset;
> +			*dest++ = a8xx_read_pipe_slice(gpu,
> +						       pipe_id,
> +						       a8xx_get_first_slice(a6xx_gpu),

IDK if the compiler is smart enough to pick it up itself, but
you could do const u32 first_slice = a8xx.. beforehand to save a
couple cycles

[...]

> +	if (!a6xx_gpu->pwrup_reglist_emitted) {
> +		a8xx_patch_pwrup_reglist(gpu);
> +		a6xx_gpu->pwrup_reglist_emitted = true;

Would this flag be better set in the function above?

Konrad


* Re: [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840
  2026-03-27  0:14 ` [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840 Akhil P Oommen
@ 2026-03-27 11:14   ` Konrad Dybcio
  2026-03-30 20:35     ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:14 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:14 AM, Akhil P Oommen wrote:
> Add the SKU table in the catalog for A840 GPU. This data helps to pick
> the correct bin from the OPP table based on the speed_bin fuse value.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> index 38561f26837e..f6b9792531a6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> @@ -1954,6 +1954,12 @@ static const struct adreno_info a8xx_gpus[] = {
>  			},
>  		},
>  		.preempt_record_size = 19708 * SZ_1K,
> +		.speedbins = ADRENO_SPEEDBINS(
> +			{ 0,   0 },
> +			{ 273, 1 },
> +			{ 252, 2 },
> +			{ 221, 3 },

FWIW downstream maps it like this

221 -> 0
252 -> 1
273 -> 2
default -> 3

although it's of course just between the kernel and the DT

Konrad


* Re: [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support
  2026-03-27  0:14 ` [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support Akhil P Oommen
@ 2026-03-27 11:16   ` Konrad Dybcio
  2026-03-30 20:36     ` Akhil P Oommen
  2026-04-11  2:19   ` Jie Gan
  1 sibling, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:16 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:14 AM, Akhil P Oommen wrote:
> Recent chipsets like Glymur support a new mechanism for SKU detection.
> A new CX_MISC register exposes the combined (or final) speedbin value
> from both the HW fuse register and the Soft Fuse register. Implement this
> new SKU detection along with a new quirk to identify the GPUs that have
> soft fuse support.
> 
> There is a side effect of this patch on A4x and older series. The
> speedbin field in the MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
> should be okay as Mesa correctly handles it. Speedbin was not even a
> thing when support for those GPUs was added.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

My comments here remain

https://lore.kernel.org/linux-arm-msm/20260324-a8xx-gpu-batch2-v1-0-fc95b8d9c017@oss.qualcomm.com/T/#mc97f65496ba60ecfc977bce1b5bb6d3922711ae3

Konrad


* Re: [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support
  2026-03-27  0:14 ` [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support Akhil P Oommen
@ 2026-03-27 11:19   ` Konrad Dybcio
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:19 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:14 AM, Akhil P Oommen wrote:
> AQE (Application Qrisc Engine) is required to support VK ray-pipeline. Two
> conditions should be met to use this HW:
>   1. AQE firmware should be loaded and programmed
>   2. Preemption support
> 
> Expose a new MSM_PARAM to allow userspace to query its support.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

[...]

> +	case MSM_PARAM_AQE:
> +		*value = !!(adreno_gpu->funcs->aqe_is_enabled &&
> +			    adreno_gpu->funcs->aqe_is_enabled(adreno_gpu));

again, this is a boolean even without the !! because of the inner &&

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Konrad


* Re: [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter
  2026-03-27  0:13 ` [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter Akhil P Oommen
@ 2026-03-27 11:32   ` Konrad Dybcio
  2026-03-30 20:51     ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:32 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> CP_ALWAYS_ON_COUNTER is not saved/restored during preemption, so it won't
> provide accurate data about the 'submit' when preemption is enabled.
> Switch to CP_ALWAYS_ON_CONTEXT which is preemption safe.
> 
> Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets")
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

(this is not a blocking comment)

would the _COUNTER counter be useful to keep for any other purpose?

Konrad


* Re: [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-27  0:13 ` [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures Akhil P Oommen
@ 2026-03-27 11:35   ` Konrad Dybcio
  2026-03-30 20:53     ` Akhil P Oommen
  2026-03-30 19:51   ` Rob Clark
  1 sibling, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:35 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> Capture coredump on GPU or GMU errors during initialization to help in
> debugging the issues. To be consistent with the locks while calling
> msm_gpu_crashstate_capture(), always call pm_runtime_get(gpu) with
> msm_gpu->lock held.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

[...]


> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -391,10 +391,13 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
>  		return 0;
>  	case MSM_PARAM_TIMESTAMP:
>  		if (adreno_gpu->funcs->get_timestamp) {
> +			mutex_lock(&gpu->lock);
>  			pm_runtime_get_sync(&gpu->pdev->dev);
> +
>  			*value = adreno_gpu->funcs->get_timestamp(gpu);
> -			pm_runtime_put_autosuspend(&gpu->pdev->dev);
>  
> +			pm_runtime_put_autosuspend(&gpu->pdev->dev);
> +			mutex_unlock(&gpu->lock);

This is something to take care of in a separate patch, but get_sync may
fail and then the read could crash the device (GMU may be off too)

put_autosuspend could theoretically fail too, but perhaps -edontcarethatmuch

Konrad

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces
  2026-03-27  0:13 ` [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces Akhil P Oommen
@ 2026-03-27 11:37   ` Konrad Dybcio
  2026-03-30 20:58     ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-27 11:37 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> GMU_ALWAYS_ON_COUNTER_* registers got moved in A8x, but currently, A6x
> register offsets are used in the submit traces instead of A8x offsets.
> To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
> everywhere.
> 
> While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
> counter.

Which I assume does not need the GMU to be in any particular state, just
have its clocks enabled?

Konrad

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions
  2026-03-27  0:13 ` [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions Akhil P Oommen
@ 2026-03-30 11:15   ` Konrad Dybcio
  2026-03-30 20:27     ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-30 11:15 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel

On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> Update the HFI definitions to support additional GMU based power
> features.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---

Whether you want to proceed with bitfields or not:

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Konrad

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-27  0:13 ` [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures Akhil P Oommen
  2026-03-27 11:35   ` Konrad Dybcio
@ 2026-03-30 19:51   ` Rob Clark
  2026-03-31 18:05     ` Akhil P Oommen
  1 sibling, 1 reply; 45+ messages in thread
From: Rob Clark @ 2026-03-30 19:51 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, linux-arm-msm, dri-devel,
	freedreno, linux-kernel

On Thu, Mar 26, 2026 at 5:15 PM Akhil P Oommen <akhilpo@oss.qualcomm.com> wrote:
>
> Capture coredump on GPU or GMU errors during initialization to help in
> debugging the issues. To be consistent with the locks while calling
> msm_gpu_crashstate_capture(), call pm_runtime_get(gpu) always with
> msm_gpu->lock.
>
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 1 +
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 5 +++--
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 5 ++++-
>  drivers/gpu/drm/msm/msm_gpu.c              | 5 +++--
>  drivers/gpu/drm/msm/msm_gpu.h              | 2 ++
>  5 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 1b44b9e21ad8..916c5d99c4d1 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1335,6 +1335,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
>
>  disable_irq:
>         disable_irq(gmu->gmu_irq);
> +       msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>         a6xx_rpmh_stop(gmu);
>  disable_clk:
>         clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 4edfe80c5be7..ca5f96e16870 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -105,6 +105,8 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
>          */
>         pm_runtime_enable(&pdev->dev);
>
> +       guard(mutex)(&gpu->lock);

so, I am a fan of guard(), but I realized this means we are holding
gpu->lock across the debugfs_init() (only a5xx).. which I suspect will
anger lockdep, although I don't have a good way to test that atm.

BR,
-R

> +
>         ret = pm_runtime_get_sync(&pdev->dev);
>         if (ret < 0) {
>                 pm_runtime_put_noidle(&pdev->dev);
> @@ -112,10 +114,9 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
>                 goto err_disable_rpm;
>         }
>
> -       mutex_lock(&gpu->lock);
>         ret = msm_gpu_hw_init(gpu);
> -       mutex_unlock(&gpu->lock);
>         if (ret) {
> +               msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>                 DRM_DEV_ERROR(dev->dev, "gpu hw init failed: %d\n", ret);
>                 goto err_put_rpm;
>         }
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 785e99fb5bd5..8475802fdde2 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -391,10 +391,13 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
>                 return 0;
>         case MSM_PARAM_TIMESTAMP:
>                 if (adreno_gpu->funcs->get_timestamp) {
> +                       mutex_lock(&gpu->lock);
>                         pm_runtime_get_sync(&gpu->pdev->dev);
> +
>                         *value = adreno_gpu->funcs->get_timestamp(gpu);
> -                       pm_runtime_put_autosuspend(&gpu->pdev->dev);
>
> +                       pm_runtime_put_autosuspend(&gpu->pdev->dev);
> +                       mutex_unlock(&gpu->lock);
>                         return 0;
>                 }
>                 return -EINVAL;
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index 995549d0bbbc..472db2c916f9 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -361,7 +361,7 @@ static void crashstate_get_vm_logs(struct msm_gpu_state *state, struct msm_gem_v
>         mutex_unlock(&vm->mmu_lock);
>  }
>
> -static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
> +void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
>                 struct msm_gem_submit *submit, struct msm_gpu_fault_info *fault_info,
>                 char *comm, char *cmd)
>  {
> @@ -886,7 +886,8 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>
>         pm_runtime_get_sync(&gpu->pdev->dev);
>
> -       msm_gpu_hw_init(gpu);
> +       if (msm_gpu_hw_init(gpu))
> +               msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>
>         submit->seqno = submit->hw_fence->seqno;
>
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 666cf499b7ec..eb5b3a7b81f9 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -747,6 +747,8 @@ static inline void msm_gpu_crashstate_put(struct msm_gpu *gpu)
>  }
>
>  void msm_gpu_fault_crashstate_capture(struct msm_gpu *gpu, struct msm_gpu_fault_info *fault_info);
> +void msm_gpu_crashstate_capture(struct msm_gpu *gpu, struct msm_gem_submit *submit,
> +               struct msm_gpu_fault_info *fault_info, char *comm, char *cmd);
>
>  /*
>   * Simple macro to semi-cleanly add the MAP_PRIV flag for targets that can
>
> --
> 2.51.0
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions
  2026-03-30 11:15   ` Konrad Dybcio
@ 2026-03-30 20:27     ` Akhil P Oommen
  2026-03-31  8:22       ` Konrad Dybcio
  0 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:27 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/30/2026 4:45 PM, Konrad Dybcio wrote:
> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>> Update the HFI definitions to support additional GMU based power
>> features.
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
> 
> Whether you want to proceed with bitfields or not:
> 
> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Thanks. I still feel the bitfield layout is a 'bit' complicated.

I did an experiment:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct packed_u8 {
        uint16_t mitigation_time;
        uint8_t current_rating    : 6;
        uint8_t num_phases        : 6;
        uint8_t path_notification : 1;
        uint8_t extra_features    : 3;
} __packed;


void main() {

        struct packed_u8 data = {
                 .mitigation_time  = 0xffff,
                 .current_rating   = 0x3f,  /* all 6 bits set */
                 .num_phases       = 0x3f,
                 .path_notification = 1,
                 .extra_features   = 0x7,
         };

        printf("Akhil 0x%x\n", *((uint32_t *) &data));
}

The output I got in Kaanapali is: Akhil 0x7f3fffff

This means that the compiler inserted a padding between current_rating
and num_phases.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840
  2026-03-27 11:07   ` Konrad Dybcio
@ 2026-03-30 20:34     ` Akhil P Oommen
  0 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:34 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/27/2026 4:37 PM, Konrad Dybcio wrote:
> On 3/27/26 1:14 AM, Akhil P Oommen wrote:
>> Implement pwrup reglist support and add the necessary register
>> configurations to enable IFPC support on A840
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
> 
> [...]
> 
>> +	REG_A8XX_CP_PROTECT_GLOBAL(51),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(52),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(53),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(54),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(55),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(56),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(57),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(58),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(59),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(60),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(61),
>> +	REG_A8XX_CP_PROTECT_GLOBAL(62),
> 
> 53..62 aren't listed by kgsl, but I suppose this is more robust
> 
> Similarly for:
> 
> SP_CHICKEN_BITS_4
> RBBM_PERFCTR_CNTL

I can see SP_CHICKEN_BITS_4 in the recent kgsl code. RBBM_PERFCTR_CNTL
is required as we don't handle perfcounter management in this driver yet.

> 
> The other reglists look OK
> 
> [...]
> 
>> +	for (u32 pipe_id = PIPE_BR; pipe_id <= PIPE_DDE_BV; pipe_id++) {
>> +		for (i = 0; i < dyn_pwrup_reglist->count; i++) {
>> +			if (!(dyn_pwrup_reglist->regs[i].pipe & BIT(pipe_id)))
>> +				continue;
>> +			*dest++ = A8XX_CP_APERTURE_CNTL_HOST_PIPEID(pipe_id);
>> +			*dest++ = dyn_pwrup_reglist->regs[i].offset;
>> +			*dest++ = a8xx_read_pipe_slice(gpu,
>> +						       pipe_id,
>> +						       a8xx_get_first_slice(a6xx_gpu),
> 
> IDK if the compiler is smart enough to pick it up itself, but
> you could do const u32 first_slice = a8xx.. beforehand to save a
> couple cycles

This code is not called frequently enough to worry about micro-optimizations.

> 
> [...]
> 
>> +	if (!a6xx_gpu->pwrup_reglist_emitted) {
>> +		a8xx_patch_pwrup_reglist(gpu);
>> +		a6xx_gpu->pwrup_reglist_emitted = true;
> 
> Would this flag be better set in the function above?

Agree, but we should update a6xx_gpu.c too. Let's do that separately.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840
  2026-03-27 11:14   ` Konrad Dybcio
@ 2026-03-30 20:35     ` Akhil P Oommen
  0 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:35 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/27/2026 4:44 PM, Konrad Dybcio wrote:
> On 3/27/26 1:14 AM, Akhil P Oommen wrote:
>> Add the SKU table in the catalog for A840 GPU. This data helps to pick
>> the correct bin from the OPP table based on the speed_bin fuse value.
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
>>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> index 38561f26837e..f6b9792531a6 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> @@ -1954,6 +1954,12 @@ static const struct adreno_info a8xx_gpus[] = {
>>  			},
>>  		},
>>  		.preempt_record_size = 19708 * SZ_1K,
>> +		.speedbins = ADRENO_SPEEDBINS(
>> +			{ 0,   0 },
>> +			{ 273, 1 },
>> +			{ 252, 2 },
>> +			{ 221, 3 },
> 
> FWIW downstream maps it like this
> 
> 221 -> 0
> 252 -> 1
> 273 -> 2
> default -> 3
> 
> although it's of course just between the kernel and the DT

Right. It doesn't matter as long as the DT is aligned with this mapping.

-Akhil.


> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support
  2026-03-27 11:16   ` Konrad Dybcio
@ 2026-03-30 20:36     ` Akhil P Oommen
  0 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:36 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/27/2026 4:46 PM, Konrad Dybcio wrote:
> On 3/27/26 1:14 AM, Akhil P Oommen wrote:
>> Recent chipsets like Glymur support a new mechanism for SKU detection.
>> A new CX_MISC register exposes the combined (or final) speedbin value
>> from both the HW fuse register and the Soft Fuse register. Implement
>> this new SKU detection along with a new quirk to identify the GPUs that
>> have soft fuse support.
>>
>> There is a side effect of this patch on A4x and older series. The
>> speedbin field in the MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
>> should be okay as Mesa handles it correctly. Speedbin was not even a
>> thing when support for those GPUs was added.
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
> 
> My comments here remain
> 
> https://lore.kernel.org/linux-arm-msm/20260324-a8xx-gpu-batch2-v1-0-fc95b8d9c017@oss.qualcomm.com/T/#mc97f65496ba60ecfc977bce1b5bb6d3922711ae3
> 

We can ignore the SOFTSKUDISABLED bit in the driver.

-Akhil.

> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter
  2026-03-27 11:32   ` Konrad Dybcio
@ 2026-03-30 20:51     ` Akhil P Oommen
  2026-03-30 21:34       ` Rob Clark
  0 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:51 UTC (permalink / raw)
  To: Konrad Dybcio, Rob Clark
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Sean Paul,
	Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang,
	Marijn Suijten, David Airlie, Simona Vetter, Antonino Maniscalco,
	Connor Abbott, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann

On 3/27/2026 5:02 PM, Konrad Dybcio wrote:
> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>> CP_ALWAYS_ON_COUNTER is not save-restored during preemption, so it won't
>> provide accurate data about the 'submit' when preemption is enabled.
>> Switch to CP_ALWAYS_ON_CONTEXT which is preemption safe.
>>
>> Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets")
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
> 
> (this is not a blocking comment)
> 
> would the _COUNTER counter be useful to keep for any other purpose?

That is actually a good question. I was so fixated on the fdinfo aspect
that I missed that this is used in the submit_retire trace too. In the
trace, I think it is preferable to use the _COUNTER (or keep both) so
that we can visualize the order in which the GPU processed the submits.

Rob, what do you think? If required, we can drop this patch from this
series. Or I can send a fixup, as this affects only ftrace.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-27 11:35   ` Konrad Dybcio
@ 2026-03-30 20:53     ` Akhil P Oommen
  0 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:53 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/27/2026 5:05 PM, Konrad Dybcio wrote:
> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>> Capture coredump on GPU or GMU errors during initialization to help in
>> debugging the issues. To be consistent with the locks while calling
>> msm_gpu_crashstate_capture(), call pm_runtime_get(gpu) always with
>> msm_gpu->lock.
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
> 
> [...]
> 
> 
>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>> @@ -391,10 +391,13 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
>>  		return 0;
>>  	case MSM_PARAM_TIMESTAMP:
>>  		if (adreno_gpu->funcs->get_timestamp) {
>> +			mutex_lock(&gpu->lock);
>>  			pm_runtime_get_sync(&gpu->pdev->dev);
>> +
>>  			*value = adreno_gpu->funcs->get_timestamp(gpu);
>> -			pm_runtime_put_autosuspend(&gpu->pdev->dev);
>>  
>> +			pm_runtime_put_autosuspend(&gpu->pdev->dev);
>> +			mutex_unlock(&gpu->lock);
> 
> This is something to take care of in a separate patch, but get_sync may
> fail and then the read could crash the device (GMU may be off too)
> 
> put_autosuspend could theoretically fail too, but perhaps -edontcarethatmuch

Right. That is something we should fix everywhere separately.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces
  2026-03-27 11:37   ` Konrad Dybcio
@ 2026-03-30 20:58     ` Akhil P Oommen
  2026-03-31  8:23       ` Konrad Dybcio
  0 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-30 20:58 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/27/2026 5:07 PM, Konrad Dybcio wrote:
> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>> GMU_ALWAYS_ON_COUNTER_* registers got moved in A8x, but currently, A6x
>> register offsets are used in the submit traces instead of A8x offsets.
>> To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
>> everywhere.
>>
>> While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
>> counter.
> 
> Which I assume does not need the GMU to be in any particular state, just
> have its clocks enabled?

On A750+, there is a CX timer calibration during GMU resume to match the
GPU AO counters with the CPU counter. That should be complete before we
read this.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter
  2026-03-30 20:51     ` Akhil P Oommen
@ 2026-03-30 21:34       ` Rob Clark
  0 siblings, 0 replies; 45+ messages in thread
From: Rob Clark @ 2026-03-30 21:34 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: Konrad Dybcio, linux-arm-msm, dri-devel, freedreno, linux-kernel,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On Mon, Mar 30, 2026 at 1:51 PM Akhil P Oommen <akhilpo@oss.qualcomm.com> wrote:
>
> On 3/27/2026 5:02 PM, Konrad Dybcio wrote:
> > On 3/27/26 1:13 AM, Akhil P Oommen wrote:
> >> CP_ALWAYS_ON_COUNTER is not save-restored during preemption, so it won't
> >> provide accurate data about the 'submit' when preemption is enabled.
> >> Switch to CP_ALWAYS_ON_CONTEXT which is preemption safe.
> >>
> >> Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets")
> >> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> >> ---
> >
> > (this is not a blocking comment)
> >
> > would the _COUNTER counter be useful to keep for any other purpose?
>
> That is actually a good question. I was so fixated on the fdinfo aspect
> that I missed that this is used in the submit_retire trace too. In the
> trace, I think it is preferable to use the _COUNTER (or keep both) so
> that we can visualize the order in which the GPU processed the submits.
>
> Rob, what do you think? If required, we can drop this patch from this
> series. Or I can send a fixup, as this affects only ftrace.

So.. there was some really old ftrace-based gpu timeline visualization
in chrome ("catapult"), but that has been superseded by perfetto
userspace traces.  The latter uses the RB_DONE_TS event to record
timestamps.

I guess being able to record a sensible duration is the main thing.

BR,
-R

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions
  2026-03-30 20:27     ` Akhil P Oommen
@ 2026-03-31  8:22       ` Konrad Dybcio
  2026-03-31 17:44         ` Akhil P Oommen
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-31  8:22 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/30/26 10:27 PM, Akhil P Oommen wrote:
> On 3/30/2026 4:45 PM, Konrad Dybcio wrote:
>> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>>> Update the HFI definitions to support additional GMU based power
>>> features.
>>>
>>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>>> ---
>>
>> Whether you want to proceed with bitfields or not:
>>
>> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
> 
> Thanks. I still feel the bitfield layout is a 'bit' complicated.
> 
> I did an experiment:
> 
> #include <stdio.h>
> #include <stdint.h>
> #include <stddef.h>
> 
> struct packed_u8 {
>         uint16_t mitigation_time;
>         uint8_t current_rating    : 6;
>         uint8_t num_phases        : 6;
>         uint8_t path_notification : 1;
>         uint8_t extra_features    : 3;
> } __packed;
> 
> 
> void main() {
> 
>         struct packed_u8 data = {
>                  .mitigation_time  = 0xffff,
>                  .current_rating   = 0x3f,  /* all 6 bits set */
>                  .num_phases       = 0x3f,
>                  .path_notification = 1,
>                  .extra_features   = 0x7,
>          };
> 
>         printf("Akhil 0x%x\n", *((uint32_t *) &data));
> }
> 
> The output I got in Kaanapali is: Akhil 0x7f3fffff
> 
> This means that the compiler inserted padding between current_rating
> and num_phases.

That's because __packed doesn't work outside the kernel - you're just
creating a variable named __packed, so this is effectively the same as:

struct foo {
        int a;
        int b;
        unsigned char c : 2;
} __hotdog;

int main () {
        printf("Akhil 0x%x\n", *((uint32_t *) &__hotdog));
}

Outside the kernel tree, you need to use the full annoying __attribute__((foo))
syntax:

include/linux/compiler_attributes.h:#define __packed                        __attribute__((__packed__))

with that changed, we get:

Akhil 0xffffffff

which is the expected behavior

Konrad

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces
  2026-03-30 20:58     ` Akhil P Oommen
@ 2026-03-31  8:23       ` Konrad Dybcio
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Dybcio @ 2026-03-31  8:23 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/30/26 10:58 PM, Akhil P Oommen wrote:
> On 3/27/2026 5:07 PM, Konrad Dybcio wrote:
>> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>>> GMU_ALWAYS_ON_COUNTER_* registers got moved in A8x, but currently, A6x
>>> register offsets are used in the submit traces instead of A8x offsets.
>>> To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
>>> everywhere.
>>>
>>> While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
>>> counter.
>>
>> Which I assume does not need the GMU to be in any particular state, just
>> have its clocks enabled?
> 
> On A750+, there is a cx timer calibration during gmu resume to match the
> GPU AO counters with the CPU counter. That should be complete before we
> read this.

Worth a comment!

Konrad

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions
  2026-03-31  8:22       ` Konrad Dybcio
@ 2026-03-31 17:44         ` Akhil P Oommen
  0 siblings, 0 replies; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-31 17:44 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel, Rob Clark,
	Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann

On 3/31/2026 1:52 PM, Konrad Dybcio wrote:
> On 3/30/26 10:27 PM, Akhil P Oommen wrote:
>> On 3/30/2026 4:45 PM, Konrad Dybcio wrote:
>>> On 3/27/26 1:13 AM, Akhil P Oommen wrote:
>>>> Update the HFI definitions to support additional GMU based power
>>>> features.
>>>>
>>>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>>>> ---
>>>
>>> Whether you want to proceed with bitfields or not:
>>>
>>> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
>>
>> Thanks. I still feel the bitfield layout is a 'bit' complicated.
>>
>> I did an experiment:
>>
>> #include <stdio.h>
>> #include <stdint.h>
>> #include <stddef.h>
>>
>> struct packed_u8 {
>>         uint16_t mitigation_time;
>>         uint8_t current_rating    : 6;
>>         uint8_t num_phases        : 6;
>>         uint8_t path_notification : 1;
>>         uint8_t extra_features    : 3;
>> } __packed;
>>
>>
>> void main() {
>>
>>         struct packed_u8 data = {
>>                  .mitigation_time  = 0xffff,
>>                  .current_rating   = 0x3f,  /* all 6 bits set */
>>                  .num_phases       = 0x3f,
>>                  .path_notification = 1,
>>                  .extra_features   = 0x7,
>>          };
>>
>>         printf("Akhil 0x%x\n", *((uint32_t *) &data));
>> }
>>
>> The output I got in Kaanapali is: Akhil 0x7f3fffff
>>
>> This means that the compiler inserted padding between current_rating
>> and num_phases.
> 
> That's because __packed doesn't work outside the kernel - you're just
> creating a variable named __packed, so this is effectively the same as:
> 
> struct foo {
>         int a;
>         int b;
>         unsigned char c : 2;
> } __hotdog;
> 
> int main () {
>         printf("Akhil 0x%x\n", *((uint32_t *) &__hotdog));
> }
> 
> Outside the kernel tree, you need to use the full annoying __attribute__((foo))
> syntax:
> 
> include/linux/compiler_attributes.h:#define __packed                        __attribute__((__packed__))
> 
> with that changed, we get:
> 
> Akhil 0xffffffff
> 
> which is the expected behavior

Yeah, I can confirm this. The AAPCS doc describing this is a bit
difficult to decipher for the case where a bitfield straddles two
storage units. Anyway, the experiment confirms that your assumption is
correct.

Will address this separately when I post the CLX patches.

-Akhil.

> 
> Konrad


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-30 19:51   ` Rob Clark
@ 2026-03-31 18:05     ` Akhil P Oommen
  2026-03-31 18:28       ` Rob Clark
  0 siblings, 1 reply; 45+ messages in thread
From: Akhil P Oommen @ 2026-03-31 18:05 UTC (permalink / raw)
  To: rob.clark
  Cc: Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, linux-arm-msm, dri-devel,
	freedreno, linux-kernel

On 3/31/2026 1:21 AM, Rob Clark wrote:
> On Thu, Mar 26, 2026 at 5:15 PM Akhil P Oommen <akhilpo@oss.qualcomm.com> wrote:
>>
>> Capture coredump on GPU or GMU errors during initialization to help in
>> debugging the issues. To be consistent with the locks while calling
>> msm_gpu_crashstate_capture(), call pm_runtime_get(gpu) always with
>> msm_gpu->lock.
>>
>> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
>> ---
>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 1 +
>>  drivers/gpu/drm/msm/adreno/adreno_device.c | 5 +++--
>>  drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 5 ++++-
>>  drivers/gpu/drm/msm/msm_gpu.c              | 5 +++--
>>  drivers/gpu/drm/msm/msm_gpu.h              | 2 ++
>>  5 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 1b44b9e21ad8..916c5d99c4d1 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> @@ -1335,6 +1335,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
>>
>>  disable_irq:
>>         disable_irq(gmu->gmu_irq);
>> +       msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>>         a6xx_rpmh_stop(gmu);
>>  disable_clk:
>>         clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
>> index 4edfe80c5be7..ca5f96e16870 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
>> @@ -105,6 +105,8 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
>>          */
>>         pm_runtime_enable(&pdev->dev);
>>
>> +       guard(mutex)(&gpu->lock);
> 
> so, I am a fan of guard(), but I realized this means we are holding
> gpu->lock across the debugfs_init() (only a5xx).. which I suspect will
> anger lockdep, although I don't have a good way to test that atm.

Shall we drop this patch for now? I will send a respin later.

-Akhil.

> 
> BR,
> -R
> 
>> +
>>         ret = pm_runtime_get_sync(&pdev->dev);
>>         if (ret < 0) {
>>                 pm_runtime_put_noidle(&pdev->dev);
>> @@ -112,10 +114,9 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
>>                 goto err_disable_rpm;
>>         }
>>
>> -       mutex_lock(&gpu->lock);
>>         ret = msm_gpu_hw_init(gpu);
>> -       mutex_unlock(&gpu->lock);
>>         if (ret) {
>> +               msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>>                 DRM_DEV_ERROR(dev->dev, "gpu hw init failed: %d\n", ret);
>>                 goto err_put_rpm;
>>         }
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>> index 785e99fb5bd5..8475802fdde2 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>> @@ -391,10 +391,13 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx,
>>                 return 0;
>>         case MSM_PARAM_TIMESTAMP:
>>                 if (adreno_gpu->funcs->get_timestamp) {
>> +                       mutex_lock(&gpu->lock);
>>                         pm_runtime_get_sync(&gpu->pdev->dev);
>> +
>>                         *value = adreno_gpu->funcs->get_timestamp(gpu);
>> -                       pm_runtime_put_autosuspend(&gpu->pdev->dev);
>>
>> +                       pm_runtime_put_autosuspend(&gpu->pdev->dev);
>> +                       mutex_unlock(&gpu->lock);
>>                         return 0;
>>                 }
>>                 return -EINVAL;
>> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
>> index 995549d0bbbc..472db2c916f9 100644
>> --- a/drivers/gpu/drm/msm/msm_gpu.c
>> +++ b/drivers/gpu/drm/msm/msm_gpu.c
>> @@ -361,7 +361,7 @@ static void crashstate_get_vm_logs(struct msm_gpu_state *state, struct msm_gem_v
>>         mutex_unlock(&vm->mmu_lock);
>>  }
>>
>> -static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
>> +void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
>>                 struct msm_gem_submit *submit, struct msm_gpu_fault_info *fault_info,
>>                 char *comm, char *cmd)
>>  {
>> @@ -886,7 +886,8 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>>
>>         pm_runtime_get_sync(&gpu->pdev->dev);
>>
>> -       msm_gpu_hw_init(gpu);
>> +       if (msm_gpu_hw_init(gpu))
>> +               msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
>>
>>         submit->seqno = submit->hw_fence->seqno;
>>
>> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
>> index 666cf499b7ec..eb5b3a7b81f9 100644
>> --- a/drivers/gpu/drm/msm/msm_gpu.h
>> +++ b/drivers/gpu/drm/msm/msm_gpu.h
>> @@ -747,6 +747,8 @@ static inline void msm_gpu_crashstate_put(struct msm_gpu *gpu)
>>  }
>>
>>  void msm_gpu_fault_crashstate_capture(struct msm_gpu *gpu, struct msm_gpu_fault_info *fault_info);
>> +void msm_gpu_crashstate_capture(struct msm_gpu *gpu, struct msm_gem_submit *submit,
>> +               struct msm_gpu_fault_info *fault_info, char *comm, char *cmd);
>>
>>  /*
>>   * Simple macro to semi-cleanly add the MAP_PRIV flag for targets that can
>>
>> --
>> 2.51.0
>>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures
  2026-03-31 18:05     ` Akhil P Oommen
@ 2026-03-31 18:28       ` Rob Clark
  0 siblings, 0 replies; 45+ messages in thread
From: Rob Clark @ 2026-03-31 18:28 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: Sean Paul, Konrad Dybcio, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Marijn Suijten, David Airlie, Simona Vetter,
	Antonino Maniscalco, Connor Abbott, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, linux-arm-msm, dri-devel,
	freedreno, linux-kernel

On Tue, Mar 31, 2026 at 11:05 AM Akhil P Oommen
<akhilpo@oss.qualcomm.com> wrote:
>
> On 3/31/2026 1:21 AM, Rob Clark wrote:
> > On Thu, Mar 26, 2026 at 5:15 PM Akhil P Oommen <akhilpo@oss.qualcomm.com> wrote:
> >>
> >> Capture a coredump on GPU or GMU errors during initialization to help in
> >> debugging such issues. To be consistent about the locks held while calling
> >> msm_gpu_crashstate_capture(), always call pm_runtime_get(gpu) with
> >> msm_gpu->lock held.
> >>
> >> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 1 +
> >>  drivers/gpu/drm/msm/adreno/adreno_device.c | 5 +++--
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 5 ++++-
> >>  drivers/gpu/drm/msm/msm_gpu.c              | 5 +++--
> >>  drivers/gpu/drm/msm/msm_gpu.h              | 2 ++
> >>  5 files changed, 13 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index 1b44b9e21ad8..916c5d99c4d1 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -1335,6 +1335,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
> >>
> >>  disable_irq:
> >>         disable_irq(gmu->gmu_irq);
> >> +       msm_gpu_crashstate_capture(gpu, NULL, NULL, NULL, NULL);
> >>         a6xx_rpmh_stop(gmu);
> >>  disable_clk:
> >>         clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
> >> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> index 4edfe80c5be7..ca5f96e16870 100644
> >> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> @@ -105,6 +105,8 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
> >>          */
> >>         pm_runtime_enable(&pdev->dev);
> >>
> >> +       guard(mutex)(&gpu->lock);
> >
> > so, I am a fan of guard(), but I realized this means we are holding
> > gpu->lock across the debugfs_init() (only a5xx).. which I suspect will
> > anger lockdep, although I don't have a good way to test that atm.
>
> Shall we drop this patch for now? I will send a respin later.

I think I might, esp since the db820/a5xx ci runners seem to be down atm

BR,
-R

> -Akhil.
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support
  2026-03-27  0:14 ` [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support Akhil P Oommen
  2026-03-27 11:16   ` Konrad Dybcio
@ 2026-04-11  2:19   ` Jie Gan
  1 sibling, 0 replies; 45+ messages in thread
From: Jie Gan @ 2026-04-11  2:19 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel



On 3/27/2026 8:14 AM, Akhil P Oommen wrote:
> Recent chipsets like Glymur support a new mechanism for SKU detection.
> A new CX_MISC register exposes the combined (or final) speedbin value
> from both the HW fuse register and the soft fuse register. Implement
> this new SKU detection along with a new quirk to identify the GPUs that
> have soft fuse support.
> 
> There is a side effect of this patch on A4x and older series. The
> speedbin field in MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
> should be okay as Mesa handles it correctly. Speedbin was not even a
> thing when support for those GPUs was added.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---
>   drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |  6 ++++
>   drivers/gpu/drm/msm/adreno/a6xx_gpu.c         | 41 ++++++++++++++++++++++-----
>   drivers/gpu/drm/msm/adreno/adreno_gpu.c       |  5 ----
>   drivers/gpu/drm/msm/adreno/adreno_gpu.h       |  1 +
>   drivers/gpu/drm/msm/registers/adreno/a6xx.xml |  4 +++
>   5 files changed, 45 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 79a441e91fa1..d7ed3225f635 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -1731,6 +1731,7 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
>   	struct adreno_gpu *adreno_gpu;
>   	struct msm_gpu *gpu;
>   	unsigned int nr_rings;
> +	u32 speedbin;
>   	int ret;
>   
>   	a5xx_gpu = kzalloc(sizeof(*a5xx_gpu), GFP_KERNEL);
> @@ -1757,6 +1758,11 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
>   		return ERR_PTR(ret);
>   	}
>   
> +	/* Set the speedbin value that is passed to userspace */
> +	if (adreno_read_speedbin(&pdev->dev, &speedbin) || !speedbin)
> +		speedbin = 0xffff;
> +	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
> +
>   	msm_mmu_set_fault_handler(to_msm_vm(gpu->vm)->mmu, gpu,
>   				  a5xx_fault_handler);
>   
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 5cddfc03828f..fb9662b946d0 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2546,13 +2546,33 @@ static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse)
>   	return UINT_MAX;
>   }
>   
> -static int a6xx_set_supported_hw(struct device *dev, const struct adreno_info *info)
> +static int a6xx_read_speedbin(struct device *dev, struct a6xx_gpu *a6xx_gpu,
> +		const struct adreno_info *info, u32 *speedbin)
> +{
> +	int ret;
> +
> +	/* Use speedbin fuse if present. Otherwise, fallback to softfuse */
> +	ret = adreno_read_speedbin(dev, speedbin);
> +	if (ret != -ENOENT)
> +		return ret;
> +
> +	if (info->quirks & ADRENO_QUIRK_SOFTFUSE) {
> +		*speedbin = a6xx_llc_read(a6xx_gpu, REG_A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS);

a6xx_llc_read() accesses a6xx_gpu->llcc_mmio without checking that
a6xx_gpu->llcc_mmio is valid. If llcc_mmio is invalid, the subsequent
readl() in a6xx_llc_read() will crash with a kernel oops.

Thanks,
Jie

> +		*speedbin = A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS_FINALFREQLIMIT(*speedbin);
> +		return 0;
> +	}
> +
> +	return -ENOENT;
> +}
> +
> +static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu,
> +		const struct adreno_info *info)
>   {
>   	u32 supp_hw;
>   	u32 speedbin;
>   	int ret;
>   
> -	ret = adreno_read_speedbin(dev, &speedbin);
> +	ret = a6xx_read_speedbin(dev, a6xx_gpu, info, &speedbin);
>   	/*
>   	 * -ENOENT means that the platform doesn't support speedbin which is
>   	 * fine
> @@ -2586,11 +2606,13 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>   	struct msm_drm_private *priv = dev->dev_private;
>   	struct platform_device *pdev = priv->gpu_pdev;
>   	struct adreno_platform_config *config = pdev->dev.platform_data;
> +	const struct adreno_info *info = config->info;
>   	struct device_node *node;
>   	struct a6xx_gpu *a6xx_gpu;
>   	struct adreno_gpu *adreno_gpu;
>   	struct msm_gpu *gpu;
>   	extern int enable_preemption;
> +	u32 speedbin;
>   	bool is_a7xx;
>   	int ret, nr_rings = 1;
>   
> @@ -2613,14 +2635,14 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>   	adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper");
>   
>   	adreno_gpu->base.hw_apriv =
> -		!!(config->info->quirks & ADRENO_QUIRK_HAS_HW_APRIV);
> +		!!(info->quirks & ADRENO_QUIRK_HAS_HW_APRIV);
>   
>   	/* gpu->info only gets assigned in adreno_gpu_init(). A8x is included intentionally */
> -	is_a7xx = config->info->family >= ADRENO_7XX_GEN1;
> +	is_a7xx = info->family >= ADRENO_7XX_GEN1;
>   
>   	a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx);
>   
> -	ret = a6xx_set_supported_hw(&pdev->dev, config->info);
> +	ret = a6xx_set_supported_hw(&pdev->dev, a6xx_gpu, info);
>   	if (ret) {
>   		a6xx_llc_slices_destroy(a6xx_gpu);
>   		kfree(a6xx_gpu);
> @@ -2628,15 +2650,20 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>   	}
>   
>   	if ((enable_preemption == 1) || (enable_preemption == -1 &&
> -	    (config->info->quirks & ADRENO_QUIRK_PREEMPTION)))
> +	    (info->quirks & ADRENO_QUIRK_PREEMPTION)))
>   		nr_rings = 4;
>   
> -	ret = adreno_gpu_init(dev, pdev, adreno_gpu, config->info->funcs, nr_rings);
> +	ret = adreno_gpu_init(dev, pdev, adreno_gpu, info->funcs, nr_rings);
>   	if (ret) {
>   		a6xx_destroy(&(a6xx_gpu->base.base));
>   		return ERR_PTR(ret);
>   	}
>   
> +	/* Set the speedbin value that is passed to userspace */
> +	if (a6xx_read_speedbin(&pdev->dev, a6xx_gpu, info, &speedbin) || !speedbin)
> +		speedbin = 0xffff;
> +	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
> +
>   	/*
>   	 * For now only clamp to idle freq for devices where this is known not
>   	 * to cause power supply issues:
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 8475802fdde2..f6c39aed50f2 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -1185,7 +1185,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>   	struct msm_gpu_config adreno_gpu_config  = { 0 };
>   	struct msm_gpu *gpu = &adreno_gpu->base;
>   	const char *gpu_name;
> -	u32 speedbin;
>   	int ret;
>   
>   	adreno_gpu->funcs = funcs;
> @@ -1214,10 +1213,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>   			devm_pm_opp_set_clkname(dev, "core");
>   	}
>   
> -	if (adreno_read_speedbin(dev, &speedbin) || !speedbin)
> -		speedbin = 0xffff;
> -	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
> -
>   	gpu_name = devm_kasprintf(dev, GFP_KERNEL, "%"ADRENO_CHIPID_FMT,
>   			ADRENO_CHIPID_ARGS(config->chip_id));
>   	if (!gpu_name)
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index 29097e6b4253..044ed4d49aa7 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -63,6 +63,7 @@ enum adreno_family {
>   #define ADRENO_QUIRK_PREEMPTION			BIT(5)
>   #define ADRENO_QUIRK_4GB_VA			BIT(6)
>   #define ADRENO_QUIRK_IFPC			BIT(7)
> +#define ADRENO_QUIRK_SOFTFUSE			BIT(8)
>   
>   /* Helper for formating the chip_id in the way that userspace tools like
>    * crashdec expect.
> diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
> index 3941e7510754..2309870f5031 100644
> --- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
> +++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
> @@ -5016,6 +5016,10 @@ by a particular renderpass/blit.
>   		<bitfield pos="1" name="LPAC" type="boolean"/>
>   		<bitfield pos="2" name="RAYTRACING" type="boolean"/>
>   	</reg32>
> +	<reg32 offset="0x0405" name="CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS" variants="A8XX-">
> +		<bitfield high="8" low="0" name="FINALFREQLIMIT"/>
> +		<bitfield pos="24" name="SOFTSKUDISABLED" type="boolean"/>
> +	</reg32>
>   </domain>
>   
>   </database>
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 15/17] drm/msm/a8xx: Preemption support for A840
  2026-03-27  0:14 ` [PATCH v2 15/17] drm/msm/a8xx: Preemption " Akhil P Oommen
@ 2026-04-11  2:22   ` Jie Gan
  2026-04-11 13:56     ` Rob Clark
  0 siblings, 1 reply; 45+ messages in thread
From: Jie Gan @ 2026-04-11  2:22 UTC (permalink / raw)
  To: Akhil P Oommen, Rob Clark, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
  Cc: linux-arm-msm, dri-devel, freedreno, linux-kernel



On 3/27/2026 8:14 AM, Akhil P Oommen wrote:
> The programming sequence related to preemption is unchanged from A7x, but
> there is some code churn due to register shuffling in A8x. So, split out
> the common code into a header file for sharing, and add/update the
> additional changes required to support the preemption feature on A8x GPUs.
> 
> Finally, enable the preemption quirk in A840's catalog to enable this
> feature.
> 
> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> ---
>   drivers/gpu/drm/msm/Makefile              |   1 +
>   drivers/gpu/drm/msm/adreno/a6xx_catalog.c |   1 +
>   drivers/gpu/drm/msm/adreno/a6xx_gpu.c     |   7 +-
>   drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   5 +
>   drivers/gpu/drm/msm/adreno/a6xx_preempt.c |  77 +--------
>   drivers/gpu/drm/msm/adreno/a6xx_preempt.h |  82 ++++++++++
>   drivers/gpu/drm/msm/adreno/a8xx_gpu.c     |  37 ++++-
>   drivers/gpu/drm/msm/adreno/a8xx_preempt.c | 259 ++++++++++++++++++++++++++++++
>   8 files changed, 389 insertions(+), 80 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index 8b94c5f1cb68..ba45e99be05b 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -25,6 +25,7 @@ adreno-y := \
>   	adreno/a6xx_hfi.o \
>   	adreno/a6xx_preempt.o \
>   	adreno/a8xx_gpu.o \
> +	adreno/a8xx_preempt.o \
>   
>   adreno-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
>   
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> index 53548f6e891b..21f5a685196b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> @@ -2120,6 +2120,7 @@ static const struct adreno_info a8xx_gpus[] = {
>   		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>   		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>   			  ADRENO_QUIRK_HAS_HW_APRIV |
> +			  ADRENO_QUIRK_PREEMPTION |
>   			  ADRENO_QUIRK_IFPC,
>   		.funcs = &a8xx_gpu_funcs,
>   		.a6xx = &(const struct a6xx_info) {
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index fb9662b946d0..44ce02b412ca 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -410,7 +410,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>   	a6xx_flush(gpu, ring);
>   }
>   
> -static void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
> +void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
>   		struct a6xx_gpu *a6xx_gpu, struct msm_gpu_submitqueue *queue)
>   {
>   	u64 preempt_postamble;
> @@ -620,7 +620,10 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>   	a6xx_flush(gpu, ring);
>   
>   	/* Check to see if we need to start preemption */
> -	a6xx_preempt_trigger(gpu);
> +	if (adreno_is_a8xx(adreno_gpu))
> +		a8xx_preempt_trigger(gpu);
> +	else
> +		a6xx_preempt_trigger(gpu);
>   }
>   
>   static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index a4434a6a56dd..eb431e5e00b1 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -278,6 +278,8 @@ void a6xx_preempt_hw_init(struct msm_gpu *gpu);
>   void a6xx_preempt_trigger(struct msm_gpu *gpu);
>   void a6xx_preempt_irq(struct msm_gpu *gpu);
>   void a6xx_preempt_fini(struct msm_gpu *gpu);
> +void a6xx_emit_set_pseudo_reg(struct msm_ringbuffer *ring,
> +		struct a6xx_gpu *a6xx_gpu, struct msm_gpu_submitqueue *queue);
>   int a6xx_preempt_submitqueue_setup(struct msm_gpu *gpu,
>   		struct msm_gpu_submitqueue *queue);
>   void a6xx_preempt_submitqueue_close(struct msm_gpu *gpu,
> @@ -327,6 +329,9 @@ void a8xx_gpu_get_slice_info(struct msm_gpu *gpu);
>   int a8xx_hw_init(struct msm_gpu *gpu);
>   irqreturn_t a8xx_irq(struct msm_gpu *gpu);
>   void a8xx_llc_activate(struct a6xx_gpu *a6xx_gpu);
> +void a8xx_preempt_hw_init(struct msm_gpu *gpu);
> +void a8xx_preempt_trigger(struct msm_gpu *gpu);
> +void a8xx_preempt_irq(struct msm_gpu *gpu);
>   bool a8xx_progress(struct msm_gpu *gpu, struct msm_ringbuffer *ring);
>   void a8xx_recover(struct msm_gpu *gpu);
>   #endif /* __A6XX_GPU_H__ */
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
> index 747a22afad9f..df4cbf42e9a4 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
> @@ -6,85 +6,10 @@
>   #include "msm_gem.h"
>   #include "a6xx_gpu.h"
>   #include "a6xx_gmu.xml.h"
> +#include "a6xx_preempt.h"
>   #include "msm_mmu.h"
>   #include "msm_gpu_trace.h"
>   
> -/*
> - * Try to transition the preemption state from old to new. Return
> - * true on success or false if the original state wasn't 'old'
> - */
> -static inline bool try_preempt_state(struct a6xx_gpu *a6xx_gpu,
> -		enum a6xx_preempt_state old, enum a6xx_preempt_state new)
> -{
> -	enum a6xx_preempt_state cur = atomic_cmpxchg(&a6xx_gpu->preempt_state,
> -		old, new);
> -
> -	return (cur == old);
> -}
> -
> -/*
> - * Force the preemption state to the specified state.  This is used in cases
> - * where the current state is known and won't change
> - */
> -static inline void set_preempt_state(struct a6xx_gpu *gpu,
> -		enum a6xx_preempt_state new)
> -{
> -	/*
> -	 * preempt_state may be read by other cores trying to trigger a
> -	 * preemption or in the interrupt handler so barriers are needed
> -	 * before...
> -	 */
> -	smp_mb__before_atomic();
> -	atomic_set(&gpu->preempt_state, new);
> -	/* ... and after*/
> -	smp_mb__after_atomic();
> -}
> -
> -/* Write the most recent wptr for the given ring into the hardware */
> -static inline void update_wptr(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring)
> -{
> -	unsigned long flags;
> -	uint32_t wptr;
> -
> -	spin_lock_irqsave(&ring->preempt_lock, flags);
> -
> -	if (ring->restore_wptr) {
> -		wptr = get_wptr(ring);
> -
> -		a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
> -
> -		ring->restore_wptr = false;
> -	}
> -
> -	spin_unlock_irqrestore(&ring->preempt_lock, flags);
> -}
> -
> -/* Return the highest priority ringbuffer with something in it */
> -static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
> -{
> -	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> -	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> -
> -	unsigned long flags;
> -	int i;
> -
> -	for (i = 0; i < gpu->nr_rings; i++) {
> -		bool empty;
> -		struct msm_ringbuffer *ring = gpu->rb[i];
> -
> -		spin_lock_irqsave(&ring->preempt_lock, flags);
> -		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
> -		if (!empty && ring == a6xx_gpu->cur_ring)
> -			empty = ring->memptrs->fence == a6xx_gpu->last_seqno[i];
> -		spin_unlock_irqrestore(&ring->preempt_lock, flags);
> -
> -		if (!empty)
> -			return ring;
> -	}
> -
> -	return NULL;
> -}
> -
>   static void a6xx_preempt_timer(struct timer_list *t)
>   {
>   	struct a6xx_gpu *a6xx_gpu = timer_container_of(a6xx_gpu, t,
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.h b/drivers/gpu/drm/msm/adreno/a6xx_preempt.h
> new file mode 100644
> index 000000000000..df36c945b836
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.h
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
> +/* Copyright (c) 2023 Collabora, Ltd. */
> +/* Copyright (c) 2024 Valve Corporation */
> +/* Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
> +
> +/*
> + * Try to transition the preemption state from old to new. Return
> + * true on success or false if the original state wasn't 'old'
> + */
> +static inline bool try_preempt_state(struct a6xx_gpu *a6xx_gpu,
> +		enum a6xx_preempt_state old, enum a6xx_preempt_state new)
> +{
> +	enum a6xx_preempt_state cur = atomic_cmpxchg(&a6xx_gpu->preempt_state,
> +		old, new);
> +
> +	return (cur == old);
> +}
> +
> +/*
> + * Force the preemption state to the specified state.  This is used in cases
> + * where the current state is known and won't change
> + */
> +static inline void set_preempt_state(struct a6xx_gpu *gpu,
> +		enum a6xx_preempt_state new)
> +{
> +	/*
> +	 * preempt_state may be read by other cores trying to trigger a
> +	 * preemption or in the interrupt handler so barriers are needed
> +	 * before...
> +	 */
> +	smp_mb__before_atomic();
> +	atomic_set(&gpu->preempt_state, new);
> +	/* ... and after */
> +	smp_mb__after_atomic();
> +}
> +
> +/* Write the most recent wptr for the given ring into the hardware */
> +static inline void update_wptr(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring)
> +{
> +	unsigned long flags;
> +	uint32_t wptr;
> +
> +	spin_lock_irqsave(&ring->preempt_lock, flags);
> +
> +	if (ring->restore_wptr) {
> +		wptr = get_wptr(ring);
> +
> +		a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
> +
> +		ring->restore_wptr = false;
> +	}
> +
> +	spin_unlock_irqrestore(&ring->preempt_lock, flags);
> +}
> +
> +/* Return the highest priority ringbuffer with something in it */
> +static inline struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +
> +	unsigned long flags;
> +	int i;
> +
> +	for (i = 0; i < gpu->nr_rings; i++) {
> +		bool empty;
> +		struct msm_ringbuffer *ring = gpu->rb[i];
> +
> +		spin_lock_irqsave(&ring->preempt_lock, flags);
> +		empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring));
> +		if (!empty && ring == a6xx_gpu->cur_ring)
> +			empty = ring->memptrs->fence == a6xx_gpu->last_seqno[i];
> +		spin_unlock_irqrestore(&ring->preempt_lock, flags);
> +
> +		if (!empty)
> +			return ring;
> +	}
> +
> +	return NULL;
> +}
> +
> diff --git a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
> index e406681b8c80..9e6f2ed69247 100644
> --- a/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a8xx_gpu.c
> @@ -463,6 +463,34 @@ static void a8xx_patch_pwrup_reglist(struct msm_gpu *gpu)
>   	a8xx_aperture_clear(gpu);
>   }
>   
> +static int a8xx_preempt_start(struct msm_gpu *gpu)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +	struct msm_ringbuffer *ring = gpu->rb[0];
> +
> +	if (gpu->nr_rings <= 1)
> +		return 0;
> +
> +	/* Turn CP protection off */
> +	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
> +	OUT_RING(ring, 0);
> +
> +	a6xx_emit_set_pseudo_reg(ring, a6xx_gpu, NULL);
> +
> +	/* Yield the floor on command completion */
> +	OUT_PKT7(ring, CP_CONTEXT_SWITCH_YIELD, 4);
> +	OUT_RING(ring, 0x00);
> +	OUT_RING(ring, 0x00);
> +	OUT_RING(ring, 0x00);
> +	/* Generate interrupt on preemption completion */
> +	OUT_RING(ring, 0x00);
> +
> +	a6xx_flush(gpu, ring);
> +
> +	return a8xx_idle(gpu, ring) ? 0 : -EINVAL;
> +}
> +
>   static int a8xx_cp_init(struct msm_gpu *gpu)
>   {
>   	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> @@ -738,6 +766,8 @@ static int hw_init(struct msm_gpu *gpu)
>   	gpu_write64(gpu, REG_A6XX_CP_RB_RPTR_ADDR, shadowptr(a6xx_gpu, gpu->rb[0]));
>   	gpu_write64(gpu, REG_A8XX_CP_RB_RPTR_ADDR_BV, rbmemptr(gpu->rb[0], bv_rptr));
>   
> +	a8xx_preempt_hw_init(gpu);
> +
>   	for (i = 0; i < gpu->nr_rings; i++)
>   		a6xx_gpu->shadow[i] = 0;
>   
> @@ -800,6 +830,9 @@ static int hw_init(struct msm_gpu *gpu)
>   	/* Enable hardware clockgating */
>   	a8xx_set_hwcg(gpu, true);
>   out:
> +	/* Last step - yield the ringbuffer */
> +	a8xx_preempt_start(gpu);
> +
>   	/*
>   	 * Tell the GMU that we are done touching the GPU and it can start power
>   	 * management
> @@ -1209,11 +1242,11 @@ irqreturn_t a8xx_irq(struct msm_gpu *gpu)
>   
>   	if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS) {
>   		msm_gpu_retire(gpu);
> -		a6xx_preempt_trigger(gpu);
> +		a8xx_preempt_trigger(gpu);
>   	}
>   
>   	if (status & A6XX_RBBM_INT_0_MASK_CP_SW)
> -		a6xx_preempt_irq(gpu);
> +		a8xx_preempt_irq(gpu);
>   
>   	return IRQ_HANDLED;
>   }
> diff --git a/drivers/gpu/drm/msm/adreno/a8xx_preempt.c b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
> new file mode 100644
> index 000000000000..3d8c33ba722e
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
> @@ -0,0 +1,259 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
> +
> +#include "msm_gem.h"
> +#include "a6xx_gpu.h"
> +#include "a6xx_gmu.xml.h"
> +#include "a6xx_preempt.h"
> +#include "msm_mmu.h"
> +#include "msm_gpu_trace.h"
> +
> +static void preempt_prepare_postamble(struct a6xx_gpu *a6xx_gpu)
> +{
> +	u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
> +	u32 count = 0;
> +
> +	postamble[count++] = PKT7(CP_REG_RMW, 3);
> +	postamble[count++] = REG_A8XX_RBBM_PERFCTR_SRAM_INIT_CMD;
> +	postamble[count++] = 0;
> +	postamble[count++] = 1;
> +
> +	postamble[count++] = PKT7(CP_WAIT_REG_MEM, 6);
> +	postamble[count++] = CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ);
> +	postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_LO(
> +				REG_A8XX_RBBM_PERFCTR_SRAM_INIT_STATUS);
> +	postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_HI(0);
> +	postamble[count++] = CP_WAIT_REG_MEM_3_REF(0x1);
> +	postamble[count++] = CP_WAIT_REG_MEM_4_MASK(0x1);
> +	postamble[count++] = CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0);
> +
> +	a6xx_gpu->preempt_postamble_len = count;
> +
> +	a6xx_gpu->postamble_enabled = true;
> +}
> +
> +static void preempt_disable_postamble(struct a6xx_gpu *a6xx_gpu)
> +{
> +	u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
> +
> +	/*
> +	 * Disable the postamble by replacing the first packet header with a NOP
> +	 * that covers the whole buffer.
> +	 */
> +	*postamble = PKT7(CP_NOP, (a6xx_gpu->preempt_postamble_len - 1));
> +
> +	a6xx_gpu->postamble_enabled = false;
> +}
> +
> +/*
> + * Set preemption keepalive vote. Please note that this vote is different from the one used in
> + * a8xx_irq()
> + */
> +static void a8xx_preempt_keepalive_vote(struct msm_gpu *gpu, bool on)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +
> +	gmu_write(&a6xx_gpu->gmu, REG_A8XX_GMU_PWR_COL_PREEMPT_KEEPALIVE, on);
> +}
> +
> +void a8xx_preempt_irq(struct msm_gpu *gpu)
> +{
> +	uint32_t status;
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +	struct drm_device *dev = gpu->dev;
> +
> +	if (!try_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED, PREEMPT_PENDING))
> +		return;
> +
> +	/* Delete the preemption watchdog timer */
> +	timer_delete(&a6xx_gpu->preempt_timer);
> +
> +	/*
> +	 * The hardware should be setting the stop bit of CP_CONTEXT_SWITCH_CNTL
> +	 * to zero before firing the interrupt, but there is a non zero chance
> +	 * of a hardware condition or a software race that could set it again
> +	 * before we have a chance to finish. If that happens, log and go for
> +	 * recovery
> +	 */
> +	status = gpu_read(gpu, REG_A8XX_CP_CONTEXT_SWITCH_CNTL);
> +	if (unlikely(status & A8XX_CP_CONTEXT_SWITCH_CNTL_STOP)) {
> +		DRM_DEV_ERROR(&gpu->pdev->dev,
> +					  "!!!!!!!!!!!!!!!! preemption faulted !!!!!!!!!!!!!! irq\n");
> +		set_preempt_state(a6xx_gpu, PREEMPT_FAULTED);
> +		dev_err(dev->dev, "%s: Preemption failed to complete\n",
> +			gpu->name);
> +		kthread_queue_work(gpu->worker, &gpu->recover_work);
> +		return;
> +	}
> +
> +	a6xx_gpu->cur_ring = a6xx_gpu->next_ring;
> +	a6xx_gpu->next_ring = NULL;
> +
> +	set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
> +
> +	update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
> +
> +	set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> +
> +	a8xx_preempt_keepalive_vote(gpu, false);
> +
> +	trace_msm_gpu_preemption_irq(a6xx_gpu->cur_ring->id);
> +
> +	/*
> +	 * Retrigger preemption to avoid a deadlock that might occur when preemption
> +	 * is skipped due to it being already in flight when requested.
> +	 */
> +	a8xx_preempt_trigger(gpu);
> +}
> +
> +void a8xx_preempt_hw_init(struct msm_gpu *gpu)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +	int i;
> +
> +	/* No preemption if we only have one ring */
> +	if (gpu->nr_rings == 1)
> +		return;
> +
> +	for (i = 0; i < gpu->nr_rings; i++) {
> +		struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[i];
> +
> +		record_ptr->wptr = 0;
> +		record_ptr->rptr = 0;
> +		record_ptr->rptr_addr = shadowptr(a6xx_gpu, gpu->rb[i]);
> +		record_ptr->info = 0;
> +		record_ptr->data = 0;
> +		record_ptr->rbase = gpu->rb[i]->iova;
> +	}
> +
> +	/* Write a 0 to signal that we aren't switching pagetables */
> +	gpu_write64(gpu, REG_A8XX_CP_CONTEXT_SWITCH_SMMU_INFO, 0);
> +
> +	/* Enable the GMEM save/restore feature for preemption */
> +	gpu_write(gpu, REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, 0x1);
> +
> +	/* Reset the preemption state */
> +	set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> +
> +	spin_lock_init(&a6xx_gpu->eval_lock);
> +
> +	/* Always come up on rb 0 */
> +	a6xx_gpu->cur_ring = gpu->rb[0];
> +}
> +
> +void a8xx_preempt_trigger(struct msm_gpu *gpu)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> +	unsigned long flags;
> +	struct msm_ringbuffer *ring;
> +	unsigned int cntl;
> +	bool sysprof;
> +
> +	if (gpu->nr_rings == 1)
> +		return;
> +
> +	/*
> +	 * Lock to make sure another thread attempting preemption doesn't skip it
> +	 * while we are still evaluating the next ring. This makes sure the other
> +	 * thread does start preemption if we abort it and avoids a soft lock.
> +	 */
> +	spin_lock_irqsave(&a6xx_gpu->eval_lock, flags);
> +
> +	/*
> +	 * Try to start preemption by moving from NONE to START. If
> +	 * unsuccessful, a preemption is already in flight
> +	 */
> +	if (!try_preempt_state(a6xx_gpu, PREEMPT_NONE, PREEMPT_START)) {
> +		spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> +		return;
> +	}
> +
> +	cntl = A8XX_CP_CONTEXT_SWITCH_CNTL_LEVEL(a6xx_gpu->preempt_level);
> +
> +	if (a6xx_gpu->skip_save_restore)
> +		cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_SKIP_SAVE_RESTORE;
> +
> +	if (a6xx_gpu->uses_gmem)
> +		cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_USES_GMEM;
> +
> +	cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_STOP;
> +
> +	/* Get the next ring to preempt to */
> +	ring = get_next_ring(gpu);
> +
> +	/*
> +	 * If no ring is populated or the highest priority ring is the current
> +	 * one do nothing except to update the wptr to the latest and greatest
> +	 */
> +	if (!ring || (a6xx_gpu->cur_ring == ring)) {
> +		set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
> +		update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
> +		set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> +		spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> +		return;
> +	}
> +
> +	spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> +
> +	spin_lock_irqsave(&ring->preempt_lock, flags);
> +
> +	struct a7xx_cp_smmu_info *smmu_info_ptr =
> +		a6xx_gpu->preempt_smmu[ring->id];
> +	struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[ring->id];
> +	u64 ttbr0 = ring->memptrs->ttbr0;
> +	u32 context_idr = ring->memptrs->context_idr;

Shall we declare these variables at the top of the function body?

Thanks,
Jie

> +
> +	smmu_info_ptr->ttbr0 = ttbr0;
> +	smmu_info_ptr->context_idr = context_idr;
> +	record_ptr->wptr = get_wptr(ring);
> +
> +	/*
> +	 * The GPU will write the wptr we set above when we preempt. Reset
> +	 * restore_wptr to make sure that we don't write WPTR to the same
> +	 * thing twice. It's still possible subsequent submissions will update
> +	 * wptr again, in which case they will set the flag to true. This has
> +	 * to be protected by the lock for setting the flag and updating wptr
> +	 * to be atomic.
> +	 */
> +	ring->restore_wptr = false;
> +
> +	trace_msm_gpu_preemption_trigger(a6xx_gpu->cur_ring->id, ring->id);
> +
> +	spin_unlock_irqrestore(&ring->preempt_lock, flags);
> +
> +	/* Set the keepalive bit to keep the GPU ON until preemption is complete */
> +	a8xx_preempt_keepalive_vote(gpu, true);
> +
> +	a6xx_fenced_write(a6xx_gpu,
> +		REG_A8XX_CP_CONTEXT_SWITCH_SMMU_INFO, a6xx_gpu->preempt_smmu_iova[ring->id],
> +		BIT(1), true);
> +
> +	a6xx_fenced_write(a6xx_gpu,
> +		REG_A8XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR,
> +		a6xx_gpu->preempt_iova[ring->id], BIT(1), true);
> +
> +	a6xx_gpu->next_ring = ring;
> +
> +	/* Start a timer to catch a stuck preemption */
> +	mod_timer(&a6xx_gpu->preempt_timer, jiffies + msecs_to_jiffies(10000));
> +
> +	/* Enable or disable postamble as needed */
> +	sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1;
> +
> +	if (!sysprof && !a6xx_gpu->postamble_enabled)
> +		preempt_prepare_postamble(a6xx_gpu);
> +
> +	if (sysprof && a6xx_gpu->postamble_enabled)
> +		preempt_disable_postamble(a6xx_gpu);
> +
> +	/* Set the preemption state to triggered */
> +	set_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED);
> +
> +	/* Trigger the preemption */
> +	a6xx_fenced_write(a6xx_gpu, REG_A8XX_CP_CONTEXT_SWITCH_CNTL, cntl, BIT(1), false);
> +}
> +
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread
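[Editorial note: the state transitions in the patch above (PREEMPT_NONE -> PREEMPT_START -> PREEMPT_TRIGGERED -> PREEMPT_PENDING -> PREEMPT_FINISH) hinge on try_preempt_state() being an atomic compare-and-exchange, which is why a racing trigger from the IRQ path and the submit path cannot both start a preemption. A minimal userspace sketch of that mechanism, with enum values and helper names assumed from the shared a6xx_preempt code, not the kernel implementation itself:]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace sketch (not the kernel code): try_preempt_state() is a
 * compare-and-exchange, so only one thread can move NONE -> START and
 * begin evaluating the next ring, and the IRQ handler can only move
 * TRIGGERED -> PENDING once.
 */
enum preempt_state {
	PREEMPT_NONE,
	PREEMPT_START,
	PREEMPT_TRIGGERED,
	PREEMPT_PENDING,
	PREEMPT_FINISH,
	PREEMPT_FAULTED,
};

static _Atomic unsigned int preempt_state = PREEMPT_NONE;

static bool try_preempt_state(unsigned int cur, unsigned int new_state)
{
	/* Succeeds only if the state is still 'cur'; fails if we raced */
	return atomic_compare_exchange_strong(&preempt_state, &cur, new_state);
}

static void set_preempt_state(unsigned int new_state)
{
	atomic_store(&preempt_state, new_state);
}
```

A losing thread simply returns from a8xx_preempt_trigger(), which is why the IRQ handler retriggers preemption at the end: the request that lost the race would otherwise be dropped.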

* Re: [PATCH v2 15/17] drm/msm/a8xx: Preemption support for A840
  2026-04-11  2:22   ` Jie Gan
@ 2026-04-11 13:56     ` Rob Clark
  2026-04-11 14:45       ` David Laight
  0 siblings, 1 reply; 45+ messages in thread
From: Rob Clark @ 2026-04-11 13:56 UTC (permalink / raw)
  To: Jie Gan
  Cc: Akhil P Oommen, Sean Paul, Konrad Dybcio, Dmitry Baryshkov,
	Abhinav Kumar, Jessica Zhang, Marijn Suijten, David Airlie,
	Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	linux-arm-msm, dri-devel, freedreno, linux-kernel

On Fri, Apr 10, 2026 at 7:22 PM Jie Gan <jie.gan@oss.qualcomm.com> wrote:
>
>
>
> On 3/27/2026 8:14 AM, Akhil P Oommen wrote:
> > The programming sequence related to preemption is unchanged from A7x. But
> > there is some code churn due to register shuffling in A8x. So, split out
> > the common code into a header file for code sharing and add/update
> > additional changes required to support preemption feature on A8x GPUs.
> >
> > Finally, enable the preemption quirk in A840's catalog to enable this
> > feature.
> >
> > Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
> > ---
> >   drivers/gpu/drm/msm/Makefile              |   1 +
> >   drivers/gpu/drm/msm/adreno/a6xx_catalog.c |   1 +
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c     |   7 +-
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   5 +
> >   drivers/gpu/drm/msm/adreno/a6xx_preempt.c |  77 +--------
> >   drivers/gpu/drm/msm/adreno/a6xx_preempt.h |  82 ++++++++++
> >   drivers/gpu/drm/msm/adreno/a8xx_gpu.c     |  37 ++++-
> >   drivers/gpu/drm/msm/adreno/a8xx_preempt.c | 259 ++++++++++++++++++++++++++++++
> >   8 files changed, 389 insertions(+), 80 deletions(-)
> >
[...]
> > diff --git a/drivers/gpu/drm/msm/adreno/a8xx_preempt.c b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
> > new file mode 100644
> > index 000000000000..3d8c33ba722e
> > --- /dev/null
> > +++ b/drivers/gpu/drm/msm/adreno/a8xx_preempt.c
> > @@ -0,0 +1,259 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
> > +
> > +#include "msm_gem.h"
> > +#include "a6xx_gpu.h"
> > +#include "a6xx_gmu.xml.h"
> > +#include "a6xx_preempt.h"
> > +#include "msm_mmu.h"
> > +#include "msm_gpu_trace.h"
> > +
> > +static void preempt_prepare_postamble(struct a6xx_gpu *a6xx_gpu)
> > +{
> > +     u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
> > +     u32 count = 0;
> > +
> > +     postamble[count++] = PKT7(CP_REG_RMW, 3);
> > +     postamble[count++] = REG_A8XX_RBBM_PERFCTR_SRAM_INIT_CMD;
> > +     postamble[count++] = 0;
> > +     postamble[count++] = 1;
> > +
> > +     postamble[count++] = PKT7(CP_WAIT_REG_MEM, 6);
> > +     postamble[count++] = CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ);
> > +     postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_LO(
> > +                             REG_A8XX_RBBM_PERFCTR_SRAM_INIT_STATUS);
> > +     postamble[count++] = CP_WAIT_REG_MEM_POLL_ADDR_HI(0);
> > +     postamble[count++] = CP_WAIT_REG_MEM_3_REF(0x1);
> > +     postamble[count++] = CP_WAIT_REG_MEM_4_MASK(0x1);
> > +     postamble[count++] = CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0);
> > +
> > +     a6xx_gpu->preempt_postamble_len = count;
> > +
> > +     a6xx_gpu->postamble_enabled = true;
> > +}
> > +
> > +static void preempt_disable_postamble(struct a6xx_gpu *a6xx_gpu)
> > +{
> > +     u32 *postamble = a6xx_gpu->preempt_postamble_ptr;
> > +
> > +     /*
> > +      * Disable the postamble by replacing the first packet header with a NOP
> > +      * that covers the whole buffer.
> > +      */
> > +     *postamble = PKT7(CP_NOP, (a6xx_gpu->preempt_postamble_len - 1));
> > +
> > +     a6xx_gpu->postamble_enabled = false;
> > +}
> > +
> > +/*
> > + * Set preemption keepalive vote. Please note that this vote is different from the one used in
> > + * a8xx_irq()
> > + */
> > +static void a8xx_preempt_keepalive_vote(struct msm_gpu *gpu, bool on)
> > +{
> > +     struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +     struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> > +
> > +     gmu_write(&a6xx_gpu->gmu, REG_A8XX_GMU_PWR_COL_PREEMPT_KEEPALIVE, on);
> > +}
> > +
> > +void a8xx_preempt_irq(struct msm_gpu *gpu)
> > +{
> > +     uint32_t status;
> > +     struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +     struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> > +     struct drm_device *dev = gpu->dev;
> > +
> > +     if (!try_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED, PREEMPT_PENDING))
> > +             return;
> > +
> > +     /* Delete the preemption watchdog timer */
> > +     timer_delete(&a6xx_gpu->preempt_timer);
> > +
> > +     /*
> > +      * The hardware should be setting the stop bit of CP_CONTEXT_SWITCH_CNTL
> > +      * to zero before firing the interrupt, but there is a non zero chance
> > +      * of a hardware condition or a software race that could set it again
> > +      * before we have a chance to finish. If that happens, log and go for
> > +      * recovery
> > +      */
> > +     status = gpu_read(gpu, REG_A8XX_CP_CONTEXT_SWITCH_CNTL);
> > +     if (unlikely(status & A8XX_CP_CONTEXT_SWITCH_CNTL_STOP)) {
> > +             DRM_DEV_ERROR(&gpu->pdev->dev,
> > +                                       "!!!!!!!!!!!!!!!! preemption faulted !!!!!!!!!!!!!! irq\n");
> > +             set_preempt_state(a6xx_gpu, PREEMPT_FAULTED);
> > +             dev_err(dev->dev, "%s: Preemption failed to complete\n",
> > +                     gpu->name);
> > +             kthread_queue_work(gpu->worker, &gpu->recover_work);
> > +             return;
> > +     }
> > +
> > +     a6xx_gpu->cur_ring = a6xx_gpu->next_ring;
> > +     a6xx_gpu->next_ring = NULL;
> > +
> > +     set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
> > +
> > +     update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
> > +
> > +     set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> > +
> > +     a8xx_preempt_keepalive_vote(gpu, false);
> > +
> > +     trace_msm_gpu_preemption_irq(a6xx_gpu->cur_ring->id);
> > +
> > +     /*
> > +      * Retrigger preemption to avoid a deadlock that might occur when preemption
> > +      * is skipped due to it being already in flight when requested.
> > +      */
> > +     a8xx_preempt_trigger(gpu);
> > +}
> > +
> > +void a8xx_preempt_hw_init(struct msm_gpu *gpu)
> > +{
> > +     struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +     struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> > +     int i;
> > +
> > +     /* No preemption if we only have one ring */
> > +     if (gpu->nr_rings == 1)
> > +             return;
> > +
> > +     for (i = 0; i < gpu->nr_rings; i++) {
> > +             struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[i];
> > +
> > +             record_ptr->wptr = 0;
> > +             record_ptr->rptr = 0;
> > +             record_ptr->rptr_addr = shadowptr(a6xx_gpu, gpu->rb[i]);
> > +             record_ptr->info = 0;
> > +             record_ptr->data = 0;
> > +             record_ptr->rbase = gpu->rb[i]->iova;
> > +     }
> > +
> > +     /* Write a 0 to signal that we aren't switching pagetables */
> > +     gpu_write64(gpu, REG_A8XX_CP_CONTEXT_SWITCH_SMMU_INFO, 0);
> > +
> > +     /* Enable the GMEM save/restore feature for preemption */
> > +     gpu_write(gpu, REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, 0x1);
> > +
> > +     /* Reset the preemption state */
> > +     set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> > +
> > +     spin_lock_init(&a6xx_gpu->eval_lock);
> > +
> > +     /* Always come up on rb 0 */
> > +     a6xx_gpu->cur_ring = gpu->rb[0];
> > +}
> > +
> > +void a8xx_preempt_trigger(struct msm_gpu *gpu)
> > +{
> > +     struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +     struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> > +     unsigned long flags;
> > +     struct msm_ringbuffer *ring;
> > +     unsigned int cntl;
> > +     bool sysprof;
> > +
> > +     if (gpu->nr_rings == 1)
> > +             return;
> > +
> > +     /*
> > +      * Lock to make sure another thread attempting preemption doesn't skip it
> > +      * while we are still evaluating the next ring. This makes sure the other
> > +      * thread does start preemption if we abort it and avoids a soft lock.
> > +      */
> > +     spin_lock_irqsave(&a6xx_gpu->eval_lock, flags);
> > +
> > +     /*
> > +      * Try to start preemption by moving from NONE to START. If
> > +      * unsuccessful, a preemption is already in flight
> > +      */
> > +     if (!try_preempt_state(a6xx_gpu, PREEMPT_NONE, PREEMPT_START)) {
> > +             spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> > +             return;
> > +     }
> > +
> > +     cntl = A8XX_CP_CONTEXT_SWITCH_CNTL_LEVEL(a6xx_gpu->preempt_level);
> > +
> > +     if (a6xx_gpu->skip_save_restore)
> > +             cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_SKIP_SAVE_RESTORE;
> > +
> > +     if (a6xx_gpu->uses_gmem)
> > +             cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_USES_GMEM;
> > +
> > +     cntl |= A8XX_CP_CONTEXT_SWITCH_CNTL_STOP;
> > +
> > +     /* Get the next ring to preempt to */
> > +     ring = get_next_ring(gpu);
> > +
> > +     /*
> > +      * If no ring is populated or the highest priority ring is the current
> > +      * one do nothing except to update the wptr to the latest and greatest
> > +      */
> > +     if (!ring || (a6xx_gpu->cur_ring == ring)) {
> > +             set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
> > +             update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
> > +             set_preempt_state(a6xx_gpu, PREEMPT_NONE);
> > +             spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> > +             return;
> > +     }
> > +
> > +     spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
> > +
> > +     spin_lock_irqsave(&ring->preempt_lock, flags);
> > +
> > +     struct a7xx_cp_smmu_info *smmu_info_ptr =
> > +             a6xx_gpu->preempt_smmu[ring->id];
> > +     struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[ring->id];
> > +     u64 ttbr0 = ring->memptrs->ttbr0;
> > +     u32 context_idr = ring->memptrs->context_idr;
>
> Shall we declare these variables at the top of the function body?
>

We can use c11 / mixed code/declarations now.  It's actually kinda
required for cleanup.h stuff (which I recommend for new code, where
possible.. I've been using it for PERFCNTR_CONFIG ioctl that I'm
working on, and it makes error path cleanup much more straightforward)

BR,
-R
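
[Editorial note: Rob's point about cleanup.h is easiest to see with the underlying compiler feature. A minimal userspace sketch, assuming GCC/Clang; the kernel's <linux/cleanup.h> DEFINE_FREE()/guard() macros are wrappers over this __attribute__((cleanup)) mechanism:]

```c
#include <assert.h>
#include <stdlib.h>

static int cleanup_runs;

static void free_buf(char **p)
{
	free(*p);
	cleanup_runs++;
}

/*
 * The cleanup handler runs when 'buf' leaves scope, on every return
 * path - no explicit free in the error path. Declaring the variable
 * at the point of allocation (C11 mixed declarations) is what ties
 * its lifetime to the acquisition, hence the point above about
 * declare-at-use being effectively required for cleanup.h.
 */
static int demo(int fail_early)
{
	char *buf __attribute__((cleanup(free_buf))) = malloc(64);

	if (!buf || fail_early)
		return -1;

	buf[0] = 0;
	return 0;
}
```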


* Re: [PATCH v2 15/17] drm/msm/a8xx: Preemption support for A840
  2026-04-11 13:56     ` Rob Clark
@ 2026-04-11 14:45       ` David Laight
  0 siblings, 0 replies; 45+ messages in thread
From: David Laight @ 2026-04-11 14:45 UTC (permalink / raw)
  To: Rob Clark
  Cc: Jie Gan, Akhil P Oommen, Sean Paul, Konrad Dybcio,
	Dmitry Baryshkov, Abhinav Kumar, Jessica Zhang, Marijn Suijten,
	David Airlie, Simona Vetter, Antonino Maniscalco, Connor Abbott,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	linux-arm-msm, dri-devel, freedreno, linux-kernel

On Sat, 11 Apr 2026 06:56:15 -0700
Rob Clark <rob.clark@oss.qualcomm.com> wrote:

> On Fri, Apr 10, 2026 at 7:22 PM Jie Gan <jie.gan@oss.qualcomm.com> wrote:
> >
> >
> >
> > On 3/27/2026 8:14 AM, Akhil P Oommen wrote:  
...
> > > +     struct a7xx_cp_smmu_info *smmu_info_ptr =
> > > +             a6xx_gpu->preempt_smmu[ring->id];
> > > +     struct a6xx_preempt_record *record_ptr = a6xx_gpu->preempt[ring->id];
> > > +     u64 ttbr0 = ring->memptrs->ttbr0;
> > > +     u32 context_idr = ring->memptrs->context_idr;  
> >
> > Shall we declare these variables at the top of the function body?
> >  
> 
> We can use c11 / mixed code/declarations now.  It's actually kinda
> required for cleanup.h stuff (which I recommend for new code, where
> possible.. I've been using it for PERFCNTR_CONFIG ioctl that I'm
> working on, and it makes error path cleanup much more straightforward)

It is also still frowned upon for normal variables.
It does make reading code - especially in large functions - harder.

	David


> 
> BR,
> -R
> 



Thread overview: 45+ messages
2026-03-27  0:13 [PATCH v2 00/17] drm/msm: A8xx Support - Batch 2 Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 01/17] drm/msm/a6xx: Use barriers while updating HFI Q headers Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 02/17] drm/msm/a8xx: Fix the ticks used in submit traces Akhil P Oommen
2026-03-27 11:37   ` Konrad Dybcio
2026-03-30 20:58     ` Akhil P Oommen
2026-03-31  8:23       ` Konrad Dybcio
2026-03-27  0:13 ` [PATCH v2 03/17] drm/msm/a6xx: Switch to preemption safe AO counter Akhil P Oommen
2026-03-27 11:32   ` Konrad Dybcio
2026-03-30 20:51     ` Akhil P Oommen
2026-03-30 21:34       ` Rob Clark
2026-03-27  0:13 ` [PATCH v2 04/17] drm/msm/a6xx: Correct OOB usage Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 05/17] drm/msm/adreno: Implement gx_is_on() for A8x Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 06/17] drm/msm/a6xx: Fix gpu init from secure world Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 07/17] drm/msm/a6xx: Add support for Debug HFI Q Akhil P Oommen
2026-03-27 10:34   ` Konrad Dybcio
2026-03-27  0:13 ` [PATCH v2 08/17] drm/msm/adreno: Coredump on GPU/GMU init failures Akhil P Oommen
2026-03-27 11:35   ` Konrad Dybcio
2026-03-30 20:53     ` Akhil P Oommen
2026-03-30 19:51   ` Rob Clark
2026-03-31 18:05     ` Akhil P Oommen
2026-03-31 18:28       ` Rob Clark
2026-03-27  0:13 ` [PATCH v2 09/17] drm/msm/a6xx: Use packed structs for HFI Akhil P Oommen
2026-03-27  0:13 ` [PATCH v2 10/17] drm/msm/a6xx: Update HFI definitions Akhil P Oommen
2026-03-30 11:15   ` Konrad Dybcio
2026-03-30 20:27     ` Akhil P Oommen
2026-03-31  8:22       ` Konrad Dybcio
2026-03-31 17:44         ` Akhil P Oommen
2026-03-27  0:14 ` [PATCH v2 11/17] drm/msm/a8xx: Add SKU table for A840 Akhil P Oommen
2026-03-27 11:14   ` Konrad Dybcio
2026-03-30 20:35     ` Akhil P Oommen
2026-03-27  0:14 ` [PATCH v2 12/17] drm/msm/a6xx: Add soft fuse detection support Akhil P Oommen
2026-03-27 11:16   ` Konrad Dybcio
2026-03-30 20:36     ` Akhil P Oommen
2026-04-11  2:19   ` Jie Gan
2026-03-27  0:14 ` [PATCH v2 13/17] drm/msm/a6xx: Add SKU detection support for X2-85 Akhil P Oommen
2026-03-27  0:14 ` [PATCH v2 14/17] drm/msm/a8xx: Implement IFPC support for A840 Akhil P Oommen
2026-03-27 11:07   ` Konrad Dybcio
2026-03-30 20:34     ` Akhil P Oommen
2026-03-27  0:14 ` [PATCH v2 15/17] drm/msm/a8xx: Preemption " Akhil P Oommen
2026-04-11  2:22   ` Jie Gan
2026-04-11 13:56     ` Rob Clark
2026-04-11 14:45       ` David Laight
2026-03-27  0:14 ` [PATCH v2 16/17] drm/msm/a6xx: Enable Preemption on X2-85 Akhil P Oommen
2026-03-27  0:14 ` [PATCH v2 17/17] drm/msm/adreno: Expose a PARAM to check AQE support Akhil P Oommen
2026-03-27 11:19   ` Konrad Dybcio
