dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs
@ 2025-03-20 11:17 Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors Karunika Choo
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch series introduces improvements and new features to the drm/panthor
driver, primarily focusing on extending support for additional Mali GPU
families.

Key changes:
- Implementation of 64-bit and polling register accessors
- Addition of GPU-specific initialization framework to standardize and
  streamline support new GPUs.
- Support for cache maintenance via the GPU_CONTROL.GPU_COMMAND register.
- Support for Mali-G710, Mali-G715, Mali-G720, and Mali-G725 series of GPUs.

Patch Breakdown:
[PATCH 1-2]: Introduces 64-bit and poll register accessors, and updates
             existing register accesses to use the new accessors.
[PATCH 3]:   Implements the GPU-specific initialization framework to handle
             differences between GPU architectures by enabling definition of
             architecture-specific initialization routines.
[PATCH 4-5]: Refactors gpu_info initialization to use the initialization
             framework to support differences in GPU registers for subsequent
             GPUs, and at the same time simplifies and makes extensible the
             process of determining the GPU model name.
[PATCH 6]:   Adds support for the Mali-G715 Family of GPUs
[PATCH 7-8]: Adds support for the Mali-G720 and Mali-G725 series. It also
             supports cache maintenance via the FLUSH_CACHES GPU command due
             to deprecation of the FLUSH_MEM and FLUSH_PT MMU_AS commands.
[PATCH 9]:   Adds support for Mali-G710, Mali-G510 and Mali-G310.

v2:
- Removed handling for register base addresses as they are not yet needed.
- Merged gpu_info handling into panthor_hw.c as they depend on the same arch_id
  matching mechanism.
- Made gpu_info initialization a GPU-specific function.
- Removed unnecessary changes for cache maintenance via GPU_CONTROL.
- Removed unnecessary pre-parsing of register fields from v1. Retaining current
  implementation as much as possible.
- Added support for G710, G715, G720, and G725 series of Mali GPUs.
- Link to v1: https://lore.kernel.org/all/20241219170521.64879-1-karunika.choo@arm.com/

Thanks,
Karunika Choo

Karunika Choo (9):
  drm/panthor: Add 64-bit and poll register accessors
  drm/panthor: Use 64-bit and poll register accessors
  drm/panthor: Add GPU specific initialization framework
  drm/panthor: Move GPU info initialization into panthor_hw.c
  drm/panthor: Make getting GPU model name simple and extensible
  drm/panthor: Add support for Mali-G715 family of GPUs
  drm/panthor: Support GPU_CONTROL cache flush based on feature bit
  drm/panthor: Add support for Mali-G720 and Mali-G725 GPUs
  drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310

 drivers/gpu/drm/panthor/Makefile         |   1 +
 drivers/gpu/drm/panthor/panthor_device.c |   5 +
 drivers/gpu/drm/panthor/panthor_device.h |   3 +
 drivers/gpu/drm/panthor/panthor_fw.c     |  14 +-
 drivers/gpu/drm/panthor/panthor_gpu.c    | 231 ++++-------------------
 drivers/gpu/drm/panthor/panthor_hw.c     | 216 +++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_hw.h     |  70 +++++++
 drivers/gpu/drm/panthor/panthor_mmu.c    |  69 ++++---
 drivers/gpu/drm/panthor/panthor_regs.h   |  83 ++++++++
 include/uapi/drm/panthor_drm.h           |   3 +
 10 files changed, 471 insertions(+), 224 deletions(-)
 create mode 100644 drivers/gpu/drm/panthor/panthor_hw.c
 create mode 100644 drivers/gpu/drm/panthor/panthor_hw.h

-- 
2.47.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  7:48   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 2/9] drm/panthor: Use " Karunika Choo
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch adds 64-bit register accessors to simplify register access in
Panthor. It also adds 32-bit and 64-bit variants for read_poll_timeout.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_regs.h | 55 ++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
index 42dc3fedb0d4..7ec4a1d04e20 100644
--- a/drivers/gpu/drm/panthor/panthor_regs.h
+++ b/drivers/gpu/drm/panthor/panthor_regs.h
@@ -238,4 +238,59 @@
 #define gpu_read(dev, reg) \
 	readl((dev)->iomem + (reg))
 
+#define gpu_read_relaxed(dev, reg) readl_relaxed((dev)->iomem + (reg))
+
+#define gpu_write64(dev, reg, data)                            \
+	do {                                                   \
+		u64 __val = (u64)(data);                       \
+		gpu_write(dev, reg, lower_32_bits(__val));     \
+		gpu_write(dev, reg + 4, upper_32_bits(__val)); \
+	} while (0)
+
+#define gpu_read64(dev, reg) \
+	(gpu_read(dev, reg) | ((u64)gpu_read(dev, reg + 4) << 32))
+
+#define gpu_read64_relaxed(dev, reg)  \
+	(gpu_read_relaxed(dev, reg) | \
+	 ((u64)gpu_read_relaxed(dev, reg + 4) << 32))
+
+#define gpu_read64_sync(dev, reg)                     \
+	({                                            \
+		u32 lo, hi1, hi2;                     \
+		do {                                  \
+			hi1 = gpu_read(dev, reg + 4); \
+			lo = gpu_read(dev, reg);      \
+			hi2 = gpu_read(dev, reg + 4); \
+		} while (hi1 != hi2);                 \
+		lo | ((u64)hi2 << 32);                \
+	})
+
+#define gpu_read_poll_timeout(dev, reg, val, cond, delay_us, timeout_us)    \
+	read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
+			  dev, reg)
+
+#define gpu_read_poll_timeout_atomic(dev, reg, val, cond, delay_us,         \
+				     timeout_us)                            \
+	read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
+				 false, dev, reg)
+
+#define gpu_read64_poll_timeout(dev, reg, val, cond, delay_us, timeout_us)    \
+	read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
+			  dev, reg)
+
+#define gpu_read64_poll_timeout_atomic(dev, reg, val, cond, delay_us,         \
+				       timeout_us)                            \
+	read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
+				 false, dev, reg)
+
+#define gpu_read_relaxed_poll_timeout_atomic(dev, reg, val, cond, delay_us, \
+					     timeout_us)                    \
+	read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us,     \
+				 timeout_us, false, dev, reg)
+
+#define gpu_read64_relaxed_poll_timeout(dev, reg, val, cond, delay_us,         \
+					timeout_us)                            \
+	read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
+			  false, dev, reg)
+
 #endif
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/9] drm/panthor: Use 64-bit and poll register accessors
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  7:53   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework Karunika Choo
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch updates Panthor to use the new 64-bit accessors and poll
functions.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c  |   9 +-
 drivers/gpu/drm/panthor/panthor_gpu.c | 142 +++++++-------------------
 drivers/gpu/drm/panthor/panthor_mmu.c |  34 ++----
 3 files changed, 53 insertions(+), 132 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 0f52766a3120..ecfbe0456f89 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1059,8 +1059,8 @@ static void panthor_fw_stop(struct panthor_device *ptdev)
 	u32 status;
 
 	gpu_write(ptdev, MCU_CONTROL, MCU_CONTROL_DISABLE);
-	if (readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
-			       status == MCU_STATUS_DISABLED, 10, 100000))
+	if (gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
+				  status == MCU_STATUS_DISABLED, 10, 100000))
 		drm_err(&ptdev->base, "Failed to stop MCU");
 }
 
@@ -1085,8 +1085,9 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
 
 		panthor_fw_update_reqs(glb_iface, req, GLB_HALT, GLB_HALT);
 		gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
-		if (!readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
-					status == MCU_STATUS_HALT, 10, 100000)) {
+		if (!gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
+					   status == MCU_STATUS_HALT, 10,
+					   100000)) {
 			ptdev->reset.fast = true;
 		} else {
 			drm_warn(&ptdev->base, "Failed to cleanly suspend MCU");
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 671049020afa..0dee011fe2e9 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -108,14 +108,9 @@ static void panthor_gpu_init_info(struct panthor_device *ptdev)
 
 	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
 
-	ptdev->gpu_info.shader_present = gpu_read(ptdev, GPU_SHADER_PRESENT_LO);
-	ptdev->gpu_info.shader_present |= (u64)gpu_read(ptdev, GPU_SHADER_PRESENT_HI) << 32;
-
-	ptdev->gpu_info.tiler_present = gpu_read(ptdev, GPU_TILER_PRESENT_LO);
-	ptdev->gpu_info.tiler_present |= (u64)gpu_read(ptdev, GPU_TILER_PRESENT_HI) << 32;
-
-	ptdev->gpu_info.l2_present = gpu_read(ptdev, GPU_L2_PRESENT_LO);
-	ptdev->gpu_info.l2_present |= (u64)gpu_read(ptdev, GPU_L2_PRESENT_HI) << 32;
+	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
+	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
+	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
 
 	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
 	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
@@ -152,8 +147,7 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
 {
 	if (status & GPU_IRQ_FAULT) {
 		u32 fault_status = gpu_read(ptdev, GPU_FAULT_STATUS);
-		u64 address = ((u64)gpu_read(ptdev, GPU_FAULT_ADDR_HI) << 32) |
-			      gpu_read(ptdev, GPU_FAULT_ADDR_LO);
+		u64 address = gpu_read64(ptdev, GPU_FAULT_ADDR_LO);
 
 		drm_warn(&ptdev->base, "GPU Fault 0x%08x (%s) at 0x%016llx\n",
 			 fault_status, panthor_exception_name(ptdev, fault_status & 0xFF),
@@ -244,45 +238,27 @@ int panthor_gpu_block_power_off(struct panthor_device *ptdev,
 				u32 pwroff_reg, u32 pwrtrans_reg,
 				u64 mask, u32 timeout_us)
 {
-	u32 val, i;
+	u32 val;
 	int ret;
 
-	for (i = 0; i < 2; i++) {
-		u32 mask32 = mask >> (i * 32);
-
-		if (!mask32)
-			continue;
-
-		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
-						 val, !(mask32 & val),
-						 100, timeout_us);
-		if (ret) {
-			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
-				blk_name, mask);
-			return ret;
-		}
+	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
+					      100, timeout_us);
+	if (ret) {
+		drm_err(&ptdev->base,
+			"timeout waiting on %s:%llx power transition", blk_name,
+			mask);
+		return ret;
 	}
 
-	if (mask & GENMASK(31, 0))
-		gpu_write(ptdev, pwroff_reg, mask);
-
-	if (mask >> 32)
-		gpu_write(ptdev, pwroff_reg + 4, mask >> 32);
-
-	for (i = 0; i < 2; i++) {
-		u32 mask32 = mask >> (i * 32);
+	gpu_write64(ptdev, pwroff_reg, mask);
 
-		if (!mask32)
-			continue;
-
-		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
-						 val, !(mask32 & val),
-						 100, timeout_us);
-		if (ret) {
-			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
-				blk_name, mask);
-			return ret;
-		}
+	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
+					      100, timeout_us);
+	if (ret) {
+		drm_err(&ptdev->base,
+			"timeout waiting on %s:%llx power transition", blk_name,
+			mask);
+		return ret;
 	}
 
 	return 0;
@@ -305,45 +281,26 @@ int panthor_gpu_block_power_on(struct panthor_device *ptdev,
 			       u32 pwron_reg, u32 pwrtrans_reg,
 			       u32 rdy_reg, u64 mask, u32 timeout_us)
 {
-	u32 val, i;
+	u32 val;
 	int ret;
 
-	for (i = 0; i < 2; i++) {
-		u32 mask32 = mask >> (i * 32);
-
-		if (!mask32)
-			continue;
-
-		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
-						 val, !(mask32 & val),
-						 100, timeout_us);
-		if (ret) {
-			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
-				blk_name, mask);
-			return ret;
-		}
+	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
+					      100, timeout_us);
+	if (ret) {
+		drm_err(&ptdev->base,
+			"timeout waiting on %s:%llx power transition", blk_name,
+			mask);
+		return ret;
 	}
 
-	if (mask & GENMASK(31, 0))
-		gpu_write(ptdev, pwron_reg, mask);
-
-	if (mask >> 32)
-		gpu_write(ptdev, pwron_reg + 4, mask >> 32);
-
-	for (i = 0; i < 2; i++) {
-		u32 mask32 = mask >> (i * 32);
+	gpu_write64(ptdev, pwron_reg, mask);
 
-		if (!mask32)
-			continue;
-
-		ret = readl_relaxed_poll_timeout(ptdev->iomem + rdy_reg + (i * 4),
-						 val, (mask32 & val) == mask32,
-						 100, timeout_us);
-		if (ret) {
-			drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
-				blk_name, mask);
-			return ret;
-		}
+	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
+					      100, timeout_us);
+	if (ret) {
+		drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
+			blk_name, mask);
+		return ret;
 	}
 
 	return 0;
@@ -492,26 +449,6 @@ void panthor_gpu_resume(struct panthor_device *ptdev)
 	panthor_gpu_l2_power_on(ptdev);
 }
 
-/**
- * panthor_gpu_read_64bit_counter() - Read a 64-bit counter at a given offset.
- * @ptdev: Device.
- * @reg: The offset of the register to read.
- *
- * Return: The counter value.
- */
-static u64
-panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
-{
-	u32 hi, lo;
-
-	do {
-		hi = gpu_read(ptdev, reg + 0x4);
-		lo = gpu_read(ptdev, reg);
-	} while (hi != gpu_read(ptdev, reg + 0x4));
-
-	return ((u64)hi << 32) | lo;
-}
-
 /**
  * panthor_gpu_read_timestamp() - Read the timestamp register.
  * @ptdev: Device.
@@ -520,7 +457,7 @@ panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
  */
 u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
 {
-	return panthor_gpu_read_64bit_counter(ptdev, GPU_TIMESTAMP_LO);
+	return gpu_read64_sync(ptdev, GPU_TIMESTAMP_LO);
 }
 
 /**
@@ -531,10 +468,5 @@ u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
  */
 u64 panthor_gpu_read_timestamp_offset(struct panthor_device *ptdev)
 {
-	u32 hi, lo;
-
-	hi = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_HI);
-	lo = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_LO);
-
-	return ((u64)hi << 32) | lo;
+	return gpu_read64(ptdev, GPU_TIMESTAMP_OFFSET_LO);
 }
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 12a02e28f50f..a0a79f19bdea 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -510,9 +510,9 @@ static int wait_ready(struct panthor_device *ptdev, u32 as_nr)
 	/* Wait for the MMU status to indicate there is no active command, in
 	 * case one is pending.
 	 */
-	ret = readl_relaxed_poll_timeout_atomic(ptdev->iomem + AS_STATUS(as_nr),
-						val, !(val & AS_STATUS_AS_ACTIVE),
-						10, 100000);
+	ret = gpu_read_relaxed_poll_timeout_atomic(ptdev, AS_STATUS(as_nr), val,
+						   !(val & AS_STATUS_AS_ACTIVE),
+						   10, 100000);
 
 	if (ret) {
 		panthor_device_schedule_reset(ptdev);
@@ -564,8 +564,7 @@ static void lock_region(struct panthor_device *ptdev, u32 as_nr,
 	region = region_width | region_start;
 
 	/* Lock the region that needs to be updated */
-	gpu_write(ptdev, AS_LOCKADDR_LO(as_nr), lower_32_bits(region));
-	gpu_write(ptdev, AS_LOCKADDR_HI(as_nr), upper_32_bits(region));
+	gpu_write64(ptdev, AS_LOCKADDR_LO(as_nr), region);
 	write_cmd(ptdev, as_nr, AS_COMMAND_LOCK);
 }
 
@@ -615,14 +614,9 @@ static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
 	if (ret)
 		return ret;
 
-	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
-	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
-
-	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
-	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
-
-	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
-	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
+	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), transtab);
+	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), memattr);
+	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), transcfg);
 
 	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
 }
@@ -635,14 +629,9 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr)
 	if (ret)
 		return ret;
 
-	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), 0);
-	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), 0);
-
-	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), 0);
-	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), 0);
-
-	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
-	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), 0);
+	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), 0);
+	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), 0);
+	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
 
 	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
 }
@@ -1680,8 +1669,7 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
 		u32 source_id;
 
 		fault_status = gpu_read(ptdev, AS_FAULTSTATUS(as));
-		addr = gpu_read(ptdev, AS_FAULTADDRESS_LO(as));
-		addr |= (u64)gpu_read(ptdev, AS_FAULTADDRESS_HI(as)) << 32;
+		addr = gpu_read64(ptdev, AS_FAULTADDRESS_LO(as));
 
 		/* decode the fault status */
 		exception_type = fault_status & 0xFF;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 2/9] drm/panthor: Use " Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  8:28   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c Karunika Choo
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch aims to lay the foundation to provide support for multiple
Mali GPUs through a framework by which differences in registers,
functionality, and features can be managed.

It introduces the concept of the arch_id which is a 32-bit ID in the
format of ((arch_major << 16) | (arch_minor << 8) | arch_rev). The 8-bit
fields of the arch_id provides future proofing past the 4-bit fields of
the GPU_ID's arch_major, arch_minor, and arch_rev.

The arch_id is used to select the correct abstraction for the GPU, such
as function pointers for operations specific to the GPU, base addresses
describing changes in register offsets, and supported features.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/Makefile         |  1 +
 drivers/gpu/drm/panthor/panthor_device.c |  5 ++
 drivers/gpu/drm/panthor/panthor_device.h |  3 +
 drivers/gpu/drm/panthor/panthor_hw.c     | 70 ++++++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_hw.h     | 63 +++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_regs.h   |  2 +
 6 files changed, 144 insertions(+)
 create mode 100644 drivers/gpu/drm/panthor/panthor_hw.c
 create mode 100644 drivers/gpu/drm/panthor/panthor_hw.h

diff --git a/drivers/gpu/drm/panthor/Makefile b/drivers/gpu/drm/panthor/Makefile
index 15294719b09c..02db21748c12 100644
--- a/drivers/gpu/drm/panthor/Makefile
+++ b/drivers/gpu/drm/panthor/Makefile
@@ -8,6 +8,7 @@ panthor-y := \
 	panthor_gem.o \
 	panthor_gpu.o \
 	panthor_heap.o \
+	panthor_hw.o \
 	panthor_mmu.o \
 	panthor_sched.o
 
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index a9da1d1eeb70..a6fca6b3fabd 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -18,6 +18,7 @@
 #include "panthor_device.h"
 #include "panthor_fw.h"
 #include "panthor_gpu.h"
+#include "panthor_hw.h"
 #include "panthor_mmu.h"
 #include "panthor_regs.h"
 #include "panthor_sched.h"
@@ -243,6 +244,10 @@ int panthor_device_init(struct panthor_device *ptdev)
 			return ret;
 	}
 
+	ret = panthor_hw_init(ptdev);
+	if (ret)
+		goto err_rpm_put;
+
 	ret = panthor_gpu_init(ptdev);
 	if (ret)
 		goto err_rpm_put;
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index da6574021664..82741bf1a49b 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -120,6 +120,9 @@ struct panthor_device {
 	/** @csif_info: Command stream interface information. */
 	struct drm_panthor_csif_info csif_info;
 
+	/** @hw: GPU specific data. */
+	struct panthor_hw *hw;
+
 	/** @gpu: GPU management data. */
 	struct panthor_gpu *gpu;
 
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
new file mode 100644
index 000000000000..234bfd50cf0d
--- /dev/null
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0 or MIT
+/* Copyright 2025 ARM Limited. All rights reserved. */
+
+#include "panthor_device.h"
+#include "panthor_hw.h"
+#include "panthor_regs.h"
+
+static struct panthor_hw panthor_hw_devices[] = {
+	{
+		.arch_id = GPU_ARCH_ID_MAKE(10, 0, 0),
+		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0, 0),
+	},
+};
+
+static int init_gpu_id(struct panthor_device *ptdev)
+{
+	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
+
+	if (!ptdev->gpu_info.gpu_id) {
+		drm_err(&ptdev->base, "Invalid GPU ID (0x0)");
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+int panthor_hw_init(struct panthor_device *ptdev)
+{
+	struct panthor_hw *hdev = NULL;
+	u32 arch_id = 0;
+	int i, ret;
+
+	ret = init_gpu_id(ptdev);
+	if (ret)
+		return ret;
+
+	arch_id = GPU_ARCH_ID_MAKE(GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id),
+				   GPU_ARCH_MINOR(ptdev->gpu_info.gpu_id),
+				   GPU_ARCH_REV(ptdev->gpu_info.gpu_id));
+	if (!arch_id) {
+		drm_err(&ptdev->base, "Invalid arch_id (0x0)");
+		return -ENXIO;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(panthor_hw_devices); i++) {
+		u32 mask = panthor_hw_devices[i].arch_mask;
+		u32 hw_arch_id = panthor_hw_devices[i].arch_id;
+
+		if ((arch_id & mask) == (hw_arch_id & mask)) {
+			hdev = &panthor_hw_devices[i];
+			break;
+		}
+	}
+
+	if (!hdev) {
+		drm_err(&ptdev->base, "Unsupported GPU (arch 0x%x)", arch_id);
+		return -ENODEV;
+	}
+
+	ptdev->hw = hdev;
+
+	return 0;
+}
+
+bool panthor_hw_supports(struct panthor_device *ptdev,
+			 enum panthor_hw_feature feature)
+{
+	return test_bit(feature, ptdev->hw->features);
+}
+
diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
new file mode 100644
index 000000000000..5eb0549ad333
--- /dev/null
+++ b/drivers/gpu/drm/panthor/panthor_hw.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 or MIT */
+/* Copyright 2025 ARM Limited. All rights reserved. */
+
+#ifndef __PANTHOR_HW_H__
+#define __PANTHOR_HW_H__
+
+#include <linux/types.h>
+#include <linux/bitmap.h>
+
+struct panthor_device;
+
+/**
+ * enum panthor_hw_feature - Bit position of each HW feature
+ *
+ * Used to define GPU specific features based on the GPU architecture ID.
+ * New feature flags will be added with support for newer GPU architectures.
+ */
+enum panthor_hw_feature {
+	/** @PANTHOR_HW_FEATURES_END: Must be last. */
+	PANTHOR_HW_FEATURES_END
+};
+
+/**
+ * struct panthor_hw_regmap - Register offsets for specific register blocks
+ */
+struct panthor_hw_regmap {
+
+};
+
+/**
+ * struct panthor_hw_ops - HW operations that are specific to a GPU
+ */
+struct panthor_hw_ops {
+
+};
+
+/**
+ * struct panthor_hw - GPU specific register mapping and functions
+ */
+struct panthor_hw {
+	/** @arch_id: Architecture id to match against */
+	u32 arch_id;
+
+	/** @arch_mask: Mask for architecture id comparison */
+	u32 arch_mask;
+
+	/** @features: Bitmap containing panthor_hw_feature */
+	DECLARE_BITMAP(features, PANTHOR_HW_FEATURES_END);
+
+	/** @map: Panthor regmap */
+	struct panthor_hw_regmap map;
+
+	/** @ops: Panthor HW specific operations */
+	struct panthor_hw_ops ops;
+};
+
+int panthor_hw_init(struct panthor_device *ptdev);
+
+bool panthor_hw_supports(struct panthor_device *ptdev,
+			 enum panthor_hw_feature feature);
+
+#endif /* __PANTHOR_HW_H__ */
+
diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
index 7ec4a1d04e20..ba452c1dd644 100644
--- a/drivers/gpu/drm/panthor/panthor_regs.h
+++ b/drivers/gpu/drm/panthor/panthor_regs.h
@@ -19,6 +19,8 @@
 #define   GPU_VER_MINOR(x)				(((x) & GENMASK(11, 4)) >> 4)
 #define   GPU_VER_STATUS(x)				((x) & GENMASK(3, 0))
 
+#define GPU_ARCH_ID_MAKE(major, minor, rev)		(((major) << 16) | ((minor) << 8) | (rev))
+
 #define GPU_L2_FEATURES					0x4
 #define  GPU_L2_FEATURES_LINE_SIZE(x)			(1 << ((x) & GENMASK(7, 0)))
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (2 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  8:16   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible Karunika Choo
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch moves GPU info initialization into panthor_hw.c in
preparation of handling GPU register changes. The GPU register reading
operations to populate gpu_info are separated into an architecture
specific arch_*_gpu_info_init() function and is called via the new
function pointer abstraction under hw.ops.gpu_info_init().

Future GPU support will be performed by implementing a *_gpu_info_init()
function specific to that architecture version. It can call any existing
*_gpu_info_init() and extend it with additional register reads or
provide an entirely different implementation.

This patch will enable Panthor to support GPUs with changes to register
offsets, size and fields.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_gpu.c |  95 -----------------------
 drivers/gpu/drm/panthor/panthor_hw.c  | 105 ++++++++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_hw.h  |   3 +-
 3 files changed, 107 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
index 0dee011fe2e9..fcdee8901482 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -37,40 +37,6 @@ struct panthor_gpu {
 	wait_queue_head_t reqs_acked;
 };
 
-/**
- * struct panthor_model - GPU model description
- */
-struct panthor_model {
-	/** @name: Model name. */
-	const char *name;
-
-	/** @arch_major: Major version number of architecture. */
-	u8 arch_major;
-
-	/** @product_major: Major version number of product. */
-	u8 product_major;
-};
-
-/**
- * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
- * by a combination of the major architecture version and the major product
- * version.
- * @_name: Name for the GPU model.
- * @_arch_major: Architecture major.
- * @_product_major: Product major.
- */
-#define GPU_MODEL(_name, _arch_major, _product_major) \
-{\
-	.name = __stringify(_name),				\
-	.arch_major = _arch_major,				\
-	.product_major = _product_major,			\
-}
-
-static const struct panthor_model gpu_models[] = {
-	GPU_MODEL(g610, 10, 7),
-	{},
-};
-
 #define GPU_INTERRUPTS_MASK	\
 	(GPU_IRQ_FAULT | \
 	 GPU_IRQ_PROTM_FAULT | \
@@ -83,66 +49,6 @@ static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
 		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
 }
 
-static void panthor_gpu_init_info(struct panthor_device *ptdev)
-{
-	const struct panthor_model *model;
-	u32 arch_major, product_major;
-	u32 major, minor, status;
-	unsigned int i;
-
-	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
-	ptdev->gpu_info.csf_id = gpu_read(ptdev, GPU_CSF_ID);
-	ptdev->gpu_info.gpu_rev = gpu_read(ptdev, GPU_REVID);
-	ptdev->gpu_info.core_features = gpu_read(ptdev, GPU_CORE_FEATURES);
-	ptdev->gpu_info.l2_features = gpu_read(ptdev, GPU_L2_FEATURES);
-	ptdev->gpu_info.tiler_features = gpu_read(ptdev, GPU_TILER_FEATURES);
-	ptdev->gpu_info.mem_features = gpu_read(ptdev, GPU_MEM_FEATURES);
-	ptdev->gpu_info.mmu_features = gpu_read(ptdev, GPU_MMU_FEATURES);
-	ptdev->gpu_info.thread_features = gpu_read(ptdev, GPU_THREAD_FEATURES);
-	ptdev->gpu_info.max_threads = gpu_read(ptdev, GPU_THREAD_MAX_THREADS);
-	ptdev->gpu_info.thread_max_workgroup_size = gpu_read(ptdev, GPU_THREAD_MAX_WORKGROUP_SIZE);
-	ptdev->gpu_info.thread_max_barrier_size = gpu_read(ptdev, GPU_THREAD_MAX_BARRIER_SIZE);
-	ptdev->gpu_info.coherency_features = gpu_read(ptdev, GPU_COHERENCY_FEATURES);
-	for (i = 0; i < 4; i++)
-		ptdev->gpu_info.texture_features[i] = gpu_read(ptdev, GPU_TEXTURE_FEATURES(i));
-
-	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
-
-	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
-	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
-	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
-
-	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
-	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
-	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
-	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
-	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
-
-	for (model = gpu_models; model->name; model++) {
-		if (model->arch_major == arch_major &&
-		    model->product_major == product_major)
-			break;
-	}
-
-	drm_info(&ptdev->base,
-		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
-		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
-		 major, minor, status);
-
-	drm_info(&ptdev->base,
-		 "Features: L2:%#x Tiler:%#x Mem:%#x MMU:%#x AS:%#x",
-		 ptdev->gpu_info.l2_features,
-		 ptdev->gpu_info.tiler_features,
-		 ptdev->gpu_info.mem_features,
-		 ptdev->gpu_info.mmu_features,
-		 ptdev->gpu_info.as_present);
-
-	drm_info(&ptdev->base,
-		 "shader_present=0x%0llx l2_present=0x%0llx tiler_present=0x%0llx",
-		 ptdev->gpu_info.shader_present, ptdev->gpu_info.l2_present,
-		 ptdev->gpu_info.tiler_present);
-}
-
 static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
 {
 	if (status & GPU_IRQ_FAULT) {
@@ -203,7 +109,6 @@ int panthor_gpu_init(struct panthor_device *ptdev)
 	spin_lock_init(&gpu->reqs_lock);
 	init_waitqueue_head(&gpu->reqs_acked);
 	ptdev->gpu = gpu;
-	panthor_gpu_init_info(ptdev);
 
 	dma_set_max_seg_size(ptdev->base.dev, UINT_MAX);
 	pa_bits = GPU_MMU_FEATURES_PA_BITS(ptdev->gpu_info.mmu_features);
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index 234bfd50cf0d..4cc4b0d5382c 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -5,10 +5,113 @@
 #include "panthor_hw.h"
 #include "panthor_regs.h"
 
+/**
+ * struct panthor_model - GPU model description
+ */
+struct panthor_model {
+	/** @name: Model name. */
+	const char *name;
+
+	/** @arch_major: Major version number of architecture. */
+	u8 arch_major;
+
+	/** @product_major: Major version number of product. */
+	u8 product_major;
+};
+
+/**
+ * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
+ * by a combination of the major architecture version and the major product
+ * version.
+ * @_name: Name for the GPU model.
+ * @_arch_major: Architecture major.
+ * @_product_major: Product major.
+ */
+#define GPU_MODEL(_name, _arch_major, _product_major) \
+{\
+	.name = __stringify(_name),				\
+	.arch_major = _arch_major,				\
+	.product_major = _product_major,			\
+}
+
+static const struct panthor_model gpu_models[] = {
+	GPU_MODEL(g610, 10, 7),
+	{},
+};
+
+static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
+{
+	unsigned int i;
+
+	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
+	ptdev->gpu_info.csf_id = gpu_read(ptdev, GPU_CSF_ID);
+	ptdev->gpu_info.gpu_rev = gpu_read(ptdev, GPU_REVID);
+	ptdev->gpu_info.core_features = gpu_read(ptdev, GPU_CORE_FEATURES);
+	ptdev->gpu_info.l2_features = gpu_read(ptdev, GPU_L2_FEATURES);
+	ptdev->gpu_info.tiler_features = gpu_read(ptdev, GPU_TILER_FEATURES);
+	ptdev->gpu_info.mem_features = gpu_read(ptdev, GPU_MEM_FEATURES);
+	ptdev->gpu_info.mmu_features = gpu_read(ptdev, GPU_MMU_FEATURES);
+	ptdev->gpu_info.thread_features = gpu_read(ptdev, GPU_THREAD_FEATURES);
+	ptdev->gpu_info.max_threads = gpu_read(ptdev, GPU_THREAD_MAX_THREADS);
+	ptdev->gpu_info.thread_max_workgroup_size = gpu_read(ptdev, GPU_THREAD_MAX_WORKGROUP_SIZE);
+	ptdev->gpu_info.thread_max_barrier_size = gpu_read(ptdev, GPU_THREAD_MAX_BARRIER_SIZE);
+	ptdev->gpu_info.coherency_features = gpu_read(ptdev, GPU_COHERENCY_FEATURES);
+	for (i = 0; i < 4; i++)
+		ptdev->gpu_info.texture_features[i] = gpu_read(ptdev, GPU_TEXTURE_FEATURES(i));
+
+	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
+
+	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
+	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
+	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
+}
+
+static void panthor_gpu_init_info(struct panthor_device *ptdev)
+{
+	const struct panthor_model *model;
+	u32 arch_major, product_major;
+	u32 major, minor, status;
+
+	ptdev->hw->ops.gpu_info_init(ptdev);
+
+	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
+	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
+	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
+	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
+	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
+
+	for (model = gpu_models; model->name; model++) {
+		if (model->arch_major == arch_major &&
+		    model->product_major == product_major)
+			break;
+	}
+
+	drm_info(&ptdev->base,
+		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
+		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
+		 major, minor, status);
+
+	drm_info(&ptdev->base,
+		 "Features: L2:%#x Tiler:%#x Mem:%#x MMU:%#x AS:%#x",
+		 ptdev->gpu_info.l2_features,
+		 ptdev->gpu_info.tiler_features,
+		 ptdev->gpu_info.mem_features,
+		 ptdev->gpu_info.mmu_features,
+		 ptdev->gpu_info.as_present);
+
+	drm_info(&ptdev->base,
+		 "shader_present=0x%0llx l2_present=0x%0llx tiler_present=0x%0llx",
+		 ptdev->gpu_info.shader_present, ptdev->gpu_info.l2_present,
+		 ptdev->gpu_info.tiler_present);
+}
+
 static struct panthor_hw panthor_hw_devices[] = {
 	{
 		.arch_id = GPU_ARCH_ID_MAKE(10, 0, 0),
 		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0, 0),
+		.ops = {
+			.gpu_info_init = arch_10_8_gpu_info_init,
+		},
 	},
 };
 
@@ -59,6 +162,8 @@ int panthor_hw_init(struct panthor_device *ptdev)
 
 	ptdev->hw = hdev;
 
+	panthor_gpu_init_info(ptdev);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
index 5eb0549ad333..dfe0f86c5d76 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.h
+++ b/drivers/gpu/drm/panthor/panthor_hw.h
@@ -31,7 +31,8 @@ struct panthor_hw_regmap {
  * struct panthor_hw_ops - HW operations that are specific to a GPU
  */
 struct panthor_hw_ops {
-
+	/** @gpu_info_init: Function pointer to initialize GPU info. */
+	void (*gpu_info_init)(struct panthor_device *ptdev);
 };
 
 /**
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (3 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  8:02   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs Karunika Choo
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch replaces the previous panthor_model structure with a simple
switch case based on the product_id, which is in the format of:
        ((arch_major << 24) | product_major)

This not only simplifies the comparison, but also allows extending the
function to accommodate naming differences based on GPU features.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_hw.c   | 63 +++++++-------------------
 drivers/gpu/drm/panthor/panthor_regs.h |  1 +
 2 files changed, 18 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index 4cc4b0d5382c..12183c04cd21 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -5,40 +5,6 @@
 #include "panthor_hw.h"
 #include "panthor_regs.h"
 
-/**
- * struct panthor_model - GPU model description
- */
-struct panthor_model {
-	/** @name: Model name. */
-	const char *name;
-
-	/** @arch_major: Major version number of architecture. */
-	u8 arch_major;
-
-	/** @product_major: Major version number of product. */
-	u8 product_major;
-};
-
-/**
- * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
- * by a combination of the major architecture version and the major product
- * version.
- * @_name: Name for the GPU model.
- * @_arch_major: Architecture major.
- * @_product_major: Product major.
- */
-#define GPU_MODEL(_name, _arch_major, _product_major) \
-{\
-	.name = __stringify(_name),				\
-	.arch_major = _arch_major,				\
-	.product_major = _product_major,			\
-}
-
-static const struct panthor_model gpu_models[] = {
-	GPU_MODEL(g610, 10, 7),
-	{},
-};
-
 static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
 {
 	unsigned int i;
@@ -66,29 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
 	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
 }
 
+static char *get_gpu_model_name(struct panthor_device *ptdev)
+{
+	const u32 gpu_id = ptdev->gpu_info.gpu_id;
+	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
+						GPU_PROD_MAJOR(gpu_id));
+
+	switch (product_id) {
+	case GPU_PROD_ID_MAKE(10, 7):
+		return "Mali-G610";
+	}
+
+	return "(Unknown Mali GPU)";
+}
+
 static void panthor_gpu_init_info(struct panthor_device *ptdev)
 {
-	const struct panthor_model *model;
-	u32 arch_major, product_major;
+	const char *gpu_model_name = get_gpu_model_name(ptdev);
 	u32 major, minor, status;
 
 	ptdev->hw->ops.gpu_info_init(ptdev);
 
-	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
-	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
 	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
 	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
 	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
 
-	for (model = gpu_models; model->name; model++) {
-		if (model->arch_major == arch_major &&
-		    model->product_major == product_major)
-			break;
-	}
-
 	drm_info(&ptdev->base,
-		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
-		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
+		 "%s id 0x%x major 0x%x minor 0x%x status 0x%x",
+		 gpu_model_name, ptdev->gpu_info.gpu_id >> 16,
 		 major, minor, status);
 
 	drm_info(&ptdev->base,
diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
index ba452c1dd644..d9e0769d6f1a 100644
--- a/drivers/gpu/drm/panthor/panthor_regs.h
+++ b/drivers/gpu/drm/panthor/panthor_regs.h
@@ -20,6 +20,7 @@
 #define   GPU_VER_STATUS(x)				((x) & GENMASK(3, 0))
 
 #define GPU_ARCH_ID_MAKE(major, minor, rev)		(((major) << 16) | ((minor) << 8) | (rev))
+#define GPU_PROD_ID_MAKE(arch_major, prod_major)	(((arch_major) << 24) | (prod_major))
 
 #define GPU_L2_FEATURES					0x4
 #define  GPU_L2_FEATURES_LINE_SIZE(x)			(1 << ((x) & GENMASK(7, 0)))
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (4 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  8:34   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit Karunika Choo
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

Mali-G715 introduces a new GPU_FEATURES register that provides
information about GPU-wide supported features. The register value will
be passed on to userspace via gpu_info. It also adds the following
registers that are specific to the kernel driver only:
- ASN_HASH_0~2
- DOORBELL_FEATURES
- PRFCNT_FEATURES
- SYSC_ALLOC0~7
- SYSC_PBHA_OVERRIDE0~3

Additionally, Mali-G715 presents an 'Immortalis' naming variant
depending on the shader core count and presence of Ray Intersection
feature support.

This patch adds:
- support for correctly identifying the model names for the Mali-G715
  family of GPUs.
- arch 11.8 FW binary support
- reading and handling of GPU_FEATURES register

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c   |  1 +
 drivers/gpu/drm/panthor/panthor_hw.c   | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_regs.h | 12 ++++++++++++
 include/uapi/drm/panthor_drm.h         |  3 +++
 4 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index ecfbe0456f89..0b3fab95f26b 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1398,3 +1398,4 @@ int panthor_fw_init(struct panthor_device *ptdev)
 }
 
 MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin");
+MODULE_FIRMWARE("arm/mali/arch11.8/mali_csffw.bin");
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index 12183c04cd21..d04c8723ac98 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -32,15 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
 	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
 }
 
+static void arch_11_8_gpu_info_init(struct panthor_device *ptdev)
+{
+	arch_10_8_gpu_info_init(ptdev);
+
+	ptdev->gpu_info.gpu_features = gpu_read64(ptdev, GPU_FEATURES_LO);
+}
+
 static char *get_gpu_model_name(struct panthor_device *ptdev)
 {
 	const u32 gpu_id = ptdev->gpu_info.gpu_id;
 	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
 						GPU_PROD_MAJOR(gpu_id));
+	const bool ray_intersection = !!(ptdev->gpu_info.gpu_features &
+					 GPU_FEATURES_RAY_INTERSECTION);
+	const u8 shader_core_count = hweight64(ptdev->gpu_info.shader_present);
 
 	switch (product_id) {
 	case GPU_PROD_ID_MAKE(10, 7):
 		return "Mali-G610";
+	case GPU_PROD_ID_MAKE(11, 2):
+		if (shader_core_count > 10 && ray_intersection)
+			return "Mali-G715-Immortalis";
+		else if (shader_core_count >= 7)
+			return "Mali-G715";
+
+		fallthrough;
+	case GPU_PROD_ID_MAKE(11, 3):
+		return "Mali-G615";
 	}
 
 	return "(Unknown Mali GPU)";
@@ -84,6 +103,13 @@ static struct panthor_hw panthor_hw_devices[] = {
 			.gpu_info_init = arch_10_8_gpu_info_init,
 		},
 	},
+	{
+		.arch_id = GPU_ARCH_ID_MAKE(11, 8, 0),
+		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0xFF, 0),
+		.ops = {
+			.gpu_info_init = arch_11_8_gpu_info_init,
+		}
+	},
 };
 
 static int init_gpu_id(struct panthor_device *ptdev)
diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
index d9e0769d6f1a..7bc2d838e704 100644
--- a/drivers/gpu/drm/panthor/panthor_regs.h
+++ b/drivers/gpu/drm/panthor/panthor_regs.h
@@ -74,6 +74,11 @@
 #define GPU_PWR_OVERRIDE0				0x54
 #define GPU_PWR_OVERRIDE1				0x58
 
+#define GPU_FEATURES_LO					0x60
+#define GPU_FEATURES_HI					0x64
+#define   GPU_FEATURES_RAY_INTERSECTION			BIT(2)
+#define GPU_PRFCNT_FEATURES				0x68
+
 #define GPU_TIMESTAMP_OFFSET_LO				0x88
 #define GPU_TIMESTAMP_OFFSET_HI				0x8C
 #define GPU_CYCLE_COUNT_LO				0x90
@@ -88,6 +93,8 @@
 
 #define GPU_TEXTURE_FEATURES(n)				(0xB0 + ((n) * 4))
 
+#define GPU_DOORBELL_FEATURES				0xC0
+
 #define GPU_SHADER_PRESENT_LO				0x100
 #define GPU_SHADER_PRESENT_HI				0x104
 #define GPU_TILER_PRESENT_LO				0x110
@@ -132,6 +139,8 @@
 
 #define GPU_REVID					0x280
 
+#define GPU_ASN_HASH(n)					(0x2C0 + ((n) * 4))
+
 #define GPU_COHERENCY_FEATURES				0x300
 #define GPU_COHERENCY_PROT_BIT(name)			BIT(GPU_COHERENCY_  ## name)
 
@@ -140,6 +149,9 @@
 #define   GPU_COHERENCY_ACE_LITE			1
 #define   GPU_COHERENCY_NONE				31
 
+#define GPU_SYSC_PBHA_OVERRIDE(n)			(0x320 + ((n) * 4))
+#define GPU_SYSC_ALLOC(n)				(0x340 + ((n) * 4))
+
 #define MCU_CONTROL					0x700
 #define MCU_CONTROL_ENABLE				1
 #define MCU_CONTROL_AUTO				2
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
index 97e2c4510e69..4aba8146af3b 100644
--- a/include/uapi/drm/panthor_drm.h
+++ b/include/uapi/drm/panthor_drm.h
@@ -307,6 +307,9 @@ struct drm_panthor_gpu_info {
 
 	/** @pad: MBZ. */
 	__u32 pad;
+
+	/** @gpu_features: Bitmask describing supported GPU-wide features */
+	__u64 gpu_features;
 };
 
 /**
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (5 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-21  8:41   ` Boris Brezillon
  2025-03-20 11:17 ` [PATCH v2 8/9] drm/panthor: Add support for Mali-G720 and Mali-G725 GPUs Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310 Karunika Choo
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

As the FLUSH_MEM and FLUSH_PT commands are deprecated in GPUs from
Mali-G720 onwards, this patch adds support for performing cache
maintenance via the FLUSH_CACHES command in GPU_CONTROL, in place of
FLUSH_MEM and FLUSH_PT based on PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH
feature bit.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_hw.h  |  6 +++++
 drivers/gpu/drm/panthor/panthor_mmu.c | 35 +++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
index dfe0f86c5d76..4d67fdfe86f9 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.h
+++ b/drivers/gpu/drm/panthor/panthor_hw.h
@@ -16,6 +16,12 @@ struct panthor_device;
  * New feature flags will be added with support for newer GPU architectures.
  */
 enum panthor_hw_feature {
+	/**
+	 * @PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH: Perform cache maintenance
+	 * via GPU_CONTROL.
+	 */
+	PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH,
+
 	/** @PANTHOR_HW_FEATURES_END: Must be last. */
 	PANTHOR_HW_FEATURES_END
 };
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index a0a79f19bdea..4ac8bf36177e 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -29,7 +29,9 @@
 
 #include "panthor_device.h"
 #include "panthor_gem.h"
+#include "panthor_gpu.h"
 #include "panthor_heap.h"
+#include "panthor_hw.h"
 #include "panthor_mmu.h"
 #include "panthor_regs.h"
 #include "panthor_sched.h"
@@ -568,6 +570,35 @@ static void lock_region(struct panthor_device *ptdev, u32 as_nr,
 	write_cmd(ptdev, as_nr, AS_COMMAND_LOCK);
 }
 
+static int mmu_hw_do_flush_on_gpu_ctrl(struct panthor_device *ptdev, int as_nr,
+				       u32 op)
+{
+	const u32 l2_flush_op = CACHE_CLEAN | CACHE_INV;
+	u32 lsc_flush_op = 0;
+	int ret;
+
+	if (op == AS_COMMAND_FLUSH_MEM)
+		lsc_flush_op = CACHE_CLEAN | CACHE_INV;
+
+	ret = wait_ready(ptdev, as_nr);
+	if (ret)
+		return ret;
+
+	ret = panthor_gpu_flush_caches(ptdev, l2_flush_op, lsc_flush_op, 0);
+	if (ret)
+		return ret;
+
+	/*
+	 * Explicitly unlock the region as the AS is not unlocked automatically
+	 * at the end of the GPU_CONTROL cache flush command, unlike
+	 * AS_COMMAND_FLUSH_MEM or AS_COMMAND_FLUSH_PT.
+	 */
+	write_cmd(ptdev, as_nr, AS_COMMAND_UNLOCK);
+
+	/* Wait for the unlock command to complete */
+	return wait_ready(ptdev, as_nr);
+}
+
 static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
 				      u64 iova, u64 size, u32 op)
 {
@@ -585,6 +616,10 @@ static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
 	if (op != AS_COMMAND_UNLOCK)
 		lock_region(ptdev, as_nr, iova, size);
 
+	if (panthor_hw_supports(ptdev,PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH))
+		if (op == AS_COMMAND_FLUSH_MEM || op == AS_COMMAND_FLUSH_PT)
+			return mmu_hw_do_flush_on_gpu_ctrl(ptdev, as_nr, op);
+
 	/* Run the MMU operation */
 	write_cmd(ptdev, as_nr, op);
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 8/9] drm/panthor: Add support for Mali-G720 and Mali-G725 GPUs
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (6 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-20 11:17 ` [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310 Karunika Choo
  8 siblings, 0 replies; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

Mali-G720 and Mali-G725 deprecates the use of FLUSH_MEM and FLUSH_PT
MMU_AS commands in favour of cache maintenance via
GPU_COMMAND's FLUSH_CACHES and FLUSH_PA_RANGE.

They also introduce the following registers:
- GPU_COMMAND_ARG0~1
- SHADER_PWRFEATURES
- AMBA_FEATURES
- AMBA_ENABLE

This patch enables FLUSH_CACHES for both families of GPUs via the
PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH bit until FLUSH_PA_RANGE support
is added. It also adds the aforementioned register definitions and
firmware binary support for arch 12.8 and 13.8.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c   |  2 ++
 drivers/gpu/drm/panthor/panthor_hw.c   | 38 ++++++++++++++++++++++++++
 drivers/gpu/drm/panthor/panthor_regs.h | 13 +++++++++
 3 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 0b3fab95f26b..8a967af0e2b4 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1399,3 +1399,5 @@ int panthor_fw_init(struct panthor_device *ptdev)
 
 MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin");
 MODULE_FIRMWARE("arm/mali/arch11.8/mali_csffw.bin");
+MODULE_FIRMWARE("arm/mali/arch12.8/mali_csffw.bin");
+MODULE_FIRMWARE("arm/mali/arch13.8/mali_csffw.bin");
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index d04c8723ac98..e6354304bbef 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -60,6 +60,24 @@ static char *get_gpu_model_name(struct panthor_device *ptdev)
 		fallthrough;
 	case GPU_PROD_ID_MAKE(11, 3):
 		return "Mali-G615";
+	case GPU_PROD_ID_MAKE(12, 0):
+		if (shader_core_count >= 10 && ray_intersection)
+			return "Mali-G720-Immortalis";
+		else if (shader_core_count >= 6)
+			return "Mali-G720";
+
+		fallthrough;
+	case GPU_PROD_ID_MAKE(12, 1):
+		return "Mali-G620";
+	case GPU_PROD_ID_MAKE(13, 0):
+		if (shader_core_count >= 10 && ray_intersection)
+			return "Mali-G925-Immortalis";
+		else if (shader_core_count >= 6)
+			return "Mali-G725";
+
+		fallthrough;
+	case GPU_PROD_ID_MAKE(13, 1):
+		return "Mali-G625";
 	}
 
 	return "(Unknown Mali GPU)";
@@ -110,6 +128,26 @@ static struct panthor_hw panthor_hw_devices[] = {
 			.gpu_info_init = arch_11_8_gpu_info_init,
 		}
 	},
+	{
+		.arch_id = GPU_ARCH_ID_MAKE(12, 8, 0),
+		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0xFF, 0),
+		.features = {
+			BIT(PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH)
+		},
+		.ops = {
+			.gpu_info_init = arch_11_8_gpu_info_init,
+		}
+	},
+	{
+		.arch_id = GPU_ARCH_ID_MAKE(13, 8, 0),
+		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0xFF, 0),
+		.features = {
+			BIT(PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH)
+		},
+		.ops = {
+			.gpu_info_init = arch_11_8_gpu_info_init,
+		}
+	},
 };
 
 static int init_gpu_id(struct panthor_device *ptdev)
diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
index 7bc2d838e704..283e758ac86f 100644
--- a/drivers/gpu/drm/panthor/panthor_regs.h
+++ b/drivers/gpu/drm/panthor/panthor_regs.h
@@ -95,6 +95,9 @@
 
 #define GPU_DOORBELL_FEATURES				0xC0
 
+#define GPU_COMMAND_ARG_LO(n)				(0xD0 + ((n) * 8))
+#define GPU_COMMAND_ARG_HI(n)				(0xD4 + ((n) * 8))
+
 #define GPU_SHADER_PRESENT_LO				0x100
 #define GPU_SHADER_PRESENT_HI				0x104
 #define GPU_TILER_PRESENT_LO				0x110
@@ -111,6 +114,8 @@
 
 #define SHADER_PWRON_LO					0x180
 #define SHADER_PWRON_HI					0x184
+#define SHADER_PWRFEATURES				0x188
+#define   SHADER_PWRFEATURES_RAY_TRACING_UNIT		BIT(0)
 #define TILER_PWRON_LO					0x190
 #define TILER_PWRON_HI					0x194
 #define L2_PWRON_LO					0x1A0
@@ -149,6 +154,13 @@
 #define   GPU_COHERENCY_ACE_LITE			1
 #define   GPU_COHERENCY_NONE				31
 
+#define AMBA_FEATURES					0x300
+#define   AMBA_FEATURES_ACE_LITE			BIT(0)
+#define   AMBA_FEATURES_ACE				BIT(1)
+#define   AMBA_FEATURES_SHAREABLE_CACHE_SUPPORT		BIT(5)
+
+#define AMBA_ENABLE					0x304
+
 #define GPU_SYSC_PBHA_OVERRIDE(n)			(0x320 + ((n) * 4))
 #define GPU_SYSC_ALLOC(n)				(0x340 + ((n) * 4))
 
@@ -162,6 +174,7 @@
 #define MCU_STATUS_ENABLED				1
 #define MCU_STATUS_HALT					2
 #define MCU_STATUS_FATAL				3
+#define MCU_FEATURES					0x708
 
 /* Job Control regs */
 #define JOB_INT_RAWSTAT					0x1000
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310
  2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
                   ` (7 preceding siblings ...)
  2025-03-20 11:17 ` [PATCH v2 8/9] drm/panthor: Add support for Mali-G720 and Mali-G725 GPUs Karunika Choo
@ 2025-03-20 11:17 ` Karunika Choo
  2025-03-20 19:03   ` Liviu Dudau
  8 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-03-20 11:17 UTC (permalink / raw)
  To: dri-devel
  Cc: nd, Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

This patch adds GPU model name and FW binary support for Mali-G710,
Mali-G510, and Mali-G310.

Signed-off-by: Karunika Choo <karunika.choo@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c | 2 ++
 drivers/gpu/drm/panthor/panthor_hw.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 8a967af0e2b4..7050adfaa8b6 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1398,6 +1398,8 @@ int panthor_fw_init(struct panthor_device *ptdev)
 }
 
 MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin");
+MODULE_FIRMWARE("arm/mali/arch10.10/mali_csffw.bin");
+MODULE_FIRMWARE("arm/mali/arch10.12/mali_csffw.bin");
 MODULE_FIRMWARE("arm/mali/arch11.8/mali_csffw.bin");
 MODULE_FIRMWARE("arm/mali/arch12.8/mali_csffw.bin");
 MODULE_FIRMWARE("arm/mali/arch13.8/mali_csffw.bin");
diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
index e6354304bbef..6f18b7420f90 100644
--- a/drivers/gpu/drm/panthor/panthor_hw.c
+++ b/drivers/gpu/drm/panthor/panthor_hw.c
@@ -49,8 +49,14 @@ static char *get_gpu_model_name(struct panthor_device *ptdev)
 	const u8 shader_core_count = hweight64(ptdev->gpu_info.shader_present);
 
 	switch (product_id) {
+	case GPU_PROD_ID_MAKE(10, 2):
+		return "Mali-G710";
 	case GPU_PROD_ID_MAKE(10, 7):
 		return "Mali-G610";
+	case GPU_PROD_ID_MAKE(10, 3):
+		return "Mali-G510";
+	case GPU_PROD_ID_MAKE(10, 4):
+		return "Mali-G310";
 	case GPU_PROD_ID_MAKE(11, 2):
 		if (shader_core_count > 10 && ray_intersection)
 			return "Mali-G715-Immortalis";
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310
  2025-03-20 11:17 ` [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310 Karunika Choo
@ 2025-03-20 19:03   ` Liviu Dudau
  0 siblings, 0 replies; 26+ messages in thread
From: Liviu Dudau @ 2025-03-20 19:03 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Boris Brezillon, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, Mar 20, 2025 at 11:17:41AM +0000, Karunika Choo wrote:
> This patch adds GPU model name and FW binary support for Mali-G710,
> Mali-G510, and Mali-G310.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_fw.c | 2 ++
>  drivers/gpu/drm/panthor/panthor_hw.c | 6 ++++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 8a967af0e2b4..7050adfaa8b6 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1398,6 +1398,8 @@ int panthor_fw_init(struct panthor_device *ptdev)
>  }
>  
>  MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin");
> +MODULE_FIRMWARE("arm/mali/arch10.10/mali_csffw.bin");
> +MODULE_FIRMWARE("arm/mali/arch10.12/mali_csffw.bin");
>  MODULE_FIRMWARE("arm/mali/arch11.8/mali_csffw.bin");
>  MODULE_FIRMWARE("arm/mali/arch12.8/mali_csffw.bin");
>  MODULE_FIRMWARE("arm/mali/arch13.8/mali_csffw.bin");

Firmware available for testing here:

https://gitlab.com/dliviu/linux-firmware

Best regards,
Liviu

> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> index e6354304bbef..6f18b7420f90 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.c
> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> @@ -49,8 +49,14 @@ static char *get_gpu_model_name(struct panthor_device *ptdev)
>  	const u8 shader_core_count = hweight64(ptdev->gpu_info.shader_present);
>  
>  	switch (product_id) {
> +	case GPU_PROD_ID_MAKE(10, 2):
> +		return "Mali-G710";
>  	case GPU_PROD_ID_MAKE(10, 7):
>  		return "Mali-G610";
> +	case GPU_PROD_ID_MAKE(10, 3):
> +		return "Mali-G510";
> +	case GPU_PROD_ID_MAKE(10, 4):
> +		return "Mali-G310";
>  	case GPU_PROD_ID_MAKE(11, 2):
>  		if (shader_core_count > 10 && ray_intersection)
>  			return "Mali-G715-Immortalis";
> -- 
> 2.47.1
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors
  2025-03-20 11:17 ` [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors Karunika Choo
@ 2025-03-21  7:48   ` Boris Brezillon
  2025-04-09 13:00     ` Karunika Choo
  0 siblings, 1 reply; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  7:48 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:33 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> This patch adds 64-bit register accessors to simplify register access in
> Panthor. It also adds 32-bit and 64-bit variants for read_poll_timeout.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_regs.h | 55 ++++++++++++++++++++++++++
>  1 file changed, 55 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
> index 42dc3fedb0d4..7ec4a1d04e20 100644
> --- a/drivers/gpu/drm/panthor/panthor_regs.h
> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
> @@ -238,4 +238,59 @@
>  #define gpu_read(dev, reg) \
>  	readl((dev)->iomem + (reg))
>  
> +#define gpu_read_relaxed(dev, reg) readl_relaxed((dev)->iomem + (reg))
> +
> +#define gpu_write64(dev, reg, data)                            \
> +	do {                                                   \
> +		u64 __val = (u64)(data);                       \
> +		gpu_write(dev, reg, lower_32_bits(__val));     \
> +		gpu_write(dev, reg + 4, upper_32_bits(__val)); \
> +	} while (0)

We're not doing funky name concatenation in these macros, so I'd rather
have them defined as static inline funcs in panthor_device.h. We
probably want to move the gpu_read/write definitions there as well if
we do that.

> +
> +#define gpu_read64(dev, reg) \
> +	(gpu_read(dev, reg) | ((u64)gpu_read(dev, reg + 4) << 32))
> +
> +#define gpu_read64_relaxed(dev, reg)  \
> +	(gpu_read_relaxed(dev, reg) | \
> +	 ((u64)gpu_read_relaxed(dev, reg + 4) << 32))
> +
> +#define gpu_read64_sync(dev, reg)                     \
> +	({                                            \
> +		u32 lo, hi1, hi2;                     \
> +		do {                                  \
> +			hi1 = gpu_read(dev, reg + 4); \
> +			lo = gpu_read(dev, reg);      \
> +			hi2 = gpu_read(dev, reg + 4); \
> +		} while (hi1 != hi2);                 \
> +		lo | ((u64)hi2 << 32);                \
> +	})

I would name that one gpu_read64_counter and make it a static inline
function. Note that we already have panthor_gpu_read_64bit_counter()
which does the same thing, so maybe move it there and rename it along
the way.

> +
> +#define gpu_read_poll_timeout(dev, reg, val, cond, delay_us, timeout_us)    \
> +	read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
> +			  dev, reg)
> +
> +#define gpu_read_poll_timeout_atomic(dev, reg, val, cond, delay_us,         \
> +				     timeout_us)                            \
> +	read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
> +				 false, dev, reg)
> +
> +#define gpu_read64_poll_timeout(dev, reg, val, cond, delay_us, timeout_us)    \
> +	read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
> +			  dev, reg)
> +
> +#define gpu_read64_poll_timeout_atomic(dev, reg, val, cond, delay_us,         \
> +				       timeout_us)                            \
> +	read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
> +				 false, dev, reg)
> +
> +#define gpu_read_relaxed_poll_timeout_atomic(dev, reg, val, cond, delay_us, \
> +					     timeout_us)                    \
> +	read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us,     \
> +				 timeout_us, false, dev, reg)
> +
> +#define gpu_read64_relaxed_poll_timeout(dev, reg, val, cond, delay_us,         \
> +					timeout_us)                            \
> +	read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \
> +			  false, dev, reg)
> +
>  #endif


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] drm/panthor: Use 64-bit and poll register accessors
  2025-03-20 11:17 ` [PATCH v2 2/9] drm/panthor: Use " Karunika Choo
@ 2025-03-21  7:53   ` Boris Brezillon
  2025-04-09 13:07     ` Karunika Choo
  0 siblings, 1 reply; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  7:53 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:34 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> This patch updates Panthor to use the new 64-bit accessors and poll
> functions.

nit: I don't think it makes sense to dissociate the introduction of the
new helpers and their use. Could we squash this patch into the previous
one?

> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_fw.c  |   9 +-
>  drivers/gpu/drm/panthor/panthor_gpu.c | 142 +++++++-------------------
>  drivers/gpu/drm/panthor/panthor_mmu.c |  34 ++----
>  3 files changed, 53 insertions(+), 132 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 0f52766a3120..ecfbe0456f89 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1059,8 +1059,8 @@ static void panthor_fw_stop(struct panthor_device *ptdev)
>  	u32 status;
>  
>  	gpu_write(ptdev, MCU_CONTROL, MCU_CONTROL_DISABLE);
> -	if (readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
> -			       status == MCU_STATUS_DISABLED, 10, 100000))
> +	if (gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
> +				  status == MCU_STATUS_DISABLED, 10, 100000))
>  		drm_err(&ptdev->base, "Failed to stop MCU");
>  }
>  
> @@ -1085,8 +1085,9 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
>  
>  		panthor_fw_update_reqs(glb_iface, req, GLB_HALT, GLB_HALT);
>  		gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
> -		if (!readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
> -					status == MCU_STATUS_HALT, 10, 100000)) {
> +		if (!gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
> +					   status == MCU_STATUS_HALT, 10,
> +					   100000)) {
>  			ptdev->reset.fast = true;
>  		} else {
>  			drm_warn(&ptdev->base, "Failed to cleanly suspend MCU");
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 671049020afa..0dee011fe2e9 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -108,14 +108,9 @@ static void panthor_gpu_init_info(struct panthor_device *ptdev)
>  
>  	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
>  
> -	ptdev->gpu_info.shader_present = gpu_read(ptdev, GPU_SHADER_PRESENT_LO);
> -	ptdev->gpu_info.shader_present |= (u64)gpu_read(ptdev, GPU_SHADER_PRESENT_HI) << 32;
> -
> -	ptdev->gpu_info.tiler_present = gpu_read(ptdev, GPU_TILER_PRESENT_LO);
> -	ptdev->gpu_info.tiler_present |= (u64)gpu_read(ptdev, GPU_TILER_PRESENT_HI) << 32;
> -
> -	ptdev->gpu_info.l2_present = gpu_read(ptdev, GPU_L2_PRESENT_LO);
> -	ptdev->gpu_info.l2_present |= (u64)gpu_read(ptdev, GPU_L2_PRESENT_HI) << 32;
> +	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
> +	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
> +	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
>  
>  	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
>  	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
> @@ -152,8 +147,7 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
>  {
>  	if (status & GPU_IRQ_FAULT) {
>  		u32 fault_status = gpu_read(ptdev, GPU_FAULT_STATUS);
> -		u64 address = ((u64)gpu_read(ptdev, GPU_FAULT_ADDR_HI) << 32) |
> -			      gpu_read(ptdev, GPU_FAULT_ADDR_LO);
> +		u64 address = gpu_read64(ptdev, GPU_FAULT_ADDR_LO);
>  
>  		drm_warn(&ptdev->base, "GPU Fault 0x%08x (%s) at 0x%016llx\n",
>  			 fault_status, panthor_exception_name(ptdev, fault_status & 0xFF),
> @@ -244,45 +238,27 @@ int panthor_gpu_block_power_off(struct panthor_device *ptdev,
>  				u32 pwroff_reg, u32 pwrtrans_reg,
>  				u64 mask, u32 timeout_us)
>  {
> -	u32 val, i;
> +	u32 val;
>  	int ret;
>  
> -	for (i = 0; i < 2; i++) {
> -		u32 mask32 = mask >> (i * 32);
> -
> -		if (!mask32)
> -			continue;
> -
> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
> -						 val, !(mask32 & val),
> -						 100, timeout_us);
> -		if (ret) {
> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
> -				blk_name, mask);
> -			return ret;
> -		}
> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
> +					      100, timeout_us);
> +	if (ret) {
> +		drm_err(&ptdev->base,
> +			"timeout waiting on %s:%llx power transition", blk_name,
> +			mask);
> +		return ret;
>  	}
>  
> -	if (mask & GENMASK(31, 0))
> -		gpu_write(ptdev, pwroff_reg, mask);
> -
> -	if (mask >> 32)
> -		gpu_write(ptdev, pwroff_reg + 4, mask >> 32);
> -
> -	for (i = 0; i < 2; i++) {
> -		u32 mask32 = mask >> (i * 32);
> +	gpu_write64(ptdev, pwroff_reg, mask);
>  
> -		if (!mask32)
> -			continue;
> -
> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
> -						 val, !(mask32 & val),
> -						 100, timeout_us);
> -		if (ret) {
> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
> -				blk_name, mask);
> -			return ret;
> -		}
> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
> +					      100, timeout_us);
> +	if (ret) {
> +		drm_err(&ptdev->base,
> +			"timeout waiting on %s:%llx power transition", blk_name,
> +			mask);
> +		return ret;
>  	}
>  
>  	return 0;
> @@ -305,45 +281,26 @@ int panthor_gpu_block_power_on(struct panthor_device *ptdev,
>  			       u32 pwron_reg, u32 pwrtrans_reg,
>  			       u32 rdy_reg, u64 mask, u32 timeout_us)
>  {
> -	u32 val, i;
> +	u32 val;
>  	int ret;
>  
> -	for (i = 0; i < 2; i++) {
> -		u32 mask32 = mask >> (i * 32);
> -
> -		if (!mask32)
> -			continue;
> -
> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
> -						 val, !(mask32 & val),
> -						 100, timeout_us);
> -		if (ret) {
> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
> -				blk_name, mask);
> -			return ret;
> -		}
> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
> +					      100, timeout_us);
> +	if (ret) {
> +		drm_err(&ptdev->base,
> +			"timeout waiting on %s:%llx power transition", blk_name,
> +			mask);
> +		return ret;
>  	}
>  
> -	if (mask & GENMASK(31, 0))
> -		gpu_write(ptdev, pwron_reg, mask);
> -
> -	if (mask >> 32)
> -		gpu_write(ptdev, pwron_reg + 4, mask >> 32);
> -
> -	for (i = 0; i < 2; i++) {
> -		u32 mask32 = mask >> (i * 32);
> +	gpu_write64(ptdev, pwron_reg, mask);
>  
> -		if (!mask32)
> -			continue;
> -
> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + rdy_reg + (i * 4),
> -						 val, (mask32 & val) == mask32,
> -						 100, timeout_us);
> -		if (ret) {
> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
> -				blk_name, mask);
> -			return ret;
> -		}
> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
> +					      100, timeout_us);
> +	if (ret) {
> +		drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
> +			blk_name, mask);
> +		return ret;
>  	}
>  
>  	return 0;
> @@ -492,26 +449,6 @@ void panthor_gpu_resume(struct panthor_device *ptdev)
>  	panthor_gpu_l2_power_on(ptdev);
>  }
>  
> -/**
> - * panthor_gpu_read_64bit_counter() - Read a 64-bit counter at a given offset.
> - * @ptdev: Device.
> - * @reg: The offset of the register to read.
> - *
> - * Return: The counter value.
> - */
> -static u64
> -panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
> -{
> -	u32 hi, lo;
> -
> -	do {
> -		hi = gpu_read(ptdev, reg + 0x4);
> -		lo = gpu_read(ptdev, reg);
> -	} while (hi != gpu_read(ptdev, reg + 0x4));
> -
> -	return ((u64)hi << 32) | lo;
> -}
> -
>  /**
>   * panthor_gpu_read_timestamp() - Read the timestamp register.
>   * @ptdev: Device.
> @@ -520,7 +457,7 @@ panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
>   */
>  u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
>  {
> -	return panthor_gpu_read_64bit_counter(ptdev, GPU_TIMESTAMP_LO);
> +	return gpu_read64_sync(ptdev, GPU_TIMESTAMP_LO);
>  }
>  
>  /**
> @@ -531,10 +468,5 @@ u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
>   */
>  u64 panthor_gpu_read_timestamp_offset(struct panthor_device *ptdev)
>  {
> -	u32 hi, lo;
> -
> -	hi = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_HI);
> -	lo = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_LO);
> -
> -	return ((u64)hi << 32) | lo;
> +	return gpu_read64(ptdev, GPU_TIMESTAMP_OFFSET_LO);
>  }
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 12a02e28f50f..a0a79f19bdea 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -510,9 +510,9 @@ static int wait_ready(struct panthor_device *ptdev, u32 as_nr)
>  	/* Wait for the MMU status to indicate there is no active command, in
>  	 * case one is pending.
>  	 */
> -	ret = readl_relaxed_poll_timeout_atomic(ptdev->iomem + AS_STATUS(as_nr),
> -						val, !(val & AS_STATUS_AS_ACTIVE),
> -						10, 100000);
> +	ret = gpu_read_relaxed_poll_timeout_atomic(ptdev, AS_STATUS(as_nr), val,
> +						   !(val & AS_STATUS_AS_ACTIVE),
> +						   10, 100000);
>  
>  	if (ret) {
>  		panthor_device_schedule_reset(ptdev);
> @@ -564,8 +564,7 @@ static void lock_region(struct panthor_device *ptdev, u32 as_nr,
>  	region = region_width | region_start;
>  
>  	/* Lock the region that needs to be updated */
> -	gpu_write(ptdev, AS_LOCKADDR_LO(as_nr), lower_32_bits(region));
> -	gpu_write(ptdev, AS_LOCKADDR_HI(as_nr), upper_32_bits(region));
> +	gpu_write64(ptdev, AS_LOCKADDR_LO(as_nr), region);
>  	write_cmd(ptdev, as_nr, AS_COMMAND_LOCK);
>  }
>  
> @@ -615,14 +614,9 @@ static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
>  	if (ret)
>  		return ret;
>  
> -	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
> -	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
> -
> -	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
> -	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
> -
> -	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
> -	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
> +	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), transtab);
> +	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), memattr);
> +	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), transcfg);
>  
>  	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
>  }
> @@ -635,14 +629,9 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr)
>  	if (ret)
>  		return ret;
>  
> -	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), 0);
> -	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), 0);
> -
> -	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), 0);
> -	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), 0);
> -
> -	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
> -	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), 0);
> +	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), 0);
> +	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), 0);
> +	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
>  
>  	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
>  }
> @@ -1680,8 +1669,7 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
>  		u32 source_id;
>  
>  		fault_status = gpu_read(ptdev, AS_FAULTSTATUS(as));
> -		addr = gpu_read(ptdev, AS_FAULTADDRESS_LO(as));
> -		addr |= (u64)gpu_read(ptdev, AS_FAULTADDRESS_HI(as)) << 32;
> +		addr = gpu_read64(ptdev, AS_FAULTADDRESS_LO(as));
>  
>  		/* decode the fault status */
>  		exception_type = fault_status & 0xFF;


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible
  2025-03-20 11:17 ` [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible Karunika Choo
@ 2025-03-21  8:02   ` Boris Brezillon
  2025-04-10 13:20     ` Karunika Choo
  0 siblings, 1 reply; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:02 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:37 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> This patch replaces the previous panthor_model structure with a simple
> switch case based on the product_id, which is in the format of:
>         ((arch_major << 24) | product_major)
> 
> This not only simplifies the comparison, but also allows extending the
> function to accommodate naming differences based on GPU features.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_hw.c   | 63 +++++++-------------------
>  drivers/gpu/drm/panthor/panthor_regs.h |  1 +
>  2 files changed, 18 insertions(+), 46 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> index 4cc4b0d5382c..12183c04cd21 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.c
> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> @@ -5,40 +5,6 @@
>  #include "panthor_hw.h"
>  #include "panthor_regs.h"
>  
> -/**
> - * struct panthor_model - GPU model description
> - */
> -struct panthor_model {
> -	/** @name: Model name. */
> -	const char *name;
> -
> -	/** @arch_major: Major version number of architecture. */
> -	u8 arch_major;
> -
> -	/** @product_major: Major version number of product. */
> -	u8 product_major;
> -};
> -
> -/**
> - * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
> - * by a combination of the major architecture version and the major product
> - * version.
> - * @_name: Name for the GPU model.
> - * @_arch_major: Architecture major.
> - * @_product_major: Product major.
> - */
> -#define GPU_MODEL(_name, _arch_major, _product_major) \
> -{\
> -	.name = __stringify(_name),				\
> -	.arch_major = _arch_major,				\
> -	.product_major = _product_major,			\
> -}
> -
> -static const struct panthor_model gpu_models[] = {
> -	GPU_MODEL(g610, 10, 7),
> -	{},
> -};
> -
>  static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
>  {
>  	unsigned int i;
> @@ -66,29 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
>  	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
>  }
>  
> +static char *get_gpu_model_name(struct panthor_device *ptdev)
> +{
> +	const u32 gpu_id = ptdev->gpu_info.gpu_id;
> +	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
> +						GPU_PROD_MAJOR(gpu_id));
> +
> +	switch (product_id) {
> +	case GPU_PROD_ID_MAKE(10, 7):
> +		return "Mali-G610";
> +	}

I a big fan of these ever growing switch statements with nested
conditionals. Could we instead add an optional ::get_variant() callback
in panthor_model and have the following formatting:

	"Mali-%s%s%s", model->name,
		       model->get_variant ? "-" : "",
		       model->get_variant ? model->get_variant() : ""

> +
> +	return "(Unknown Mali GPU)";
> +}
> +
>  static void panthor_gpu_init_info(struct panthor_device *ptdev)
>  {
> -	const struct panthor_model *model;
> -	u32 arch_major, product_major;
> +	const char *gpu_model_name = get_gpu_model_name(ptdev);
>  	u32 major, minor, status;
>  
>  	ptdev->hw->ops.gpu_info_init(ptdev);
>  
> -	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
> -	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
>  	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
>  	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
>  	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
>  
> -	for (model = gpu_models; model->name; model++) {
> -		if (model->arch_major == arch_major &&
> -		    model->product_major == product_major)
> -			break;
> -	}
> -
>  	drm_info(&ptdev->base,
> -		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
> -		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
> +		 "%s id 0x%x major 0x%x minor 0x%x status 0x%x",
> +		 gpu_model_name, ptdev->gpu_info.gpu_id >> 16,
>  		 major, minor, status);
>  
>  	drm_info(&ptdev->base,
> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
> index ba452c1dd644..d9e0769d6f1a 100644
> --- a/drivers/gpu/drm/panthor/panthor_regs.h
> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
> @@ -20,6 +20,7 @@
>  #define   GPU_VER_STATUS(x)				((x) & GENMASK(3, 0))
>  
>  #define GPU_ARCH_ID_MAKE(major, minor, rev)		(((major) << 16) | ((minor) << 8) | (rev))
> +#define GPU_PROD_ID_MAKE(arch_major, prod_major)	(((arch_major) << 24) | (prod_major))
>  
>  #define GPU_L2_FEATURES					0x4
>  #define  GPU_L2_FEATURES_LINE_SIZE(x)			(1 << ((x) & GENMASK(7, 0)))


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c
  2025-03-20 11:17 ` [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c Karunika Choo
@ 2025-03-21  8:16   ` Boris Brezillon
  2025-03-21  8:43     ` Boris Brezillon
  0 siblings, 1 reply; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:16 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:36 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> This patch moves GPU info initialization into panthor_hw.c in
> preparation of handling GPU register changes. The GPU register reading
> operations to populate gpu_info are separated into an architecture
> specific arch_*_gpu_info_init() function and is called via the new
> function pointer abstraction under hw.ops.gpu_info_init().
> 
> Future GPU support will be performed by implementing a *_gpu_info_init()
> function specific to that architecture version. It can call any existing
> *_gpu_info_init() and extend it with additional register reads or
> provide an entirely different implementation.

Could you give us an insight into what the reg layout changes are? So
far, they were mostly unchanged between GPU gens, and I'd really
prefer we could keep the majority of them unchanged part of the commo 
discovery, and only add the missing reads in the ->gpu_info_init()
callback.

Note that I'm also working on abstracting mali device operations to add
JM support to panthor, and the only things I had to specialize are:

- CSF ID for CSF
- JS features/present masks for JM

The rest is just common. So what I have is a common gpu_init_info()
helper that reads all the regs excepts those two, and after that, I
have a device ops selection based on the arch major of the GPU ID [1].
The device-specific GPU info are then read as part of the
panthor_device_ops::init().

> 
> This patch will enable Panthor to support GPUs with changes to register
> offsets, size and fields.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_gpu.c |  95 -----------------------
>  drivers/gpu/drm/panthor/panthor_hw.c  | 105 ++++++++++++++++++++++++++
>  drivers/gpu/drm/panthor/panthor_hw.h  |   3 +-
>  3 files changed, 107 insertions(+), 96 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 0dee011fe2e9..fcdee8901482 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -37,40 +37,6 @@ struct panthor_gpu {
>  	wait_queue_head_t reqs_acked;
>  };
>  
> -/**
> - * struct panthor_model - GPU model description
> - */
> -struct panthor_model {
> -	/** @name: Model name. */
> -	const char *name;
> -
> -	/** @arch_major: Major version number of architecture. */
> -	u8 arch_major;
> -
> -	/** @product_major: Major version number of product. */
> -	u8 product_major;
> -};
> -
> -/**
> - * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
> - * by a combination of the major architecture version and the major product
> - * version.
> - * @_name: Name for the GPU model.
> - * @_arch_major: Architecture major.
> - * @_product_major: Product major.
> - */
> -#define GPU_MODEL(_name, _arch_major, _product_major) \
> -{\
> -	.name = __stringify(_name),				\
> -	.arch_major = _arch_major,				\
> -	.product_major = _product_major,			\
> -}
> -
> -static const struct panthor_model gpu_models[] = {
> -	GPU_MODEL(g610, 10, 7),
> -	{},
> -};
> -
>  #define GPU_INTERRUPTS_MASK	\
>  	(GPU_IRQ_FAULT | \
>  	 GPU_IRQ_PROTM_FAULT | \
> @@ -83,66 +49,6 @@ static void panthor_gpu_coherency_set(struct panthor_device *ptdev)
>  		ptdev->coherent ? GPU_COHERENCY_PROT_BIT(ACE_LITE) : GPU_COHERENCY_NONE);
>  }
>  
> -static void panthor_gpu_init_info(struct panthor_device *ptdev)
> -{
> -	const struct panthor_model *model;
> -	u32 arch_major, product_major;
> -	u32 major, minor, status;
> -	unsigned int i;
> -
> -	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
> -	ptdev->gpu_info.csf_id = gpu_read(ptdev, GPU_CSF_ID);
> -	ptdev->gpu_info.gpu_rev = gpu_read(ptdev, GPU_REVID);
> -	ptdev->gpu_info.core_features = gpu_read(ptdev, GPU_CORE_FEATURES);
> -	ptdev->gpu_info.l2_features = gpu_read(ptdev, GPU_L2_FEATURES);
> -	ptdev->gpu_info.tiler_features = gpu_read(ptdev, GPU_TILER_FEATURES);
> -	ptdev->gpu_info.mem_features = gpu_read(ptdev, GPU_MEM_FEATURES);
> -	ptdev->gpu_info.mmu_features = gpu_read(ptdev, GPU_MMU_FEATURES);
> -	ptdev->gpu_info.thread_features = gpu_read(ptdev, GPU_THREAD_FEATURES);
> -	ptdev->gpu_info.max_threads = gpu_read(ptdev, GPU_THREAD_MAX_THREADS);
> -	ptdev->gpu_info.thread_max_workgroup_size = gpu_read(ptdev, GPU_THREAD_MAX_WORKGROUP_SIZE);
> -	ptdev->gpu_info.thread_max_barrier_size = gpu_read(ptdev, GPU_THREAD_MAX_BARRIER_SIZE);
> -	ptdev->gpu_info.coherency_features = gpu_read(ptdev, GPU_COHERENCY_FEATURES);
> -	for (i = 0; i < 4; i++)
> -		ptdev->gpu_info.texture_features[i] = gpu_read(ptdev, GPU_TEXTURE_FEATURES(i));
> -
> -	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
> -
> -	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
> -	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
> -	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
> -
> -	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
> -	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
> -	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
> -	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
> -	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
> -
> -	for (model = gpu_models; model->name; model++) {
> -		if (model->arch_major == arch_major &&
> -		    model->product_major == product_major)
> -			break;
> -	}
> -
> -	drm_info(&ptdev->base,
> -		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
> -		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
> -		 major, minor, status);
> -
> -	drm_info(&ptdev->base,
> -		 "Features: L2:%#x Tiler:%#x Mem:%#x MMU:%#x AS:%#x",
> -		 ptdev->gpu_info.l2_features,
> -		 ptdev->gpu_info.tiler_features,
> -		 ptdev->gpu_info.mem_features,
> -		 ptdev->gpu_info.mmu_features,
> -		 ptdev->gpu_info.as_present);
> -
> -	drm_info(&ptdev->base,
> -		 "shader_present=0x%0llx l2_present=0x%0llx tiler_present=0x%0llx",
> -		 ptdev->gpu_info.shader_present, ptdev->gpu_info.l2_present,
> -		 ptdev->gpu_info.tiler_present);
> -}
> -
>  static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
>  {
>  	if (status & GPU_IRQ_FAULT) {
> @@ -203,7 +109,6 @@ int panthor_gpu_init(struct panthor_device *ptdev)
>  	spin_lock_init(&gpu->reqs_lock);
>  	init_waitqueue_head(&gpu->reqs_acked);
>  	ptdev->gpu = gpu;
> -	panthor_gpu_init_info(ptdev);
>  
>  	dma_set_max_seg_size(ptdev->base.dev, UINT_MAX);
>  	pa_bits = GPU_MMU_FEATURES_PA_BITS(ptdev->gpu_info.mmu_features);
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> index 234bfd50cf0d..4cc4b0d5382c 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.c
> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> @@ -5,10 +5,113 @@
>  #include "panthor_hw.h"
>  #include "panthor_regs.h"
>  
> +/**
> + * struct panthor_model - GPU model description
> + */
> +struct panthor_model {
> +	/** @name: Model name. */
> +	const char *name;
> +
> +	/** @arch_major: Major version number of architecture. */
> +	u8 arch_major;
> +
> +	/** @product_major: Major version number of product. */
> +	u8 product_major;
> +};
> +
> +/**
> + * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
> + * by a combination of the major architecture version and the major product
> + * version.
> + * @_name: Name for the GPU model.
> + * @_arch_major: Architecture major.
> + * @_product_major: Product major.
> + */
> +#define GPU_MODEL(_name, _arch_major, _product_major) \
> +{\
> +	.name = __stringify(_name),				\
> +	.arch_major = _arch_major,				\
> +	.product_major = _product_major,			\
> +}
> +
> +static const struct panthor_model gpu_models[] = {
> +	GPU_MODEL(g610, 10, 7),
> +	{},
> +};
> +
> +static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
> +{
> +	unsigned int i;
> +
> +	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
> +	ptdev->gpu_info.csf_id = gpu_read(ptdev, GPU_CSF_ID);
> +	ptdev->gpu_info.gpu_rev = gpu_read(ptdev, GPU_REVID);
> +	ptdev->gpu_info.core_features = gpu_read(ptdev, GPU_CORE_FEATURES);
> +	ptdev->gpu_info.l2_features = gpu_read(ptdev, GPU_L2_FEATURES);
> +	ptdev->gpu_info.tiler_features = gpu_read(ptdev, GPU_TILER_FEATURES);
> +	ptdev->gpu_info.mem_features = gpu_read(ptdev, GPU_MEM_FEATURES);
> +	ptdev->gpu_info.mmu_features = gpu_read(ptdev, GPU_MMU_FEATURES);
> +	ptdev->gpu_info.thread_features = gpu_read(ptdev, GPU_THREAD_FEATURES);
> +	ptdev->gpu_info.max_threads = gpu_read(ptdev, GPU_THREAD_MAX_THREADS);
> +	ptdev->gpu_info.thread_max_workgroup_size = gpu_read(ptdev, GPU_THREAD_MAX_WORKGROUP_SIZE);
> +	ptdev->gpu_info.thread_max_barrier_size = gpu_read(ptdev, GPU_THREAD_MAX_BARRIER_SIZE);
> +	ptdev->gpu_info.coherency_features = gpu_read(ptdev, GPU_COHERENCY_FEATURES);
> +	for (i = 0; i < 4; i++)
> +		ptdev->gpu_info.texture_features[i] = gpu_read(ptdev, GPU_TEXTURE_FEATURES(i));
> +
> +	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
> +
> +	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
> +	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
> +	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
> +}
> +
> +static void panthor_gpu_init_info(struct panthor_device *ptdev)
> +{
> +	const struct panthor_model *model;
> +	u32 arch_major, product_major;
> +	u32 major, minor, status;
> +
> +	ptdev->hw->ops.gpu_info_init(ptdev);
> +
> +	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
> +	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
> +	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
> +	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
> +	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
> +
> +	for (model = gpu_models; model->name; model++) {
> +		if (model->arch_major == arch_major &&
> +		    model->product_major == product_major)
> +			break;
> +	}
> +
> +	drm_info(&ptdev->base,
> +		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
> +		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
> +		 major, minor, status);
> +
> +	drm_info(&ptdev->base,
> +		 "Features: L2:%#x Tiler:%#x Mem:%#x MMU:%#x AS:%#x",
> +		 ptdev->gpu_info.l2_features,
> +		 ptdev->gpu_info.tiler_features,
> +		 ptdev->gpu_info.mem_features,
> +		 ptdev->gpu_info.mmu_features,
> +		 ptdev->gpu_info.as_present);
> +
> +	drm_info(&ptdev->base,
> +		 "shader_present=0x%0llx l2_present=0x%0llx tiler_present=0x%0llx",
> +		 ptdev->gpu_info.shader_present, ptdev->gpu_info.l2_present,
> +		 ptdev->gpu_info.tiler_present);
> +}
> +
>  static struct panthor_hw panthor_hw_devices[] = {
>  	{
>  		.arch_id = GPU_ARCH_ID_MAKE(10, 0, 0),
>  		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0, 0),
> +		.ops = {
> +			.gpu_info_init = arch_10_8_gpu_info_init,
> +		},
>  	},
>  };
>  
> @@ -59,6 +162,8 @@ int panthor_hw_init(struct panthor_device *ptdev)
>  
>  	ptdev->hw = hdev;
>  
> +	panthor_gpu_init_info(ptdev);
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
> index 5eb0549ad333..dfe0f86c5d76 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.h
> +++ b/drivers/gpu/drm/panthor/panthor_hw.h
> @@ -31,7 +31,8 @@ struct panthor_hw_regmap {
>   * struct panthor_hw_ops - HW operations that are specific to a GPU
>   */
>  struct panthor_hw_ops {
> -
> +	/** @gpu_info_init: Function pointer to initialize GPU info. */
> +	void (*gpu_info_init)(struct panthor_device *ptdev);
>  };
>  
>  /**


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework
  2025-03-20 11:17 ` [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework Karunika Choo
@ 2025-03-21  8:28   ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:28 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:35 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> This patch aims to lay the foundation to provide support for multiple
> Mali GPUs through a framework by which differences in registers,
> functionality, and features can be managed.
> 
> It introduces the concept of the arch_id which is a 32-bit ID in the
> format of ((arch_major << 16) | (arch_minor << 8) | arch_rev). The 8-bit
> fields of the arch_id provides future proofing past the 4-bit fields of
> the GPU_ID's arch_major, arch_minor, and arch_rev.
> 
> The arch_id is used to select the correct abstraction for the GPU, such
> as function pointers for operations specific to the GPU, base addresses
> describing changes in register offsets, and supported features.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/Makefile         |  1 +
>  drivers/gpu/drm/panthor/panthor_device.c |  5 ++
>  drivers/gpu/drm/panthor/panthor_device.h |  3 +
>  drivers/gpu/drm/panthor/panthor_hw.c     | 70 ++++++++++++++++++++++++
>  drivers/gpu/drm/panthor/panthor_hw.h     | 63 +++++++++++++++++++++
>  drivers/gpu/drm/panthor/panthor_regs.h   |  2 +
>  6 files changed, 144 insertions(+)
>  create mode 100644 drivers/gpu/drm/panthor/panthor_hw.c
>  create mode 100644 drivers/gpu/drm/panthor/panthor_hw.h
> 
> diff --git a/drivers/gpu/drm/panthor/Makefile b/drivers/gpu/drm/panthor/Makefile
> index 15294719b09c..02db21748c12 100644
> --- a/drivers/gpu/drm/panthor/Makefile
> +++ b/drivers/gpu/drm/panthor/Makefile
> @@ -8,6 +8,7 @@ panthor-y := \
>  	panthor_gem.o \
>  	panthor_gpu.o \
>  	panthor_heap.o \
> +	panthor_hw.o \
>  	panthor_mmu.o \
>  	panthor_sched.o
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index a9da1d1eeb70..a6fca6b3fabd 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -18,6 +18,7 @@
>  #include "panthor_device.h"
>  #include "panthor_fw.h"
>  #include "panthor_gpu.h"
> +#include "panthor_hw.h"
>  #include "panthor_mmu.h"
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
> @@ -243,6 +244,10 @@ int panthor_device_init(struct panthor_device *ptdev)
>  			return ret;
>  	}
>  
> +	ret = panthor_hw_init(ptdev);
> +	if (ret)
> +		goto err_rpm_put;
> +
>  	ret = panthor_gpu_init(ptdev);
>  	if (ret)
>  		goto err_rpm_put;
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index da6574021664..82741bf1a49b 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -120,6 +120,9 @@ struct panthor_device {
>  	/** @csif_info: Command stream interface information. */
>  	struct drm_panthor_csif_info csif_info;
>  
> +	/** @hw: GPU specific data. */
> +	struct panthor_hw *hw;
> +
>  	/** @gpu: GPU management data. */
>  	struct panthor_gpu *gpu;
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> new file mode 100644
> index 000000000000..234bfd50cf0d
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> @@ -0,0 +1,70 @@
> +// SPDX-License-Identifier: GPL-2.0 or MIT
> +/* Copyright 2025 ARM Limited. All rights reserved. */
> +
> +#include "panthor_device.h"
> +#include "panthor_hw.h"
> +#include "panthor_regs.h"
> +
> +static struct panthor_hw panthor_hw_devices[] = {
> +	{
> +		.arch_id = GPU_ARCH_ID_MAKE(10, 0, 0),
> +		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0, 0),
> +	},
> +};
> +
> +static int init_gpu_id(struct panthor_device *ptdev)
> +{
> +	ptdev->gpu_info.gpu_id = gpu_read(ptdev, GPU_ID);
> +
> +	if (!ptdev->gpu_info.gpu_id) {
> +		drm_err(&ptdev->base, "Invalid GPU ID (0x0)");
> +		return -ENXIO;
> +	}
> +
> +	return 0;
> +}
> +
> +int panthor_hw_init(struct panthor_device *ptdev)
> +{
> +	struct panthor_hw *hdev = NULL;
> +	u32 arch_id = 0;
> +	int i, ret;
> +
> +	ret = init_gpu_id(ptdev);
> +	if (ret)
> +		return ret;
> +
> +	arch_id = GPU_ARCH_ID_MAKE(GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id),
> +				   GPU_ARCH_MINOR(ptdev->gpu_info.gpu_id),
> +				   GPU_ARCH_REV(ptdev->gpu_info.gpu_id));
> +	if (!arch_id) {
> +		drm_err(&ptdev->base, "Invalid arch_id (0x0)");
> +		return -ENXIO;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(panthor_hw_devices); i++) {
> +		u32 mask = panthor_hw_devices[i].arch_mask;
> +		u32 hw_arch_id = panthor_hw_devices[i].arch_id;
> +
> +		if ((arch_id & mask) == (hw_arch_id & mask)) {
> +			hdev = &panthor_hw_devices[i];
> +			break;
> +		}
> +	}
> +
> +	if (!hdev) {
> +		drm_err(&ptdev->base, "Unsupported GPU (arch 0x%x)", arch_id);
> +		return -ENODEV;
> +	}
> +
> +	ptdev->hw = hdev;
> +
> +	return 0;
> +}
> +
> +bool panthor_hw_supports(struct panthor_device *ptdev,
> +			 enum panthor_hw_feature feature)
> +{
> +	return test_bit(feature, ptdev->hw->features);
> +}
> +
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
> new file mode 100644
> index 000000000000..5eb0549ad333
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_hw.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0 or MIT */
> +/* Copyright 2025 ARM Limited. All rights reserved. */
> +
> +#ifndef __PANTHOR_HW_H__
> +#define __PANTHOR_HW_H__
> +
> +#include <linux/types.h>
> +#include <linux/bitmap.h>
> +
> +struct panthor_device;
> +
> +/**
> + * enum panthor_hw_feature - Bit position of each HW feature
> + *
> + * Used to define GPU specific features based on the GPU architecture ID.
> + * New feature flags will be added with support for newer GPU architectures.
> + */
> +enum panthor_hw_feature {
> +	/** @PANTHOR_HW_FEATURES_END: Must be last. */
> +	PANTHOR_HW_FEATURES_END
> +};
> +
> +/**
> + * struct panthor_hw_regmap - Register offsets for specific register blocks
> + */
> +struct panthor_hw_regmap {
> +
> +};
> +
> +/**
> + * struct panthor_hw_ops - HW operations that are specific to a GPU
> + */
> +struct panthor_hw_ops {
> +
> +};
> +
> +/**
> + * struct panthor_hw - GPU specific register mapping and functions
> + */
> +struct panthor_hw {
> +	/** @arch_id: Architecture id to match against */
> +	u32 arch_id;
> +
> +	/** @arch_mask: Mask for architecture id comparison */
> +	u32 arch_mask;
> +
> +	/** @features: Bitmap containing panthor_hw_feature */
> +	DECLARE_BITMAP(features, PANTHOR_HW_FEATURES_END);
> +
> +	/** @map: Panthor regmap */
> +	struct panthor_hw_regmap map;
> +
> +	/** @ops: Panthor HW specific operations */
> +	struct panthor_hw_ops ops;

Do we really need per minor arch specialization if we already have per
GPU information through panthor_model?

The way I see it, we can have a device operation specialization per
major arch, then some tweaking done in the arch major init callback
based on the minor version. And the final tweaks applied per GPU model.

> +};
> +
> +int panthor_hw_init(struct panthor_device *ptdev);
> +
> +bool panthor_hw_supports(struct panthor_device *ptdev,
> +			 enum panthor_hw_feature feature);
> +
> +#endif /* __PANTHOR_HW_H__ */
> +
> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
> index 7ec4a1d04e20..ba452c1dd644 100644
> --- a/drivers/gpu/drm/panthor/panthor_regs.h
> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
> @@ -19,6 +19,8 @@
>  #define   GPU_VER_MINOR(x)				(((x) & GENMASK(11, 4)) >> 4)
>  #define   GPU_VER_STATUS(x)				((x) & GENMASK(3, 0))
>  
> +#define GPU_ARCH_ID_MAKE(major, minor, rev)		(((major) << 16) | ((minor) << 8) | (rev))
> +
>  #define GPU_L2_FEATURES					0x4
>  #define  GPU_L2_FEATURES_LINE_SIZE(x)			(1 << ((x) & GENMASK(7, 0)))
>  


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs
  2025-03-20 11:17 ` [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs Karunika Choo
@ 2025-03-21  8:34   ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:34 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:38 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> Mali-G715 introduces a new GPU_FEATURES register that provides
> information about GPU-wide supported features. The register value will
> be passed on to userspace via gpu_info. It also adds the following
> registers that are specific to the kernel driver only:
> - ASN_HASH_0~2
> - DOORBELL_FEATURES
> - PRFCNT_FEATURES
> - SYSC_ALLOC0~7
> - SYSC_PBHA_OVERRIDE0~3
> 
> Additionally, Mali-G715 presents an 'Immortalis' naming variant
> depending on the shader core count and presence of Ray Intersection
> feature support.
> 
> This patch adds:
> - support for correctly identifying the model names for the Mali-G715
>   family of GPUs.
> - arch 11.8 FW binary support
> - reading and handling of GPU_FEATURES register
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_fw.c   |  1 +
>  drivers/gpu/drm/panthor/panthor_hw.c   | 26 ++++++++++++++++++++++++++
>  drivers/gpu/drm/panthor/panthor_regs.h | 12 ++++++++++++
>  include/uapi/drm/panthor_drm.h         |  3 +++
>  4 files changed, 42 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index ecfbe0456f89..0b3fab95f26b 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1398,3 +1398,4 @@ int panthor_fw_init(struct panthor_device *ptdev)
>  }
>  
>  MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin");
> +MODULE_FIRMWARE("arm/mali/arch11.8/mali_csffw.bin");
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> index 12183c04cd21..d04c8723ac98 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.c
> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> @@ -32,15 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
>  	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
>  }
>  
> +static void arch_11_8_gpu_info_init(struct panthor_device *ptdev)
> +{
> +	arch_10_8_gpu_info_init(ptdev);
> +
> +	ptdev->gpu_info.gpu_features = gpu_read64(ptdev, GPU_FEATURES_LO);

That's typically the sort of specialization I would have done directly
in some csf_gpu_info_init() with a:

	if (ARCH_MAJOR(gpu_id) > 11) {
		ptdev->gpu_info.gpu_features =
			gpu_read64(ptdev, GPU_FEATURES_LO);
	}

I do see a benefit in abstracting things away when the layout is
completely different or when registers are conflicting, but that's not
the case AFAICT. I think for this kind of tweaks, we'd rather stick to a
single function with a few conditionals.

Actually, if the registers were readable and returning 0 on v10, you
don't even need the if (ARCH_MAJOR(gpu_id) > 11).

> +}
> +
>  static char *get_gpu_model_name(struct panthor_device *ptdev)
>  {
>  	const u32 gpu_id = ptdev->gpu_info.gpu_id;
>  	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
>  						GPU_PROD_MAJOR(gpu_id));
> +	const bool ray_intersection = !!(ptdev->gpu_info.gpu_features &
> +					 GPU_FEATURES_RAY_INTERSECTION);
> +	const u8 shader_core_count = hweight64(ptdev->gpu_info.shader_present);
>  
>  	switch (product_id) {
>  	case GPU_PROD_ID_MAKE(10, 7):
>  		return "Mali-G610";
> +	case GPU_PROD_ID_MAKE(11, 2):
> +		if (shader_core_count > 10 && ray_intersection)
> +			return "Mali-G715-Immortalis";
> +		else if (shader_core_count >= 7)
> +			return "Mali-G715";
> +
> +		fallthrough;
> +	case GPU_PROD_ID_MAKE(11, 3):
> +		return "Mali-G615";
>  	}
>  
>  	return "(Unknown Mali GPU)";
> @@ -84,6 +103,13 @@ static struct panthor_hw panthor_hw_devices[] = {
>  			.gpu_info_init = arch_10_8_gpu_info_init,
>  		},
>  	},
> +	{
> +		.arch_id = GPU_ARCH_ID_MAKE(11, 8, 0),
> +		.arch_mask = GPU_ARCH_ID_MAKE(0xFF, 0xFF, 0),
> +		.ops = {
> +			.gpu_info_init = arch_11_8_gpu_info_init,
> +		}
> +	},
>  };
>  
>  static int init_gpu_id(struct panthor_device *ptdev)
> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
> index d9e0769d6f1a..7bc2d838e704 100644
> --- a/drivers/gpu/drm/panthor/panthor_regs.h
> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
> @@ -74,6 +74,11 @@
>  #define GPU_PWR_OVERRIDE0				0x54
>  #define GPU_PWR_OVERRIDE1				0x58
>  
> +#define GPU_FEATURES_LO					0x60
> +#define GPU_FEATURES_HI					0x64
> +#define   GPU_FEATURES_RAY_INTERSECTION			BIT(2)
> +#define GPU_PRFCNT_FEATURES				0x68
> +
>  #define GPU_TIMESTAMP_OFFSET_LO				0x88
>  #define GPU_TIMESTAMP_OFFSET_HI				0x8C
>  #define GPU_CYCLE_COUNT_LO				0x90
> @@ -88,6 +93,8 @@
>  
>  #define GPU_TEXTURE_FEATURES(n)				(0xB0 + ((n) * 4))
>  
> +#define GPU_DOORBELL_FEATURES				0xC0
> +
>  #define GPU_SHADER_PRESENT_LO				0x100
>  #define GPU_SHADER_PRESENT_HI				0x104
>  #define GPU_TILER_PRESENT_LO				0x110
> @@ -132,6 +139,8 @@
>  
>  #define GPU_REVID					0x280
>  
> +#define GPU_ASN_HASH(n)					(0x2C0 + ((n) * 4))
> +
>  #define GPU_COHERENCY_FEATURES				0x300
>  #define GPU_COHERENCY_PROT_BIT(name)			BIT(GPU_COHERENCY_  ## name)
>  
> @@ -140,6 +149,9 @@
>  #define   GPU_COHERENCY_ACE_LITE			1
>  #define   GPU_COHERENCY_NONE				31
>  
> +#define GPU_SYSC_PBHA_OVERRIDE(n)			(0x320 + ((n) * 4))
> +#define GPU_SYSC_ALLOC(n)				(0x340 + ((n) * 4))
> +
>  #define MCU_CONTROL					0x700
>  #define MCU_CONTROL_ENABLE				1
>  #define MCU_CONTROL_AUTO				2
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 97e2c4510e69..4aba8146af3b 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -307,6 +307,9 @@ struct drm_panthor_gpu_info {
>  
>  	/** @pad: MBZ. */
>  	__u32 pad;
> +
> +	/** @gpu_features: Bitmask describing supported GPU-wide features */
> +	__u64 gpu_features;
>  };
>  
>  /**


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit
  2025-03-20 11:17 ` [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit Karunika Choo
@ 2025-03-21  8:41   ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:41 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 20 Mar 2025 11:17:39 +0000
Karunika Choo <karunika.choo@arm.com> wrote:

> As the FLUSH_MEM and FLUSH_PT commands are deprecated in GPUs from
> Mali-G720 onwards, this patch adds support for performing cache
> maintenance via the FLUSH_CACHES command in GPU_CONTROL, in place of
> FLUSH_MEM and FLUSH_PT based on PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH
> feature bit.
> 
> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_hw.h  |  6 +++++
>  drivers/gpu/drm/panthor/panthor_mmu.c | 35 +++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_hw.h b/drivers/gpu/drm/panthor/panthor_hw.h
> index dfe0f86c5d76..4d67fdfe86f9 100644
> --- a/drivers/gpu/drm/panthor/panthor_hw.h
> +++ b/drivers/gpu/drm/panthor/panthor_hw.h
> @@ -16,6 +16,12 @@ struct panthor_device;
>   * New feature flags will be added with support for newer GPU architectures.
>   */
>  enum panthor_hw_feature {
> +	/**
> +	 * @PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH: Perform cache maintenance
> +	 * via GPU_CONTROL.
> +	 */
> +	PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH,
> +
>  	/** @PANTHOR_HW_FEATURES_END: Must be last. */
>  	PANTHOR_HW_FEATURES_END
>  };
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index a0a79f19bdea..4ac8bf36177e 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -29,7 +29,9 @@
>  
>  #include "panthor_device.h"
>  #include "panthor_gem.h"
> +#include "panthor_gpu.h"
>  #include "panthor_heap.h"
> +#include "panthor_hw.h"
>  #include "panthor_mmu.h"
>  #include "panthor_regs.h"
>  #include "panthor_sched.h"
> @@ -568,6 +570,35 @@ static void lock_region(struct panthor_device *ptdev, u32 as_nr,
>  	write_cmd(ptdev, as_nr, AS_COMMAND_LOCK);
>  }
>  
> +static int mmu_hw_do_flush_on_gpu_ctrl(struct panthor_device *ptdev, int as_nr,
> +				       u32 op)
> +{
> +	const u32 l2_flush_op = CACHE_CLEAN | CACHE_INV;
> +	u32 lsc_flush_op = 0;
> +	int ret;
> +
> +	if (op == AS_COMMAND_FLUSH_MEM)
> +		lsc_flush_op = CACHE_CLEAN | CACHE_INV;
> +
> +	ret = wait_ready(ptdev, as_nr);
> +	if (ret)
> +		return ret;
> +
> +	ret = panthor_gpu_flush_caches(ptdev, l2_flush_op, lsc_flush_op, 0);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Explicitly unlock the region as the AS is not unlocked automatically
> +	 * at the end of the GPU_CONTROL cache flush command, unlike
> +	 * AS_COMMAND_FLUSH_MEM or AS_COMMAND_FLUSH_PT.
> +	 */
> +	write_cmd(ptdev, as_nr, AS_COMMAND_UNLOCK);
> +
> +	/* Wait for the unlock command to complete */
> +	return wait_ready(ptdev, as_nr);
> +}
> +
>  static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
>  				      u64 iova, u64 size, u32 op)
>  {
> @@ -585,6 +616,10 @@ static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
>  	if (op != AS_COMMAND_UNLOCK)
>  		lock_region(ptdev, as_nr, iova, size);
>  
> +	if (panthor_hw_supports(ptdev,PANTHOR_HW_FEATURE_GPU_CTRL_CACHE_FLUSH))
> +		if (op == AS_COMMAND_FLUSH_MEM || op == AS_COMMAND_FLUSH_PT)
> +			return mmu_hw_do_flush_on_gpu_ctrl(ptdev, as_nr, op);

Can't we use this sequence on v10 as well? The GPU flush_cache command
exists there, so I'd rather switch all CSF HW to this sequence than
diverge on v11+.

> +
>  	/* Run the MMU operation */
>  	write_cmd(ptdev, as_nr, op);
>  


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c
  2025-03-21  8:16   ` Boris Brezillon
@ 2025-03-21  8:43     ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-03-21  8:43 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Fri, 21 Mar 2025 09:16:45 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Thu, 20 Mar 2025 11:17:36 +0000
> Karunika Choo <karunika.choo@arm.com> wrote:
> 
> > This patch moves GPU info initialization into panthor_hw.c in
> > preparation of handling GPU register changes. The GPU register reading
> > operations to populate gpu_info are separated into an architecture
> > specific arch_*_gpu_info_init() function and is called via the new
> > function pointer abstraction under hw.ops.gpu_info_init().
> > 
> > Future GPU support will be performed by implementing a *_gpu_info_init()
> > function specific to that architecture version. It can call any existing
> > *_gpu_info_init() and extend it with additional register reads or
> > provide an entirely different implementation.  
> 
> Could you give us an insight into what the reg layout changes are? So
> far, they were mostly unchanged between GPU gens, and I'd really
> prefer we could keep the majority of them unchanged part of the commo 
> discovery, and only add the missing reads in the ->gpu_info_init()
> callback.
> 
> Note that I'm also working on abstracting mali device operations to add
> JM support to panthor, and the only things I had to specialize are:
> 
> - CSF ID for CSF
> - JS features/present masks for JM
> 
> The rest is just common. So what I have is a common gpu_init_info()
> helper that reads all the regs excepts those two, and after that, I
> have a device ops selection based on the arch major of the GPU ID [1].
> The device-specific GPU info are then read as part of the
> panthor_device_ops::init().

With the link this time :-).

[1]https://gitlab.freedesktop.org/bbrezillon/linux/-/blob/panthor-jm/drivers/gpu/drm/panthor/panthor_device.c?ref_type=heads#L359

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors
  2025-03-21  7:48   ` Boris Brezillon
@ 2025-04-09 13:00     ` Karunika Choo
  2025-04-10 13:28       ` Boris Brezillon
  0 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-04-09 13:00 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On 21/03/2025 07:48, Boris Brezillon wrote:
> On Thu, 20 Mar 2025 11:17:33 +0000
> Karunika Choo <karunika.choo@arm.com> wrote:
>
>> This patch adds 64-bit register accessors to simplify register access in
>> Panthor. It also adds 32-bit and 64-bit variants for read_poll_timeout.
>>
>> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
>> ---
>>  drivers/gpu/drm/panthor/panthor_regs.h | 55 ++++++++++++++++++++++++++
>>  1 file changed, 55 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h
b/drivers/gpu/drm/panthor/panthor_regs.h
>> index 42dc3fedb0d4..7ec4a1d04e20 100644
>> --- a/drivers/gpu/drm/panthor/panthor_regs.h
>> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
>> @@ -238,4 +238,59 @@
>>  #define gpu_read(dev, reg) \
>>  	readl((dev)->iomem + (reg))
>>
>> +#define gpu_read_relaxed(dev, reg) readl_relaxed((dev)->iomem + (reg))
>> +
>> +#define gpu_write64(dev, reg, data)                            \
>> +	do {                                                   \
>> +		u64 __val = (u64)(data);                       \
>> +		gpu_write(dev, reg, lower_32_bits(__val));     \
>> +		gpu_write(dev, reg + 4, upper_32_bits(__val)); \
>> +	} while (0)
>
> We're not doing funky name concatenation in these macros, so I'd rather
> have them defined as static inline funcs in panthor_device.h. We
> probably want to move the gpu_read/write definitions there as well if
> we do that.

I see where you're coming from, and it makes sense. I was thinking it
might be better to keep it in panthor_regs.h since that's the file we
include when accessing GPU registers. That said, we could certainly
convert them to static inline functions if you prefer.

>> +
>> +#define gpu_read64(dev, reg) \
>> +	(gpu_read(dev, reg) | ((u64)gpu_read(dev, reg + 4) << 32))
>> +
>> +#define gpu_read64_relaxed(dev, reg)  \
>> +	(gpu_read_relaxed(dev, reg) | \
>> +	 ((u64)gpu_read_relaxed(dev, reg + 4) << 32))
>> +
>> +#define gpu_read64_sync(dev, reg)                     \
>> +	({                                            \
>> +		u32 lo, hi1, hi2;                     \
>> +		do {                                  \
>> +			hi1 = gpu_read(dev, reg + 4); \
>> +			lo = gpu_read(dev, reg);      \
>> +			hi2 = gpu_read(dev, reg + 4); \
>> +		} while (hi1 != hi2);                 \
>> +		lo | ((u64)hi2 << 32);                \
>> +	})
>
> I would name that one gpu_read64_counter and make it a static inline
> function. Note that we already have panthor_gpu_read_64bit_counter()
> which does the same thing, so maybe move it there and rename it along
> the way.

Happy to rename this to gpu_read64_counter in v3, if you're okay with
us keeping the macros/functions in this file.

Kind regards,
Karunika

>> +
>> +#define gpu_read_poll_timeout(dev, reg, val, cond, delay_us,
timeout_us)    \
>> +	read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \
>> +			  dev, reg)
>> +
>> +#define gpu_read_poll_timeout_atomic(dev, reg, val, cond, delay_us,
       \
>> +				     timeout_us)                            \
>> +	read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \
>> +				 false, dev, reg)
>> +
>> +#define gpu_read64_poll_timeout(dev, reg, val, cond, delay_us,
timeout_us)    \
>> +	read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \
>> +			  dev, reg)
>> +
>> +#define gpu_read64_poll_timeout_atomic(dev, reg, val, cond,
delay_us,         \
>> +				       timeout_us)                            \
>> +	read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \
>> +				 false, dev, reg)
>> +
>> +#define gpu_read_relaxed_poll_timeout_atomic(dev, reg, val, cond,
delay_us, \
>> +					     timeout_us)                    \
>> +	read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us,     \
>> +				 timeout_us, false, dev, reg)
>> +
>> +#define gpu_read64_relaxed_poll_timeout(dev, reg, val, cond,
delay_us,         \
>> +					timeout_us)                            \
>> +	read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us,
timeout_us, \
>> +			  false, dev, reg)
>> +
>>  #endif
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] drm/panthor: Use 64-bit and poll register accessors
  2025-03-21  7:53   ` Boris Brezillon
@ 2025-04-09 13:07     ` Karunika Choo
  2025-04-10 13:29       ` Boris Brezillon
  0 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-04-09 13:07 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On 21/03/2025 07:53, Boris Brezillon wrote:
> On Thu, 20 Mar 2025 11:17:34 +0000
> Karunika Choo <karunika.choo@arm.com> wrote:
> 
>> This patch updates Panthor to use the new 64-bit accessors and poll
>> functions.
> 
> nit: I don't think it makes sense to dissociate the introduction of the
> new helpers and their use. Could we squash this patch into the previous
> one?

It was previously requested that I split the patches into two to ease
review. I can merge it back into the previous one in v3.

Kind regards,
Karunika Choo

>
>>
>> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
>> ---
>>  drivers/gpu/drm/panthor/panthor_fw.c  |   9 +-
>>  drivers/gpu/drm/panthor/panthor_gpu.c | 142 +++++++-------------------
>>  drivers/gpu/drm/panthor/panthor_mmu.c |  34 ++----
>>  3 files changed, 53 insertions(+), 132 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
>> index 0f52766a3120..ecfbe0456f89 100644
>> --- a/drivers/gpu/drm/panthor/panthor_fw.c
>> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
>> @@ -1059,8 +1059,8 @@ static void panthor_fw_stop(struct panthor_device *ptdev)
>>  	u32 status;
>>  
>>  	gpu_write(ptdev, MCU_CONTROL, MCU_CONTROL_DISABLE);
>> -	if (readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
>> -			       status == MCU_STATUS_DISABLED, 10, 100000))
>> +	if (gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
>> +				  status == MCU_STATUS_DISABLED, 10, 100000))
>>  		drm_err(&ptdev->base, "Failed to stop MCU");
>>  }
>>  
>> @@ -1085,8 +1085,9 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
>>  
>>  		panthor_fw_update_reqs(glb_iface, req, GLB_HALT, GLB_HALT);
>>  		gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
>> -		if (!readl_poll_timeout(ptdev->iomem + MCU_STATUS, status,
>> -					status == MCU_STATUS_HALT, 10, 100000)) {
>> +		if (!gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
>> +					   status == MCU_STATUS_HALT, 10,
>> +					   100000)) {
>>  			ptdev->reset.fast = true;
>>  		} else {
>>  			drm_warn(&ptdev->base, "Failed to cleanly suspend MCU");
>> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
>> index 671049020afa..0dee011fe2e9 100644
>> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
>> @@ -108,14 +108,9 @@ static void panthor_gpu_init_info(struct panthor_device *ptdev)
>>  
>>  	ptdev->gpu_info.as_present = gpu_read(ptdev, GPU_AS_PRESENT);
>>  
>> -	ptdev->gpu_info.shader_present = gpu_read(ptdev, GPU_SHADER_PRESENT_LO);
>> -	ptdev->gpu_info.shader_present |= (u64)gpu_read(ptdev, GPU_SHADER_PRESENT_HI) << 32;
>> -
>> -	ptdev->gpu_info.tiler_present = gpu_read(ptdev, GPU_TILER_PRESENT_LO);
>> -	ptdev->gpu_info.tiler_present |= (u64)gpu_read(ptdev, GPU_TILER_PRESENT_HI) << 32;
>> -
>> -	ptdev->gpu_info.l2_present = gpu_read(ptdev, GPU_L2_PRESENT_LO);
>> -	ptdev->gpu_info.l2_present |= (u64)gpu_read(ptdev, GPU_L2_PRESENT_HI) << 32;
>> +	ptdev->gpu_info.shader_present = gpu_read64(ptdev, GPU_SHADER_PRESENT_LO);
>> +	ptdev->gpu_info.tiler_present = gpu_read64(ptdev, GPU_TILER_PRESENT_LO);
>> +	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
>>  
>>  	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
>>  	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
>> @@ -152,8 +147,7 @@ static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 status)
>>  {
>>  	if (status & GPU_IRQ_FAULT) {
>>  		u32 fault_status = gpu_read(ptdev, GPU_FAULT_STATUS);
>> -		u64 address = ((u64)gpu_read(ptdev, GPU_FAULT_ADDR_HI) << 32) |
>> -			      gpu_read(ptdev, GPU_FAULT_ADDR_LO);
>> +		u64 address = gpu_read64(ptdev, GPU_FAULT_ADDR_LO);
>>  
>>  		drm_warn(&ptdev->base, "GPU Fault 0x%08x (%s) at 0x%016llx\n",
>>  			 fault_status, panthor_exception_name(ptdev, fault_status & 0xFF),
>> @@ -244,45 +238,27 @@ int panthor_gpu_block_power_off(struct panthor_device *ptdev,
>>  				u32 pwroff_reg, u32 pwrtrans_reg,
>>  				u64 mask, u32 timeout_us)
>>  {
>> -	u32 val, i;
>> +	u32 val;
>>  	int ret;
>>  
>> -	for (i = 0; i < 2; i++) {
>> -		u32 mask32 = mask >> (i * 32);
>> -
>> -		if (!mask32)
>> -			continue;
>> -
>> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
>> -						 val, !(mask32 & val),
>> -						 100, timeout_us);
>> -		if (ret) {
>> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
>> -				blk_name, mask);
>> -			return ret;
>> -		}
>> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
>> +					      100, timeout_us);
>> +	if (ret) {
>> +		drm_err(&ptdev->base,
>> +			"timeout waiting on %s:%llx power transition", blk_name,
>> +			mask);
>> +		return ret;
>>  	}
>>  
>> -	if (mask & GENMASK(31, 0))
>> -		gpu_write(ptdev, pwroff_reg, mask);
>> -
>> -	if (mask >> 32)
>> -		gpu_write(ptdev, pwroff_reg + 4, mask >> 32);
>> -
>> -	for (i = 0; i < 2; i++) {
>> -		u32 mask32 = mask >> (i * 32);
>> +	gpu_write64(ptdev, pwroff_reg, mask);
>>  
>> -		if (!mask32)
>> -			continue;
>> -
>> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
>> -						 val, !(mask32 & val),
>> -						 100, timeout_us);
>> -		if (ret) {
>> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
>> -				blk_name, mask);
>> -			return ret;
>> -		}
>> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
>> +					      100, timeout_us);
>> +	if (ret) {
>> +		drm_err(&ptdev->base,
>> +			"timeout waiting on %s:%llx power transition", blk_name,
>> +			mask);
>> +		return ret;
>>  	}
>>  
>>  	return 0;
>> @@ -305,45 +281,26 @@ int panthor_gpu_block_power_on(struct panthor_device *ptdev,
>>  			       u32 pwron_reg, u32 pwrtrans_reg,
>>  			       u32 rdy_reg, u64 mask, u32 timeout_us)
>>  {
>> -	u32 val, i;
>> +	u32 val;
>>  	int ret;
>>  
>> -	for (i = 0; i < 2; i++) {
>> -		u32 mask32 = mask >> (i * 32);
>> -
>> -		if (!mask32)
>> -			continue;
>> -
>> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + pwrtrans_reg + (i * 4),
>> -						 val, !(mask32 & val),
>> -						 100, timeout_us);
>> -		if (ret) {
>> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx power transition",
>> -				blk_name, mask);
>> -			return ret;
>> -		}
>> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
>> +					      100, timeout_us);
>> +	if (ret) {
>> +		drm_err(&ptdev->base,
>> +			"timeout waiting on %s:%llx power transition", blk_name,
>> +			mask);
>> +		return ret;
>>  	}
>>  
>> -	if (mask & GENMASK(31, 0))
>> -		gpu_write(ptdev, pwron_reg, mask);
>> -
>> -	if (mask >> 32)
>> -		gpu_write(ptdev, pwron_reg + 4, mask >> 32);
>> -
>> -	for (i = 0; i < 2; i++) {
>> -		u32 mask32 = mask >> (i * 32);
>> +	gpu_write64(ptdev, pwron_reg, mask);
>>  
>> -		if (!mask32)
>> -			continue;
>> -
>> -		ret = readl_relaxed_poll_timeout(ptdev->iomem + rdy_reg + (i * 4),
>> -						 val, (mask32 & val) == mask32,
>> -						 100, timeout_us);
>> -		if (ret) {
>> -			drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
>> -				blk_name, mask);
>> -			return ret;
>> -		}
>> +	ret = gpu_read64_relaxed_poll_timeout(ptdev, pwrtrans_reg, val, !val,
>> +					      100, timeout_us);
>> +	if (ret) {
>> +		drm_err(&ptdev->base, "timeout waiting on %s:%llx readiness",
>> +			blk_name, mask);
>> +		return ret;
>>  	}
>>  
>>  	return 0;
>> @@ -492,26 +449,6 @@ void panthor_gpu_resume(struct panthor_device *ptdev)
>>  	panthor_gpu_l2_power_on(ptdev);
>>  }
>>  
>> -/**
>> - * panthor_gpu_read_64bit_counter() - Read a 64-bit counter at a given offset.
>> - * @ptdev: Device.
>> - * @reg: The offset of the register to read.
>> - *
>> - * Return: The counter value.
>> - */
>> -static u64
>> -panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
>> -{
>> -	u32 hi, lo;
>> -
>> -	do {
>> -		hi = gpu_read(ptdev, reg + 0x4);
>> -		lo = gpu_read(ptdev, reg);
>> -	} while (hi != gpu_read(ptdev, reg + 0x4));
>> -
>> -	return ((u64)hi << 32) | lo;
>> -}
>> -
>>  /**
>>   * panthor_gpu_read_timestamp() - Read the timestamp register.
>>   * @ptdev: Device.
>> @@ -520,7 +457,7 @@ panthor_gpu_read_64bit_counter(struct panthor_device *ptdev, u32 reg)
>>   */
>>  u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
>>  {
>> -	return panthor_gpu_read_64bit_counter(ptdev, GPU_TIMESTAMP_LO);
>> +	return gpu_read64_sync(ptdev, GPU_TIMESTAMP_LO);
>>  }
>>  
>>  /**
>> @@ -531,10 +468,5 @@ u64 panthor_gpu_read_timestamp(struct panthor_device *ptdev)
>>   */
>>  u64 panthor_gpu_read_timestamp_offset(struct panthor_device *ptdev)
>>  {
>> -	u32 hi, lo;
>> -
>> -	hi = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_HI);
>> -	lo = gpu_read(ptdev, GPU_TIMESTAMP_OFFSET_LO);
>> -
>> -	return ((u64)hi << 32) | lo;
>> +	return gpu_read64(ptdev, GPU_TIMESTAMP_OFFSET_LO);
>>  }
>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
>> index 12a02e28f50f..a0a79f19bdea 100644
>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>> @@ -510,9 +510,9 @@ static int wait_ready(struct panthor_device *ptdev, u32 as_nr)
>>  	/* Wait for the MMU status to indicate there is no active command, in
>>  	 * case one is pending.
>>  	 */
>> -	ret = readl_relaxed_poll_timeout_atomic(ptdev->iomem + AS_STATUS(as_nr),
>> -						val, !(val & AS_STATUS_AS_ACTIVE),
>> -						10, 100000);
>> +	ret = gpu_read_relaxed_poll_timeout_atomic(ptdev, AS_STATUS(as_nr), val,
>> +						   !(val & AS_STATUS_AS_ACTIVE),
>> +						   10, 100000);
>>  
>>  	if (ret) {
>>  		panthor_device_schedule_reset(ptdev);
>> @@ -564,8 +564,7 @@ static void lock_region(struct panthor_device *ptdev, u32 as_nr,
>>  	region = region_width | region_start;
>>  
>>  	/* Lock the region that needs to be updated */
>> -	gpu_write(ptdev, AS_LOCKADDR_LO(as_nr), lower_32_bits(region));
>> -	gpu_write(ptdev, AS_LOCKADDR_HI(as_nr), upper_32_bits(region));
>> +	gpu_write64(ptdev, AS_LOCKADDR_LO(as_nr), region);
>>  	write_cmd(ptdev, as_nr, AS_COMMAND_LOCK);
>>  }
>>  
>> @@ -615,14 +614,9 @@ static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr,
>>  	if (ret)
>>  		return ret;
>>  
>> -	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
>> -	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
>> -
>> -	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
>> -	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
>> -
>> -	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), lower_32_bits(transcfg));
>> -	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), upper_32_bits(transcfg));
>> +	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), transtab);
>> +	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), memattr);
>> +	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), transcfg);
>>  
>>  	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
>>  }
>> @@ -635,14 +629,9 @@ static int panthor_mmu_as_disable(struct panthor_device *ptdev, u32 as_nr)
>>  	if (ret)
>>  		return ret;
>>  
>> -	gpu_write(ptdev, AS_TRANSTAB_LO(as_nr), 0);
>> -	gpu_write(ptdev, AS_TRANSTAB_HI(as_nr), 0);
>> -
>> -	gpu_write(ptdev, AS_MEMATTR_LO(as_nr), 0);
>> -	gpu_write(ptdev, AS_MEMATTR_HI(as_nr), 0);
>> -
>> -	gpu_write(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
>> -	gpu_write(ptdev, AS_TRANSCFG_HI(as_nr), 0);
>> +	gpu_write64(ptdev, AS_TRANSTAB_LO(as_nr), 0);
>> +	gpu_write64(ptdev, AS_MEMATTR_LO(as_nr), 0);
>> +	gpu_write64(ptdev, AS_TRANSCFG_LO(as_nr), AS_TRANSCFG_ADRMODE_UNMAPPED);
>>  
>>  	return write_cmd(ptdev, as_nr, AS_COMMAND_UPDATE);
>>  }
>> @@ -1680,8 +1669,7 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
>>  		u32 source_id;
>>  
>>  		fault_status = gpu_read(ptdev, AS_FAULTSTATUS(as));
>> -		addr = gpu_read(ptdev, AS_FAULTADDRESS_LO(as));
>> -		addr |= (u64)gpu_read(ptdev, AS_FAULTADDRESS_HI(as)) << 32;
>> +		addr = gpu_read64(ptdev, AS_FAULTADDRESS_LO(as));
>>  
>>  		/* decode the fault status */
>>  		exception_type = fault_status & 0xFF;
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible
  2025-03-21  8:02   ` Boris Brezillon
@ 2025-04-10 13:20     ` Karunika Choo
  2025-04-10 13:37       ` Boris Brezillon
  0 siblings, 1 reply; 26+ messages in thread
From: Karunika Choo @ 2025-04-10 13:20 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On 21/03/2025 08:02, Boris Brezillon wrote:
> On Thu, 20 Mar 2025 11:17:37 +0000
> Karunika Choo <karunika.choo@arm.com> wrote:
> 
>> This patch replaces the previous panthor_model structure with a simple
>> switch case based on the product_id, which is in the format of:
>>         ((arch_major << 24) | product_major)
>>
>> This not only simplifies the comparison, but also allows extending the
>> function to accommodate naming differences based on GPU features.
>>
>> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
>> ---
>>  drivers/gpu/drm/panthor/panthor_hw.c   | 63 +++++++-------------------
>>  drivers/gpu/drm/panthor/panthor_regs.h |  1 +
>>  2 files changed, 18 insertions(+), 46 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
>> index 4cc4b0d5382c..12183c04cd21 100644
>> --- a/drivers/gpu/drm/panthor/panthor_hw.c
>> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
>> @@ -5,40 +5,6 @@
>>  #include "panthor_hw.h"
>>  #include "panthor_regs.h"
>>  
>> -/**
>> - * struct panthor_model - GPU model description
>> - */
>> -struct panthor_model {
>> -	/** @name: Model name. */
>> -	const char *name;
>> -
>> -	/** @arch_major: Major version number of architecture. */
>> -	u8 arch_major;
>> -
>> -	/** @product_major: Major version number of product. */
>> -	u8 product_major;
>> -};
>> -
>> -/**
>> - * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
>> - * by a combination of the major architecture version and the major product
>> - * version.
>> - * @_name: Name for the GPU model.
>> - * @_arch_major: Architecture major.
>> - * @_product_major: Product major.
>> - */
>> -#define GPU_MODEL(_name, _arch_major, _product_major) \
>> -{\
>> -	.name = __stringify(_name),				\
>> -	.arch_major = _arch_major,				\
>> -	.product_major = _product_major,			\
>> -}
>> -
>> -static const struct panthor_model gpu_models[] = {
>> -	GPU_MODEL(g610, 10, 7),
>> -	{},
>> -};
>> -
>>  static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
>>  {
>>  	unsigned int i;
>> @@ -66,29 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
>>  	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
>>  }
>>  
>> +static char *get_gpu_model_name(struct panthor_device *ptdev)
>> +{
>> +	const u32 gpu_id = ptdev->gpu_info.gpu_id;
>> +	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
>> +						GPU_PROD_MAJOR(gpu_id));
>> +
>> +	switch (product_id) {
>> +	case GPU_PROD_ID_MAKE(10, 7):
>> +		return "Mali-G610";
>> +	}
> 
> I a big fan of these ever growing switch statements with nested
> conditionals. Could we instead add an optional ::get_variant() callback
> in panthor_model and have the following formatting:
> 
> 	"Mali-%s%s%s", model->name,
> 		       model->get_variant ? "-" : "",
> 		       model->get_variant ? model->get_variant() : ""
>

While that’s certainly an option, I wonder if it’s better to avoid
additional string formatting when it’s not strictly necessary. The
switch cases provide a straightforward GPU name without needing to
handle conditional "-" separators or similar.

Also, with the current approach, if a GPU is misconfigured with an
incorrect product_major for its core count, the switch’s fallthrough
helps ensure the correct name is still returned. A model->get_variant()
callback wouldn’t give us that same flexibility to adjust the name based
on such mismatches.

Kind regards,
Karunika Choo

>> +
>> +	return "(Unknown Mali GPU)";
>> +}
>> +
>>  static void panthor_gpu_init_info(struct panthor_device *ptdev)
>>  {
>> -	const struct panthor_model *model;
>> -	u32 arch_major, product_major;
>> +	const char *gpu_model_name = get_gpu_model_name(ptdev);
>>  	u32 major, minor, status;
>>  
>>  	ptdev->hw->ops.gpu_info_init(ptdev);
>>  
>> -	arch_major = GPU_ARCH_MAJOR(ptdev->gpu_info.gpu_id);
>> -	product_major = GPU_PROD_MAJOR(ptdev->gpu_info.gpu_id);
>>  	major = GPU_VER_MAJOR(ptdev->gpu_info.gpu_id);
>>  	minor = GPU_VER_MINOR(ptdev->gpu_info.gpu_id);
>>  	status = GPU_VER_STATUS(ptdev->gpu_info.gpu_id);
>>  
>> -	for (model = gpu_models; model->name; model++) {
>> -		if (model->arch_major == arch_major &&
>> -		    model->product_major == product_major)
>> -			break;
>> -	}
>> -
>>  	drm_info(&ptdev->base,
>> -		 "mali-%s id 0x%x major 0x%x minor 0x%x status 0x%x",
>> -		 model->name ?: "unknown", ptdev->gpu_info.gpu_id >> 16,
>> +		 "%s id 0x%x major 0x%x minor 0x%x status 0x%x",
>> +		 gpu_model_name, ptdev->gpu_info.gpu_id >> 16,
>>  		 major, minor, status);
>>  
>>  	drm_info(&ptdev->base,
>> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h b/drivers/gpu/drm/panthor/panthor_regs.h
>> index ba452c1dd644..d9e0769d6f1a 100644
>> --- a/drivers/gpu/drm/panthor/panthor_regs.h
>> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
>> @@ -20,6 +20,7 @@
>>  #define   GPU_VER_STATUS(x)				((x) & GENMASK(3, 0))
>>  
>>  #define GPU_ARCH_ID_MAKE(major, minor, rev)		(((major) << 16) | ((minor) << 8) | (rev))
>> +#define GPU_PROD_ID_MAKE(arch_major, prod_major)	(((arch_major) << 24) | (prod_major))
>>  
>>  #define GPU_L2_FEATURES					0x4
>>  #define  GPU_L2_FEATURES_LINE_SIZE(x)			(1 << ((x) & GENMASK(7, 0)))
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors
  2025-04-09 13:00     ` Karunika Choo
@ 2025-04-10 13:28       ` Boris Brezillon
  2025-04-10 16:49         ` Karunika Choo
  0 siblings, 1 reply; 26+ messages in thread
From: Boris Brezillon @ 2025-04-10 13:28 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Wed, 9 Apr 2025 14:00:54 +0100
Karunika Choo <karunika.choo@arm.com> wrote:

> On 21/03/2025 07:48, Boris Brezillon wrote:
> > On Thu, 20 Mar 2025 11:17:33 +0000
> > Karunika Choo <karunika.choo@arm.com> wrote:
> >  
> >> This patch adds 64-bit register accessors to simplify register access in
> >> Panthor. It also adds 32-bit and 64-bit variants for read_poll_timeout.
> >>
> >> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> >> ---
> >>  drivers/gpu/drm/panthor/panthor_regs.h | 55 ++++++++++++++++++++++++++
> >>  1 file changed, 55 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h  
> b/drivers/gpu/drm/panthor/panthor_regs.h
> >> index 42dc3fedb0d4..7ec4a1d04e20 100644
> >> --- a/drivers/gpu/drm/panthor/panthor_regs.h
> >> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
> >> @@ -238,4 +238,59 @@
> >>  #define gpu_read(dev, reg) \
> >>  	readl((dev)->iomem + (reg))
> >>
> >> +#define gpu_read_relaxed(dev, reg) readl_relaxed((dev)->iomem + (reg))
> >> +
> >> +#define gpu_write64(dev, reg, data)                            \
> >> +	do {                                                   \
> >> +		u64 __val = (u64)(data);                       \
> >> +		gpu_write(dev, reg, lower_32_bits(__val));     \
> >> +		gpu_write(dev, reg + 4, upper_32_bits(__val)); \
> >> +	} while (0)  
> >
> > We're not doing funky name concatenation in these macros, so I'd rather
> > have them defined as static inline funcs in panthor_device.h. We
> > probably want to move the gpu_read/write definitions there as well if
> > we do that.  
> 
> I see where you're coming from, and it makes sense. I was thinking it
> might be better to keep it in panthor_regs.h since that's the file we
> include when accessing GPU registers.

Well, yes, but also gpu_write/read() take a panthor_device, which is
defined in panthor_device.h. I guess we can keep those in
panthor_regs.h and include panthor_device.h from panthor_regs.h if
there's no circular inclusion. I'm fine either way.

> That said, we could certainly
> convert them to static inline functions if you prefer.

Yeah, I'd prefer that.

> 
> >> +
> >> +#define gpu_read64(dev, reg) \
> >> +	(gpu_read(dev, reg) | ((u64)gpu_read(dev, reg + 4) << 32))
> >> +
> >> +#define gpu_read64_relaxed(dev, reg)  \
> >> +	(gpu_read_relaxed(dev, reg) | \
> >> +	 ((u64)gpu_read_relaxed(dev, reg + 4) << 32))
> >> +
> >> +#define gpu_read64_sync(dev, reg)                     \
> >> +	({                                            \
> >> +		u32 lo, hi1, hi2;                     \
> >> +		do {                                  \
> >> +			hi1 = gpu_read(dev, reg + 4); \
> >> +			lo = gpu_read(dev, reg);      \
> >> +			hi2 = gpu_read(dev, reg + 4); \
> >> +		} while (hi1 != hi2);                 \
> >> +		lo | ((u64)hi2 << 32);                \
> >> +	})  
> >
> > I would name that one gpu_read64_counter and make it a static inline
> > function. Note that we already have panthor_gpu_read_64bit_counter()
> > which does the same thing, so maybe move it there and rename it along
> > the way.  
> 
> Happy to rename this to gpu_read64_counter in v3, if you're okay with
> us keeping the macros/functions in this file.

Renaming the function is orthogonal to moving its definition to a
different header, no? I'm not sure I see why one depends on the other.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/9] drm/panthor: Use 64-bit and poll register accessors
  2025-04-09 13:07     ` Karunika Choo
@ 2025-04-10 13:29       ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-04-10 13:29 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Wed, 9 Apr 2025 14:07:20 +0100
Karunika Choo <karunika.choo@arm.com> wrote:

> On 21/03/2025 07:53, Boris Brezillon wrote:
> > On Thu, 20 Mar 2025 11:17:34 +0000
> > Karunika Choo <karunika.choo@arm.com> wrote:
> >   
> >> This patch updates Panthor to use the new 64-bit accessors and poll
> >> functions.  
> > 
> > nit: I don't think it makes sense to dissociate the introduction of the
> > new helpers and their use. Could we squash this patch into the previous
> > one?  
> 
> It was previously requested that I split the patches into two to ease
> review. I can merge it back into the previous one in v3.

Thanks. Could we also have that submitted in a separate patch, so we
can merge it while we're discussing the rest of the patch series?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible
  2025-04-10 13:20     ` Karunika Choo
@ 2025-04-10 13:37       ` Boris Brezillon
  0 siblings, 0 replies; 26+ messages in thread
From: Boris Brezillon @ 2025-04-10 13:37 UTC (permalink / raw)
  To: Karunika Choo
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On Thu, 10 Apr 2025 14:20:59 +0100
Karunika Choo <karunika.choo@arm.com> wrote:

> On 21/03/2025 08:02, Boris Brezillon wrote:
> > On Thu, 20 Mar 2025 11:17:37 +0000
> > Karunika Choo <karunika.choo@arm.com> wrote:
> >   
> >> This patch replaces the previous panthor_model structure with a simple
> >> switch case based on the product_id, which is in the format of:
> >>         ((arch_major << 24) | product_major)
> >>
> >> This not only simplifies the comparison, but also allows extending the
> >> function to accommodate naming differences based on GPU features.
> >>
> >> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
> >> ---
> >>  drivers/gpu/drm/panthor/panthor_hw.c   | 63 +++++++-------------------
> >>  drivers/gpu/drm/panthor/panthor_regs.h |  1 +
> >>  2 files changed, 18 insertions(+), 46 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/panthor/panthor_hw.c b/drivers/gpu/drm/panthor/panthor_hw.c
> >> index 4cc4b0d5382c..12183c04cd21 100644
> >> --- a/drivers/gpu/drm/panthor/panthor_hw.c
> >> +++ b/drivers/gpu/drm/panthor/panthor_hw.c
> >> @@ -5,40 +5,6 @@
> >>  #include "panthor_hw.h"
> >>  #include "panthor_regs.h"
> >>  
> >> -/**
> >> - * struct panthor_model - GPU model description
> >> - */
> >> -struct panthor_model {
> >> -	/** @name: Model name. */
> >> -	const char *name;
> >> -
> >> -	/** @arch_major: Major version number of architecture. */
> >> -	u8 arch_major;
> >> -
> >> -	/** @product_major: Major version number of product. */
> >> -	u8 product_major;
> >> -};
> >> -
> >> -/**
> >> - * GPU_MODEL() - Define a GPU model. A GPU product can be uniquely identified
> >> - * by a combination of the major architecture version and the major product
> >> - * version.
> >> - * @_name: Name for the GPU model.
> >> - * @_arch_major: Architecture major.
> >> - * @_product_major: Product major.
> >> - */
> >> -#define GPU_MODEL(_name, _arch_major, _product_major) \
> >> -{\
> >> -	.name = __stringify(_name),				\
> >> -	.arch_major = _arch_major,				\
> >> -	.product_major = _product_major,			\
> >> -}
> >> -
> >> -static const struct panthor_model gpu_models[] = {
> >> -	GPU_MODEL(g610, 10, 7),
> >> -	{},
> >> -};
> >> -
> >>  static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
> >>  {
> >>  	unsigned int i;
> >> @@ -66,29 +32,34 @@ static void arch_10_8_gpu_info_init(struct panthor_device *ptdev)
> >>  	ptdev->gpu_info.l2_present = gpu_read64(ptdev, GPU_L2_PRESENT_LO);
> >>  }
> >>  
> >> +static char *get_gpu_model_name(struct panthor_device *ptdev)
> >> +{
> >> +	const u32 gpu_id = ptdev->gpu_info.gpu_id;
> >> +	const u32 product_id = GPU_PROD_ID_MAKE(GPU_ARCH_MAJOR(gpu_id),
> >> +						GPU_PROD_MAJOR(gpu_id));
> >> +
> >> +	switch (product_id) {
> >> +	case GPU_PROD_ID_MAKE(10, 7):
> >> +		return "Mali-G610";
> >> +	}  
> > 
> > I a big fan of these ever growing switch statements with nested
> > conditionals. Could we instead add an optional ::get_variant() callback
> > in panthor_model and have the following formatting:
> > 
> > 	"Mali-%s%s%s", model->name,
> > 		       model->get_variant ? "-" : "",
> > 		       model->get_variant ? model->get_variant() : ""
> >  
> 
> While that’s certainly an option, I wonder if it’s better to avoid
> additional string formatting when it’s not strictly necessary. The
> switch cases provide a straightforward GPU name without needing to
> handle conditional "-" separators or similar.
> 
> Also, with the current approach, if a GPU is misconfigured with an
> incorrect product_major for its core count, the switch’s fallthrough
> helps ensure the correct name is still returned. A model->get_variant()
> callback wouldn’t give us that same flexibility to adjust the name based
> on such mismatches.

Fair enough. I guess we can live with this sort of switch statement for
the name selection. Hopefully the variants are rare enough that it
doesn't go too wild.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors
  2025-04-10 13:28       ` Boris Brezillon
@ 2025-04-10 16:49         ` Karunika Choo
  0 siblings, 0 replies; 26+ messages in thread
From: Karunika Choo @ 2025-04-10 16:49 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: dri-devel, nd, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	linux-kernel

On 10/04/2025 14:28, Boris Brezillon wrote:
> On Wed, 9 Apr 2025 14:00:54 +0100
> Karunika Choo <karunika.choo@arm.com> wrote:
> 
>> On 21/03/2025 07:48, Boris Brezillon wrote:
>>> On Thu, 20 Mar 2025 11:17:33 +0000
>>> Karunika Choo <karunika.choo@arm.com> wrote:
>>>  
>>>> This patch adds 64-bit register accessors to simplify register access in
>>>> Panthor. It also adds 32-bit and 64-bit variants for read_poll_timeout.
>>>>
>>>> Signed-off-by: Karunika Choo <karunika.choo@arm.com>
>>>> ---
>>>>  drivers/gpu/drm/panthor/panthor_regs.h | 55 ++++++++++++++++++++++++++
>>>>  1 file changed, 55 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/panthor/panthor_regs.h  
>> b/drivers/gpu/drm/panthor/panthor_regs.h
>>>> index 42dc3fedb0d4..7ec4a1d04e20 100644
>>>> --- a/drivers/gpu/drm/panthor/panthor_regs.h
>>>> +++ b/drivers/gpu/drm/panthor/panthor_regs.h
>>>> @@ -238,4 +238,59 @@
>>>>  #define gpu_read(dev, reg) \
>>>>  	readl((dev)->iomem + (reg))
>>>>
>>>> +#define gpu_read_relaxed(dev, reg) readl_relaxed((dev)->iomem + (reg))
>>>> +
>>>> +#define gpu_write64(dev, reg, data)                            \
>>>> +	do {                                                   \
>>>> +		u64 __val = (u64)(data);                       \
>>>> +		gpu_write(dev, reg, lower_32_bits(__val));     \
>>>> +		gpu_write(dev, reg + 4, upper_32_bits(__val)); \
>>>> +	} while (0)  
>>>
>>> We're not doing funky name concatenation in these macros, so I'd rather
>>> have them defined as static inline funcs in panthor_device.h. We
>>> probably want to move the gpu_read/write definitions there as well if
>>> we do that.  
>>
>> I see where you're coming from, and it makes sense. I was thinking it
>> might be better to keep it in panthor_regs.h since that's the file we
>> include when accessing GPU registers.
> 
> Well, yes, but also gpu_write/read() take a panthor_device, which is
> defined in panthor_device.h. I guess we can keep those in
> panthor_regs.h and include panthor_device.h from panthor_regs.h if
> there's no circular inclusion. I'm fine either way.
> 
>> That said, we could certainly
>> convert them to static inline functions if you prefer.
> 
> Yeah, I'd prefer that.
> 

Apologies for the back-and-forth. You’re absolutely right—it’s a good
point that struct panthor_device is defined in panthor_device.h. I
have moved these functions there as static inline functions, in a
separate patch outside this series.

Link: https://lore.kernel.org/lkml/20250410163546.919749-1-karunika.choo@arm.com/

Kind regards,
Karunika Choo

>>
>>>> +
>>>> +#define gpu_read64(dev, reg) \
>>>> +	(gpu_read(dev, reg) | ((u64)gpu_read(dev, reg + 4) << 32))
>>>> +
>>>> +#define gpu_read64_relaxed(dev, reg)  \
>>>> +	(gpu_read_relaxed(dev, reg) | \
>>>> +	 ((u64)gpu_read_relaxed(dev, reg + 4) << 32))
>>>> +
>>>> +#define gpu_read64_sync(dev, reg)                     \
>>>> +	({                                            \
>>>> +		u32 lo, hi1, hi2;                     \
>>>> +		do {                                  \
>>>> +			hi1 = gpu_read(dev, reg + 4); \
>>>> +			lo = gpu_read(dev, reg);      \
>>>> +			hi2 = gpu_read(dev, reg + 4); \
>>>> +		} while (hi1 != hi2);                 \
>>>> +		lo | ((u64)hi2 << 32);                \
>>>> +	})  
>>>
>>> I would name that one gpu_read64_counter and make it a static inline
>>> function. Note that we already have panthor_gpu_read_64bit_counter()
>>> which does the same thing, so maybe move it there and rename it along
>>> the way.  
>>
>> Happy to rename this to gpu_read64_counter in v3, if you're okay with
>> us keeping the macros/functions in this file.
> 
> Renaming the function is orthogonal to moving its definition to a
> different header, no? I'm not sure I see why one depends on the other.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2025-04-10 16:49 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-20 11:17 [PATCH v2 0/9] drm/panthor: Add GPU specific initialization framework to support new Mali GPUs Karunika Choo
2025-03-20 11:17 ` [PATCH v2 1/9] drm/panthor: Add 64-bit and poll register accessors Karunika Choo
2025-03-21  7:48   ` Boris Brezillon
2025-04-09 13:00     ` Karunika Choo
2025-04-10 13:28       ` Boris Brezillon
2025-04-10 16:49         ` Karunika Choo
2025-03-20 11:17 ` [PATCH v2 2/9] drm/panthor: Use " Karunika Choo
2025-03-21  7:53   ` Boris Brezillon
2025-04-09 13:07     ` Karunika Choo
2025-04-10 13:29       ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 3/9] drm/panthor: Add GPU specific initialization framework Karunika Choo
2025-03-21  8:28   ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 4/9] drm/panthor: Move GPU info initialization into panthor_hw.c Karunika Choo
2025-03-21  8:16   ` Boris Brezillon
2025-03-21  8:43     ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 5/9] drm/panthor: Make getting GPU model name simple and extensible Karunika Choo
2025-03-21  8:02   ` Boris Brezillon
2025-04-10 13:20     ` Karunika Choo
2025-04-10 13:37       ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 6/9] drm/panthor: Add support for Mali-G715 family of GPUs Karunika Choo
2025-03-21  8:34   ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 7/9] drm/panthor: Support GPU_CONTROL cache flush based on feature bit Karunika Choo
2025-03-21  8:41   ` Boris Brezillon
2025-03-20 11:17 ` [PATCH v2 8/9] drm/panthor: Add support for Mali-G720 and Mali-G725 GPUs Karunika Choo
2025-03-20 11:17 ` [PATCH v2 9/9] drm/panthor: Add support for Mali-G710, Mali-G510, and Mali-G310 Karunika Choo
2025-03-20 19:03   ` Liviu Dudau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).