public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH v4 0/8] Add GCE support for MT8196
@ 2025-02-18  5:41 Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox Jason-JH Lin
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

This patch series adds support for the MediaTek MT8196 SoC in the CMDQ
driver and related subsystems. The changes include adding compatible
names and iommus property, updating driver data to accommodate hardware
changes, and modifying the usage of CMDQ APIs to support non-subsys ID
hardware.

---
Change in v4:
1. Remove dt-binding header and add a gce header in dts folder.
2. Remove dot in sign-off name.
3. Change addr type from u32 to dma_addr_t for cmdq_reg_shift_addr() and
   cmdq_reg_revert_addr().

Change in v3:
1. Merge 2 dt-bindings pathes together and add more detail commit message.
2. Change type u32 to phys_addr_t for pa_base of struct cmdq_client_reg.
3. Remove cmdq_subsys_is_valid() and subsys_num in CMDQ driver.
4. Add CMDQ_SUBSYS_INVALID to check subsys instead of using
   cmdq_subsys_is_invalid().
5. Make use of CMDQ_THR_SPR0 define to the parameter of CMDQ APIs.
6. Rebase on the new MACRO in mtk-mdp3-comp.h.

Change in v2:
1. Remove the constant and fix warning in dt-bindings.
2. Remove the pa_base parameter of CMDQ APIs and related modification.
3. Move subsys checking to client drivers and use 2 alternative
   CMDQ APIs to achieve the same functionality.

---

Jason-JH Lin (8):
  dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox
  arm64: dts: mediatek: Add GCE header for MT8196
  mailbox: mtk-cmdq: Add driver data to support for MT8196
  soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID
    hardware
  soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM
    address
  soc: mediatek: Add programming flow for unsupported subsys ID hardware
  drm/mediatek: Add programming flow for unsupported subsys ID hardware
  media: mediatek: mdp3: Add programming flow for unsupported subsys ID
    hardware

 .../mailbox/mediatek,gce-mailbox.yaml         |   4 +
 arch/arm64/boot/dts/mediatek/mt8196-gce.h     | 612 ++++++++++++++++++
 drivers/gpu/drm/mediatek/mtk_ddp_comp.c       |  33 +-
 drivers/mailbox/mtk-cmdq-mailbox.c            |  90 ++-
 .../platform/mediatek/mdp3/mtk-mdp3-cmdq.c    |  18 +-
 .../platform/mediatek/mdp3/mtk-mdp3-comp.h    |  79 ++-
 drivers/soc/mediatek/mtk-cmdq-helper.c        |  53 +-
 drivers/soc/mediatek/mtk-mmsys.c              |  14 +-
 drivers/soc/mediatek/mtk-mutex.c              |  11 +-
 include/linux/mailbox/mtk-cmdq-mailbox.h      |   2 +
 include/linux/soc/mediatek/mtk-cmdq.h         |   3 +
 11 files changed, 876 insertions(+), 43 deletions(-)
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8196-gce.h

-- 
2.43.0



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-02-18  8:12   ` Krzysztof Kozlowski
  2025-02-18  5:41 ` [PATCH v4 2/8] arm64: dts: mediatek: Add GCE header for MT8196 Jason-JH Lin
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Add the compatible name and iommus property for MT8196.

In MT8196, all command buffers allocated and used by the GCE device
work with IOMMU.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 .../devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml     | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml b/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
index cef9d7601398..73d6db34d64a 100644
--- a/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
+++ b/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
@@ -25,6 +25,7 @@ properties:
           - mediatek,mt8188-gce
           - mediatek,mt8192-gce
           - mediatek,mt8195-gce
+          - mediatek,mt8196-gce
       - items:
           - const: mediatek,mt6795-gce
           - const: mediatek,mt8173-gce
@@ -49,6 +50,9 @@ properties:
     items:
       - const: gce
 
+  iommus:
+    maxItems: 1
+
 required:
   - compatible
   - "#mbox-cells"
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 2/8] arm64: dts: mediatek: Add GCE header for MT8196
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support " Jason-JH Lin
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Add GCE header define for GCE Thread priority and GCE event IDs
that used in the MT8196 dtsi.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 arch/arm64/boot/dts/mediatek/mt8196-gce.h | 612 ++++++++++++++++++++++
 1 file changed, 612 insertions(+)
 create mode 100644 arch/arm64/boot/dts/mediatek/mt8196-gce.h

diff --git a/arch/arm64/boot/dts/mediatek/mt8196-gce.h b/arch/arm64/boot/dts/mediatek/mt8196-gce.h
new file mode 100644
index 000000000000..067a67f82965
--- /dev/null
+++ b/arch/arm64/boot/dts/mediatek/mt8196-gce.h
@@ -0,0 +1,612 @@
+/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
+/*
+ * Copyright (c) 2024 MediaTek Inc.
+ *
+ */
+
+#ifndef __DTS_GCE_MT8196_H
+#define __DTS_GCE_MT8196_H
+
+/* GCE Thread Priority
+ * The GCE core has multiple GCE threads, each of which can independently
+ * execute its own sequence of instructions.
+ * However, the GCE threads on the same core cannot run in parallel.
+ * Different GCE threads can determine thread priority based on the scenario,
+ * thereby serving different user needs.
+ *
+ * Low priority thread is executed when no high priority thread is active.
+ * Same priority thread is scheduled by round robin.
+ */
+#define CMDQ_THR_PRIO_LOWEST	0
+#define CMDQ_THR_PRIO_1		1
+#define CMDQ_THR_PRIO_2		2
+#define CMDQ_THR_PRIO_3		3
+#define CMDQ_THR_PRIO_4		4
+#define CMDQ_THR_PRIO_5		5
+#define CMDQ_THR_PRIO_6		6
+#define CMDQ_THR_PRIO_HIGHEST	7
+
+/*
+ * GCE0 Hardware Event IDs
+ * Different SoCs will have varying numbers of hardware event signals,
+ * which are sent from the corresponding hardware to the GCE.
+ * Each hardware event signal corresponds to an event ID in the GCE.
+ * The CMDQ driver can use the following event ID definitions to allow
+ * the client driver to use wait and clear APIs provided by CMDQ, enabling
+ * the GCE to execute operations in the instructions for that event ID.
+ *
+ * The event IDs of GCE0 are mainly used by display hardware.
+ */
+/* CMDQ_EVENT_DISP0_STREAM_SOF0 ~ 15: 0 ~ 15 */
+#define CMDQ_EVENT_DISP0_STREAM_SOF(n)						(0 + (n))
+/* CMDQ_EVENT_DISP0_FRAME_DONE_SEL0 ~ 15: 16 ~ 31 */
+#define CMDQ_EVENT_DISP0_FRAME_DONE_SEL(n)					(16 + (n))
+#define CMDQ_EVENT_DISP0_DISP_WDMA0_TARGET_LINE_END_ENG_EVENT			32
+#define CMDQ_EVENT_DISP0_DISP_WDMA0_SW_RST_DONE_ENG_EVENT			33
+#define CMDQ_EVENT_DISP0_DISP_POSTMASK1_RST_DONE_ENG_EVENT			34
+#define CMDQ_EVENT_DISP0_DISP_POSTMASK0_RST_DONE_ENG_EVENT			35
+#define CMDQ_EVENT_DISP0_DISP_MUTEX0_TIMEOUT_ENG_EVENT				36
+/* CMDQ_EVENT_DISP0_DISP_MUTEX0_REG_UPDATE_ENG_EVENT0 ~ 15: 37 ~ 52 */
+#define CMDQ_EVENT_DISP0_DISP_MUTEX0_REG_UPDATE_ENG_EVENT(n)			(37 + (n))
+#define CMDQ_EVENT_DISP0_DISP_MUTEX0_GET_RELEASE_ENG_EVENT			53
+#define CMDQ_EVENT_DISP0_DISP_MDP_RDMA0_SW_RST_DONE_ENG_EVENT			54
+/* CMDQ_EVENT_DISP1_STREAM_SOF0 ~ 15: 55 ~ 70 */
+#define CMDQ_EVENT_DISP1_STREAM_SOF(n)						(55 + (n))
+/* CMDQ_EVENT_DISP1_FRAME_DONE_SEL0 ~ 15: 71 ~ 86 */
+#define CMDQ_EVENT_DISP1_FRAME_DONE_SEL(n)					(71 + (n))
+/* CMDQ_EVENT_DISP1_STREAM_DONE_ENG_EVENT0 ~ 15: 87 ~ 102 */
+#define CMDQ_EVENT_DISP1_STREAM_DONE_ENG_EVENT(n)				(87 + (n))
+/* CMDQ_EVENT_DISP1_REG_UPDATE_DONE_ENG_EVENT0 ~ 15: 103 ~ 118 */
+#define CMDQ_EVENT_DISP1_REG_UPDATE_DONE_ENG_EVENT(n)				(103 + (n))
+#define CMDQ_EVENT_DISP1_OCIP_SUBSYS_SRAM_ISOINT_ENG_EVENT			119
+#define CMDQ_EVENT_DISP1_DISP_WDMA4_TARGET_LINE_END_ENG_EVENT			120
+#define CMDQ_EVENT_DISP1_DISP_WDMA4_SW_RST_DONE_ENG_EVENT			121
+#define CMDQ_EVENT_DISP1_DISP_WDMA3_TARGET_LINE_END_ENG_EVENT			122
+#define CMDQ_EVENT_DISP1_DISP_WDMA3_SW_RST_DONE_ENG_EVENT			123
+#define CMDQ_EVENT_DISP1_DISP_WDMA2_TARGET_LINE_END_ENG_EVENT			124
+#define CMDQ_EVENT_DISP1_DISP_WDMA2_SW_RST_DONE_ENG_EVENT			125
+#define CMDQ_EVENT_DISP1_DISP_WDMA1_TARGET_LINE_END_ENG_EVENT			126
+#define CMDQ_EVENT_DISP1_DISP_WDMA1_SW_RST_DONE_ENG_EVENT			127
+#define CMDQ_EVENT_DISP1_DISP_MUTEX0_TIMEOUT_ENG_EVENT				128
+#define CMDQ_EVENT_DISP1_DISP_MUTEX0_GET_RLZ_ENG_EVENT				129
+#define CMDQ_EVENT_DISP1_DISP_MDP_RDMA1_SW_RST_DONE_ENG_EVENT			130
+#define CMDQ_EVENT_DISP1_DISP_GDMA0_SW_RST_DONE_ENG_EVENT			131
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VSYNC_START_ENG_EVENT		132
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VSYNC_END_ENG_EVENT		133
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VRR_VFP_LAST_SAFE_BLANK_ENG_EVENT	134
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VFP_START_ENG_EVENT		135
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VFP_LAST_LINE_ENG_EVENT		136
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_VDE_END_ENG_EVENT			137
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_TRIGGER_LOOP_CLR_ENG_EVENT	138
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_TARGET_LINE1_ENG_EVENT		139
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_INT_TG_TARGET_LINE0_ENG_EVENT		140
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_EXT_TG_VSYNC_START_ENG_EVENT		141
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_EXT_TG_VSYNC_END_ENG_EVENT		142
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_EXT_TG_VDE_START_ENG_EVENT		143
+#define CMDQ_EVENT_DISP1_DISP_DVO0_DVO_EXT_TG_VDE_END_ENG_EVENT			144
+/* CMDQ_EVENT_DISP1_DISP_DSI2_ENG_EVENT0 ~ 10: 145 ~ 155 */
+#define CMDQ_EVENT_DISP1_DISP_DSI2_ENG_EVENT(n)					(145 + (n))
+/* CMDQ_EVENT_DISP1_DISP_DSI1_ENG_EVENT0 ~ 21: 156 ~ 177 */
+#define CMDQ_EVENT_DISP1_DISP_DSI1_ENG_EVENT(n)					(156 + (n))
+/* CMDQ_EVENT_DISP1_DISP_DSI0_ENG_EVENT0 ~ 10: 178 ~ 188 */
+#define CMDQ_EVENT_DISP1_DISP_DSI0_ENG_EVENT(n)					(178 + (n))
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF1_VSYNC_START_ENG_EVENT			189
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF1_VSYNC_END_ENG_EVENT			190
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF1_VDE_START_ENG_EVENT			191
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF1_VDE_END_ENG_EVENT			192
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF1_TARGET_LINE_ENG_EVENT			193
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF0_VSYNC_START_ENG_EVENT			194
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF0_VSYNC_END_ENG_EVENT			195
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF0_VDE_START_ENG_EVENT			196
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF0_VDE_END_ENG_EVENT			197
+#define CMDQ_EVENT_DISP1_DISP_DP_INTF0_TARGET_LINE_ENG_EVENT			198
+/* CMDQ_EVENT_DISP1_BUF_UNDERRUN_ENG_EVENT0 ~ 10: 199 ~ 209 */
+#define CMDQ_EVENT_DISP1_BUF_UNDERRUN_ENG_EVENT(n)				(199 + (n))
+/* CMDQ_EVENT_MML0_STREAM_SOF0 ~ 15: 210 ~ 225 */
+#define CMDQ_EVENT_MML0_STREAM_SOF(n)						(210 + (n))
+/* CMDQ_EVENT_MML0_FRAME_DONE_SEL0 ~ 15: 226 ~ 241 */
+#define CMDQ_EVENT_MML0_FRAME_DONE_SEL(n)					(226 + (n))
+/* CMDQ_EVENT_MML0_REG_UPDATE_DONE_ENG_EVENT0 ~ 15: 242 ~ 257 */
+#define CMDQ_EVENT_MML0_REG_UPDATE_DONE_ENG_EVENT(n)				(242 + (n))
+#define CMDQ_EVENT_MML0_MDP_WROT2_SW_RST_DONE_ENG_EVENT				258
+#define CMDQ_EVENT_MML0_MDP_WROT1_SW_RST_DONE_ENG_EVENT				259
+#define CMDQ_EVENT_MML0_MDP_WROT0_SW_RST_DONE_ENG_EVENT				260
+#define CMDQ_EVENT_MML0_MDP_RROT0_SW_RST_DONE_ENG_EVENT				261
+#define CMDQ_EVENT_MML0_MDP_RDMA2_SW_RST_DONE_ENG_EVENT				262
+#define CMDQ_EVENT_MML0_MDP_RDMA1_SW_RST_DONE_ENG_EVENT				263
+#define CMDQ_EVENT_MML0_MDP_RDMA0_SW_RST_DONE_ENG_EVENT				264
+#define CMDQ_EVENT_MML0_MDP_MERGE0_SW_RST_DONE_ENG_EVENT			265
+#define CMDQ_EVENT_MML0_DISP_MUTEX0_TIMEOUT_ENG_EVENT				266
+#define CMDQ_EVENT_MML0_DISP_MUTEX0_GET_RLZ_ENG_EVENT				267
+/* CMDQ_EVENT_MML1_STREAM_SOF0 ~ 15: 268 ~ 283 */
+#define CMDQ_EVENT_MML1_STREAM_SOF(n)						(268 + (n))
+/* CMDQ_EVENT_MML1_FRAME_DONE_SEL0 ~ 15: 284 ~ 299 */
+#define CMDQ_EVENT_MML1_FRAME_DONE_SEL(n)					(284 + (n))
+/* CMDQ_EVENT_MML1_REG_UPDATE_DONE_ENG_EVENT0 ~ 15: 300 ~ 315 */
+#define CMDQ_EVENT_MML1_REG_UPDATE_DONE_ENG_EVENT0				(300 + (n))
+#define CMDQ_EVENT_MML1_MDP_WROT2_SW_RST_DONE_ENG_EVENT				316
+#define CMDQ_EVENT_MML1_MDP_WROT1_SW_RST_DONE_ENG_EVENT				317
+#define CMDQ_EVENT_MML1_MDP_WROT0_SW_RST_DONE_ENG_EVENT				318
+#define CMDQ_EVENT_MML1_MDP_RROT0_SW_RST_DONE_ENG_EVENT				319
+#define CMDQ_EVENT_MML1_MDP_RDMA2_SW_RST_DONE_ENG_EVENT				320
+#define CMDQ_EVENT_MML1_MDP_RDMA1_SW_RST_DONE_ENG_EVENT				321
+#define CMDQ_EVENT_MML1_MDP_RDMA0_SW_RST_DONE_ENG_EVENT				322
+#define CMDQ_EVENT_MML1_MDP_MERGE0_SW_RST_DONE_ENG_EVENT			323
+#define CMDQ_EVENT_MML1_DISP_MUTEX0_TIMEOUT_ENG_EVENT				324
+#define CMDQ_EVENT_MML1_DISP_MUTEX0_GET_RLZ_ENG_EVENT				325
+/* CMDQ_EVENT_OVL0_STREAM_SOF0 ~ 15: 326 ~ 341 */
+#define CMDQ_EVENT_OVL0_STREAM_SOF(n)						(326 + (n))
+/* CMDQ_EVENT_OVL0_FRAME_DONE_SEL0 ~ 15: 342 ~ 357 */
+#define CMDQ_EVENT_OVL0_FRAME_DONE_SEL(n)					(342 + (n))
+#define CMDQ_EVENT_OVL0_OVL_UFBC_WDMA0_TARGET_LINE_END_ENG_EVENT		358
+#define CMDQ_EVENT_OVL0_OVL_MUTEX0_TIMEOUT_ENG_EVENT				359
+/* CMDQ_EVENT_OVL0_OVL_MUTEX0_REG_UPDATE_DONE_ENG_EVENT0 ~ 15: 360 ~ 375 */
+#define CMDQ_EVENT_OVL0_OVL_MUTEX0_REG_UPDATE_DONE_ENG_EVENT(n)			(360 + (n))
+#define CMDQ_EVENT_OVL0_OVL_MUTEX0_GET_RELEASE_ENG_EVENT			376
+#define CMDQ_EVENT_OVL0_OVL_MDP_RDMA1_SW_RST_DONE_ENG_EVENT			377
+#define CMDQ_EVENT_OVL0_OVL_MDP_RDMA0_SW_RST_DONE_ENG_EVENT			378
+#define CMDQ_EVENT_OVL0_OVL_EXDMA9_FRAME_RESET_DONE_ENG_EVENT			379
+#define CMDQ_EVENT_OVL0_OVL_EXDMA8_FRAME_RESET_DONE_ENG_EVENT			380
+#define CMDQ_EVENT_OVL0_OVL_EXDMA7_FRAME_RESET_DONE_ENG_EVENT			381
+#define CMDQ_EVENT_OVL0_OVL_EXDMA6_FRAME_RESET_DONE_ENG_EVENT			382
+#define CMDQ_EVENT_OVL0_OVL_EXDMA5_FRAME_RESET_DONE_ENG_EVENT			383
+#define CMDQ_EVENT_OVL0_OVL_EXDMA4_FRAME_RESET_DONE_ENG_EVENT			384
+#define CMDQ_EVENT_OVL0_OVL_EXDMA3_FRAME_RESET_DONE_ENG_EVENT			385
+#define CMDQ_EVENT_OVL0_OVL_EXDMA2_FRAME_RESET_DONE_ENG_EVENT			386
+#define CMDQ_EVENT_OVL0_OVL_EXDMA1_FRAME_RESET_DONE_ENG_EVENT			387
+#define CMDQ_EVENT_OVL0_OVL_EXDMA0_FRAME_RESET_DONE_ENG_EVENT			388
+#define CMDQ_EVENT_OVL0_OVL_DISP_WDMA1_TARGET_LINE_END_ENG_EVENT		389
+#define CMDQ_EVENT_OVL0_OVL_DISP_WDMA1_SW_RST_DONE_END_ENG_EVENT		390
+#define CMDQ_EVENT_OVL0_OVL_DISP_WDMA0_TARGET_LINE_END_ENG_EVENT		391
+#define CMDQ_EVENT_OVL0_OVL_DISP_WDMA0_SW_RST_DONE_END_ENG_EVENT		392
+#define CMDQ_EVENT_OVL0_OVL_BWM0_FRAME_RESET_DONE_ENG_EVENT			393
+/* CMDQ_EVENT_OVL1_STREAM_SOF0 ~ 15: 394 ~ 409 */
+#define CMDQ_EVENT_OVL1_STREAM_SOF(n)						(394 + (n))
+/* CMDQ_EVENT_OVL1_FRAME_DONE_SEL0 ~ 15: 410 ~ 425 */
+#define CMDQ_EVENT_OVL1_FRAME_DONE_SEL(n)					(410 + (n))
+#define CMDQ_EVENT_OVL1_OVL_UFBC_WDMA0_TARGET_LINE_END_ENG_EVENT		426
+#define CMDQ_EVENT_OVL1_OVL_MUTEX0_TIMEOUT_ENG_EVENT				427
+/* CMDQ_EVENT_OVL1_OVL_MUTEX0_REG_UPDATE_DONE_ENG_EVENT0 ~ 15: 428 ~ 443 */
+#define CMDQ_EVENT_OVL1_OVL_MUTEX0_REG_UPDATE_DONE_ENG_EVENT(n)			(428 + (n))
+#define CMDQ_EVENT_OVL1_OVL_MUTEX0_GET_RELEASE_ENG_EVENT			444
+#define CMDQ_EVENT_OVL1_OVL_MDP_RDMA1_SW_RST_DONE_ENG_EVENT			445
+#define CMDQ_EVENT_OVL1_OVL_MDP_RDMA0_SW_RST_DONE_ENG_EVENT			446
+#define CMDQ_EVENT_OVL1_OVL_EXDMA9_FRAME_RESET_DONE_ENG_EVENT			447
+#define CMDQ_EVENT_OVL1_OVL_EXDMA8_FRAME_RESET_DONE_ENG_EVENT			448
+#define CMDQ_EVENT_OVL1_OVL_EXDMA7_FRAME_RESET_DONE_ENG_EVENT			449
+#define CMDQ_EVENT_OVL1_OVL_EXDMA6_FRAME_RESET_DONE_ENG_EVENT			450
+#define CMDQ_EVENT_OVL1_OVL_EXDMA5_FRAME_RESET_DONE_ENG_EVENT			451
+#define CMDQ_EVENT_OVL1_OVL_EXDMA4_FRAME_RESET_DONE_ENG_EVENT			452
+#define CMDQ_EVENT_OVL1_OVL_EXDMA3_FRAME_RESET_DONE_ENG_EVENT			453
+#define CMDQ_EVENT_OVL1_OVL_EXDMA2_FRAME_RESET_DONE_ENG_EVENT			454
+#define CMDQ_EVENT_OVL1_OVL_EXDMA1_FRAME_RESET_DONE_ENG_EVENT			455
+#define CMDQ_EVENT_OVL1_OVL_EXDMA0_FRAME_RESET_DONE_ENG_EVENT			456
+#define CMDQ_EVENT_OVL1_OVL_DISP_WDMA1_TARGET_LINE_END_ENG_EVENT		457
+#define CMDQ_EVENT_OVL1_OVL_DISP_WDMA1_SW_RST_DONE_END_ENG_EVENT		458
+#define CMDQ_EVENT_OVL1_OVL_DISP_WDMA0_TARGET_LINE_END_ENG_EVENT		459
+#define CMDQ_EVENT_OVL1_OVL_DISP_WDMA0_SW_RST_DONE_END_ENG_EVENT		460
+#define CMDQ_EVENT_OVL1_OVL_BWM0_FRAME_RESET_DONE_ENG_EVENT			461
+#define CMDQ_EVENT_DPC_DT_DONE0							462
+#define CMDQ_EVENT_DPC_DT_DONE1							463
+#define CMDQ_EVENT_DPC_DT_DONE2_0_MERGE						464
+#define CMDQ_EVENT_DPC_DT_DONE2_1_MERGE						465
+#define CMDQ_EVENT_DPC_DT_DONE2_2_MERGE						466
+#define CMDQ_EVENT_DPC_DT_DONE2_3_MERGE						467
+#define CMDQ_EVENT_DPC_DT_DONE3							468
+#define CMDQ_EVENT_DPC_DT_DONE4_MERGE						469
+#define CMDQ_EVENT_DPC_DT_DONE5							470
+#define CMDQ_EVENT_DPC_DT_DONE6_0_MERGE						471
+#define CMDQ_EVENT_DPC_DT_DONE6_1_MERGE						472
+#define CMDQ_EVENT_DPC_DT_DONE6_2_MERGE						473
+#define CMDQ_EVENT_DPC_DT_DONE6_3_MERGE						474
+#define CMDQ_EVENT_DPC_DT_DONE7							475
+#define CMDQ_EVENT_DPC_DT_DONE32_MERGE						476
+#define CMDQ_EVENT_DPC_DT_DONE33						477
+#define CMDQ_EVENT_DPC_DT_DONE34_0						478
+#define CMDQ_EVENT_DPC_DT_DONE35						479
+#define CMDQ_EVENT_DPC_DISP_SSYS_DT_ERR_ON_BEFORE_OFF				480
+#define CMDQ_EVENT_DPC_DISP_SSYS_DT_ERR_PRETE_BEFORE_ON				481
+#define CMDQ_EVENT_DPC_DISP_DVFS_DT_ERR_ON_BEFORE_OFF				482
+#define CMDQ_EVENT_DPC_DISP_DVFS_DT_ERR_PRETE_BEFORE_ON				483
+#define CMDQ_EVENT_DPC_DISP_SB_DT_ERR_ON_BEFORE_OFF				484
+#define CMDQ_EVENT_DPC_DISP_SB_DT_ERR_PRETE_BEFORE_ON				485
+#define CMDQ_EVENT_DPC_DISP_SW_CONFIG_WHEN_MTCMOS_OFF				486
+#define CMDQ_EVENT_DPC_MML_SSYS_DT_ERR_ON_BEFORE_OFF				487
+#define CMDQ_EVENT_DPC_MML_SSYS_DT_ERR_PRETE_BEFORE_ON				488
+#define CMDQ_EVENT_DPC_MML_DVFS_DT_ERR_ON_BEFORE_OFF				489
+#define CMDQ_EVENT_DPC_MML_DVFS_DT_ERR_PRETE_BEFORE_ON				490
+#define CMDQ_EVENT_DPC_MML_SB_DT_ERR_ON_BEFORE_OFF				491
+#define CMDQ_EVENT_DPC_MML_SB_DT_ERR_PRETE_BEFORE_ON				492
+#define CMDQ_EVENT_DPC_MML_SW_CONFIG_WHEN_MTCMOS_OFF				493
+/* CMDQ_EVENT_DPTX_DPTX_EVENT0 ~ 3: 494 ~ 497 */
+#define CMDQ_EVENT_DPTX_DPTX_EVENT(n)						(494 + (n))
+/* CMDQ_EVENT_EDPTX_EDPTX_EVENT0 ~ 1: 498 ~ 499 */
+#define CMDQ_EVENT_EDPTX_EDPTX_EVENT(n)						(498 + (n))
+
+#define CMDQ_EVENT_DSI0_TE_I_DSI0_TE_I						898
+#define CMDQ_EVENT_DSI1_TE_I_DSI1_TE_I						899
+#define CMDQ_EVENT_DSI2_TE_I_DSI2_TE_I						900
+/* CMDQ_EVENT_POWEREVENT_GCE_EVENT_SUBSYS_PWR_ACK0 ~ 23: 901 ~ 924 */
+#define CMDQ_EVENT_POWEREVENT_GCE_EVENT_SUBSYS_PWR_ACK(n)			(901 + (n))
+/* CMDQ_EVENT_GCE_EVENT_DPTX_GCE_EVENT_DPTX0 ~ 1: 925 ~ 926 */
+#define CMDQ_EVENT_GCE_EVENT_DPTX_GCE_EVENT_DPTX(n)				(925 + (n))
+/* CMDQ_EVENT_GCE_EVENT_DPTX_P1_GCE_EVENT_DPTX_P10 ~ 1: 927 ~ 928 */
+#define CMDQ_EVENT_GCE_EVENT_DPTX_P1_GCE_EVENT_DPTX_P1(n)			(927 + (n))
+/* CMDQ_EVENT_GCE_EVENT_EDPTX_GCE_EVENT_EDPTX0 ~ 1: 929 ~ 930 */
+#define CMDQ_EVENT_GCE_EVENT_EDPTX_GCE_EVENT_EDPTX(n)				(929 + (n))
+#define CMDQ_EVENT_DSI3_TE_I_DSI3_TE_I						931
+#define CMDQ_EVENT_SPI0_FINISH_EVENT_DSI4_TE_I					932
+#define CMDQ_EVENT_SPI0_EVENT_EVENT_DSI5_TE_I					933
+
+/*
+ * GCE1 Hardware Event IDs
+ * Different SoCs will have varying numbers of hardware event signals,
+ * which are sent from the corresponding hardware to the GCE.
+ * Each hardware event signal corresponds to an event ID in the GCE.
+ * The CMDQ driver can use the following event ID definitions to allow
+ * the client driver to use wait and clear APIs provided by CMDQ, enabling
+ * the GCE to execute operations in the instructions for that event ID.
+ *
+ * The event IDs of GCE1 are mainly used by non-display hardware.
+ */
+#define CMDQ_EVENT_VENC3_VENC_RESERVED						0
+#define CMDQ_EVENT_VENC3_VENC_FRAME_DONE					1
+#define CMDQ_EVENT_VENC3_VENC_PAUSE_DONE					2
+#define CMDQ_EVENT_VENC3_JPGENC_DONE						3
+#define CMDQ_EVENT_VENC3_VENC_MB_DONE						4
+#define CMDQ_EVENT_VENC3_VENC_128BYTE_DONE					5
+#define CMDQ_EVENT_VENC3_JPGDEC_DONE						6
+#define CMDQ_EVENT_VENC3_JPGDEC_C1_DONE						7
+#define CMDQ_EVENT_VENC3_JPGDEC_INSUFF_DONE					8
+#define CMDQ_EVENT_VENC3_JPGDEC_C1_INSUFF_DONE					9
+#define CMDQ_EVENT_VENC3_WP_2ND_STAGE_DONE					10
+#define CMDQ_EVENT_VENC3_WP_3RD_STAGE_DONE					11
+#define CMDQ_EVENT_VENC3_PPS_HEADER_DONE					12
+#define CMDQ_EVENT_VENC3_SPS_HEADER_DONE					13
+#define CMDQ_EVENT_VENC3_VPS_HEADER_DONE					14
+#define CMDQ_EVENT_VENC3_VENC_SLICE_DONE					15
+#define CMDQ_EVENT_VENC3_VENC_SOC_SLICE_DONE					16
+#define CMDQ_EVENT_VENC3_VENC_SOC_FRAME_DONE					17
+
+#define CMDQ_EVENT_VENC2_VENC_FRAME_DONE					33
+#define CMDQ_EVENT_VENC2_VENC_PAUSE_DONE					34
+#define CMDQ_EVENT_VENC2_JPGENC_DONE						35
+#define CMDQ_EVENT_VENC2_VENC_MB_DONE						36
+#define CMDQ_EVENT_VENC2_VENC_128BYTE_DONE					37
+#define CMDQ_EVENT_VENC2_JPGDEC_DONE						38
+#define CMDQ_EVENT_VENC2_JPGDEC_C1_DONE						39
+#define CMDQ_EVENT_VENC2_JPGDEC_INSUFF_DONE					40
+#define CMDQ_EVENT_VENC2_JPGDEC_C1_INSUFF_DONE					41
+#define CMDQ_EVENT_VENC2_WP_2ND_STAGE_DONE					42
+#define CMDQ_EVENT_VENC2_WP_3RD_STAGE_DONE					43
+#define CMDQ_EVENT_VENC2_PPS_HEADER_DONE					44
+#define CMDQ_EVENT_VENC2_SPS_HEADER_DONE					45
+#define CMDQ_EVENT_VENC2_VPS_HEADER_DONE					46
+#define CMDQ_EVENT_VENC2_VENC_SLICE_DONE					47
+#define CMDQ_EVENT_VENC2_VENC_SOC_SLICE_DONE					48
+#define CMDQ_EVENT_VENC2_VENC_SOC_FRAME_DONE					49
+
+#define CMDQ_EVENT_VENC1_VENC_FRAME_DONE					65
+#define CMDQ_EVENT_VENC1_VENC_PAUSE_DONE					66
+#define CMDQ_EVENT_VENC1_JPGENC_DONE						67
+#define CMDQ_EVENT_VENC1_VENC_MB_DONE						68
+#define CMDQ_EVENT_VENC1_VENC_128BYTE_DONE					69
+#define CMDQ_EVENT_VENC1_JPGDEC_DONE						70
+#define CMDQ_EVENT_VENC1_JPGDEC_C1_DONE						71
+#define CMDQ_EVENT_VENC1_JPGDEC_INSUFF_DONE					72
+#define CMDQ_EVENT_VENC1_JPGDEC_C1_INSUFF_DONE					73
+#define CMDQ_EVENT_VENC1_WP_2ND_STAGE_DONE					74
+#define CMDQ_EVENT_VENC1_WP_3RD_STAGE_DONE					75
+#define CMDQ_EVENT_VENC1_PPS_HEADER_DONE					76
+#define CMDQ_EVENT_VENC1_SPS_HEADER_DONE					77
+#define CMDQ_EVENT_VENC1_VPS_HEADER_DONE					78
+#define CMDQ_EVENT_VENC1_VENC_SLICE_DONE					79
+#define CMDQ_EVENT_VENC1_VENC_SOC_SLICE_DONE					80
+#define CMDQ_EVENT_VENC1_VENC_SOC_FRAME_DONE					81
+
+#define CMDQ_EVENT_VDEC1_VDEC_LINE_CNT_INT					192
+#define CMDQ_EVENT_VDEC1_VDEC_INT						193
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_2						194
+#define CMDQ_EVENT_VDEC1_VDEC_DEC_ERR						195
+#define CMDQ_EVENT_VDEC1_VDEC_BUSY_OVERFLOW					196
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_5						197
+#define CMDQ_EVENT_VDEC1_VDEC_INI_FETCH_RDY					198
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_7						199
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_8						200
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_9						201
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_10						202
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_11						203
+
+#define CMDQ_EVENT_VDEC1_VDEC_GCE_CNT_OP_THR					207
+
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_32						224
+#define CMDQ_EVENT_VDEC1_VDEC_LAT_INT						225
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_34						226
+#define CMDQ_EVENT_VDEC1_VDEC_LAT_DEC_ERR					227
+#define CMDQ_EVENT_VDEC1_VDEC_LAT_BUSY_OVERFLOW					228
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_37						229
+#define CMDQ_EVENT_VDEC1_VDEC_LAT_INI_FETCH_RDY					230
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_39						231
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_40						232
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_41						233
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_42						234
+#define CMDQ_EVENT_VDEC1_VDEC1_EVENT_43						235
+
+#define CMDQ_EVENT_VDEC1_VDEC_LAT_GCE_CNT_OP_THR				239
+
+#define CMDQ_EVENT_IMG_IMG_EVENT_0						256
+/* CMDQ_EVENT_IMG_TRAW0_CQ_THR_DONE_TRAW0_0 ~ 5: 257 ~  262 */
+#define CMDQ_EVENT_IMG_TRAW0_CQ_THR_DONE_TRAW0(n)				(257 + (n))
+#define CMDQ_EVENT_IMG_TRAW0_DMA_ERR_EVENT					263
+#define CMDQ_EVENT_IMG_TRAW0_DUMMY_0						264
+/* CMDQ_EVENT_IMG_TRAW1_CQ_THR_DONE_TRAW0_0 ~ 5: 265 ~ 270 */
+#define CMDQ_EVENT_IMG_TRAW1_CQ_THR_DONE_TRAW0(n)				(265 + (n))
+#define CMDQ_EVENT_IMG_TRAW1_DMA_ERR_EVENT					271
+#define CMDQ_EVENT_IMG_ADL_TILE_DONE_EVENT					272
+#define CMDQ_EVENT_IMG_ADLWR0_TILE_DONE_EVENT					273
+#define CMDQ_EVENT_IMG_ADLWR1_TILE_DONE_EVENT					274
+#define CMDQ_EVENT_IMG_IMGSYS_IPE_ME_DONE					275
+#define CMDQ_EVENT_IMG_IMGSYS_IPE_MMG_DONE					276
+/* CMDQ_EVENT_IMG_QOF_ACK_EVENT0 ~ 19: 277 ~ 296 */
+#define CMDQ_EVENT_IMG_QOF_ACK_EVENT(n)						(277 + (n))
+/* CMDQ_EVENT_IMG_QOF_ON_EVENT0 ~ 4: 297 ~ 301 */
+#define CMDQ_EVENT_IMG_QOF_ON_EVENT(n)						(297 + (n))
+/* CMDQ_EVENT_IMG_QOF_OFF_EVENT0 ~ 4: 302 ~ 306 */
+#define CMDQ_EVENT_IMG_QOF_OFF_EVENT(n)						(302 + (n))
+/* CMDQ_EVENT_IMG_QOF_SAVE_EVENT0 ~ 4: 307 ~ 311 */
+#define CMDQ_EVENT_IMG_QOF_SAVE_EVENT(n)					(307 + (n))
+/* CMDQ_EVENT_IMG_QOF_RESTORE_EVENT0 ~ 4: 312 ~ 316 */
+#define CMDQ_EVENT_IMG_QOF_RESTORE_EVENT(n)					(312 + (n))
+/* CMDQ_EVENT_IMG_DIP_CQ_THR_DONE_P20~5: 317 ~ 322 */
+#define CMDQ_EVENT_IMG_DIP_CQ_THR_DONE_P2(n)					(317 + (n))
+#define CMDQ_EVENT_IMG_DIP_DMA_ERR_EVENT					323
+#define CMDQ_EVENT_IMG_DIP_NR_DMA_ERR_EVENT					324
+#define CMDQ_EVENT_IMG_DIP_DUMMY_0						325
+#define CMDQ_EVENT_IMG_WPE_EIS_GCE_FRAME_DONE					326
+#define CMDQ_EVENT_IMG_WPE_EIS_DONE_SYNC_OUT					327
+/* CMDQ_EVENT_IMG_WPE_EIS_CQ_THR_DONE_P20 ~ 5: 328 ~ 333 */
+#define CMDQ_EVENT_IMG_WPE_EIS_CQ_THR_DONE_P2(n)				(328 + (n))
+/* CMDQ_EVENT_IMG_PQDIP_A_CQ_THR_DONE_P20 ~ 5: 334 ~ 339 */
+#define CMDQ_EVENT_IMG_PQDIP_A_CQ_THR_DONE_P2(n)				(334 + (n))
+#define CMDQ_EVENT_IMG_PQA_DMA_ERR_EVENT					340
+/* CMDQ_EVENT_IMG_WPE0_DUMMY0~2: 341 ~ 343 */
+#define CMDQ_EVENT_IMG_WPE0_DUMMY(n)						(341 + (n))
+#define CMDQ_EVENT_IMG_OMC_TNR_GCE_FRAME_DONE					344
+#define CMDQ_EVENT_IMG_OMC_TNR_DONE_SYNC_OUT					345
+/* CMDQ_EVENT_IMG_OMC_TNR_CQ_THR_DONE_P20 ~ 5: 346 ~ 351 */
+#define CMDQ_EVENT_IMG_OMC_TNR_CQ_THR_DONE_P2(n)				(346 + (n))
+/* CMDQ_EVENT_IMG_PQDIP_B_CQ_THR_DONE_P20 ~ 5: 352 ~ 357 */
+#define CMDQ_EVENT_IMG_PQDIP_B_CQ_THR_DONE_P2(n)				(352 + (n))
+#define CMDQ_EVENT_IMG_PQB_DMA_ERR_EVENT					358
+/* CMDQ_EVENT_IMG_WPE1_DUMMY0 ~ 2: 359 ~ 361 */
+#define CMDQ_EVENT_IMG_WPE1_DUMMY(n)						(359 + (n))
+#define CMDQ_EVENT_IMG_WPE_LITE_GCE_FRAME_DONE					362
+#define CMDQ_EVENT_IMG_WPE_LITE_DONE_SYNC_OUT					363
+/* CMDQ_EVENT_IMG_WPE_LITE_CQ_THR_DONE_P20 ~ 5: 364 ~ 369 */
+#define CMDQ_EVENT_IMG_WPE_LITE_CQ_THR_DONE_P2(n)				(364 + (n))
+#define CMDQ_EVENT_IMG_OMC_LITE_GCE_FRAME_DONE					370
+#define CMDQ_EVENT_IMG_OMC_LITE_DONE_SYNC_OUT					371
+/* CMDQ_EVENT_IMG_OMC_LITE_CQ_THR_DONE_P20 ~ 5: 372 ~ 377 */
+#define CMDQ_EVENT_IMG_OMC_LITE_CQ_THR_DONE_P2(n)				(372 + (n))
+/* CMDQ_EVENT_IMG_WPE2_DUMMY0 ~ 2: 378 ~ 380 */
+#define CMDQ_EVENT_IMG_WPE2_DUMMY(n)						(378 + (n))
+#define CMDQ_EVENT_IMG_IMGSYS_IPE_FDVT0_DONE					381
+#define CMDQ_EVENT_IMG_IMG_EVENT_126						382
+#define CMDQ_EVENT_IMG_IMG_EVENT_127						383
+#define CMDQ_EVENT_CAM_CAM_EVENT_0						384
+#define CMDQ_EVENT_CAM_CAM_SUBA_SW_PASS1_DONE					385
+#define CMDQ_EVENT_CAM_CAM_SUBB_SW_PASS1_DONE					386
+#define CMDQ_EVENT_CAM_CAM_SUBC_SW_PASS1_DONE					387
+#define CMDQ_EVENT_CAM_CAM_SUBA_TFMR_PASS1_DONE					388
+#define CMDQ_EVENT_CAM_CAM_SUBB_TFMR_PASS1_DONE					389
+#define CMDQ_EVENT_CAM_CAM_SUBC_TFMR_PASS1_DONE					390
+/* CMDQ_EVENT_CAM_CAMSV_A_SW_PASS1_DONE0 ~ 3: 391 ~ 394 */
+#define CMDQ_EVENT_CAM_CAMSV_A_SW_PASS1_DONE(n)					(391 + (n))
+/* CMDQ_EVENT_CAM_CAMSV_B_SW_PASS1_DONE0 ~ 3: 395 ~ 398 */
+#define CMDQ_EVENT_CAM_CAMSV_B_SW_PASS1_DONE(n)					(395 + (n))
+/* CMDQ_EVENT_CAM_CAMSV_C_SW_PASS1_DONE0 ~ 3: 399 + 402 */
+#define CMDQ_EVENT_CAM_CAMSV_C_SW_PASS1_DONE(n)					(399 + (n))
+/* CMDQ_EVENT_CAM_CAMSV_D_SW_PASS1_DONE0 ~ 3: 403 ~ 406 */
+#define CMDQ_EVENT_CAM_CAMSV_D_SW_PASS1_DONE(n)					(403 + (n))
+/* CMDQ_EVENT_CAM_CAMSV_E_SW_PASS1_DONE0 ~ 3: 407 ~ 409 */
+#define CMDQ_EVENT_CAM_CAMSV_E_SW_PASS1_DONE(n)					(407 + (n))
+/* CMDQ_EVENT_CAM_CAMSV_F_SW_PASS1_DONE0 ~ 3: 411 ~ 413 */
+#define CMDQ_EVENT_CAM_CAMSV_F_SW_PASS1_DONE(n)					(411 + (n))
+#define CMDQ_EVENT_CAM_MRAW0_SW_PASS1_DONE					415
+#define CMDQ_EVENT_CAM_MRAW1_SW_PASS1_DONE					416
+#define CMDQ_EVENT_CAM_MRAW2_SW_PASS1_DONE					417
+#define CMDQ_EVENT_CAM_MRAW3_SW_PASS1_DONE					418
+#define CMDQ_EVENT_CAM_UISP_SW_PASS1_DONE					419
+#define CMDQ_EVENT_CAM_TG_MRAW0_OUT_SOF						420
+#define CMDQ_EVENT_CAM_TG_MRAW1_OUT_SOF						421
+#define CMDQ_EVENT_CAM_TG_MRAW2_OUT_SOF						422
+#define CMDQ_EVENT_CAM_TG_MRAW3_OUT_SOF						423
+#define CMDQ_EVENT_CAM_PDA0_IRQO_EVENT_DONE_D1					424
+#define CMDQ_EVENT_CAM_PDA1_IRQO_EVENT_DONE_D1					425
+#define CMDQ_EVENT_CAM_DPE_DVP_CMQ_EVENT					426
+#define CMDQ_EVENT_CAM_DPE_DVS_CMQ_EVENT					427
+#define CMDQ_EVENT_CAM_DPE_DVFG_CMQ_EVENT					428
+#define CMDQ_EVENT_CAM_CAM_EVENT_45						429
+#define CMDQ_EVENT_CAM_CAM_EVENT_46						430
+#define CMDQ_EVENT_CAM_CAM_EVENT_47						431
+#define CMDQ_EVENT_CAM_CAM_EVENT_48						432
+/* CMDQ_EVENT_CAM_CAM_SUBA_TG_INT1 ~ 4: 433 ~ 436 */
+#define CMDQ_EVENT_CAM_CAM_SUBA_TG_INT(n)					(433 + (n) - 1)
+/* CMDQ_EVENT_CAM_CAM_SUBB_TG_INT1 ~ 4: 437 ~ 440 */
+#define CMDQ_EVENT_CAM_CAM_SUBB_TG_INT(n)					(437 + (n) - 1)
+/* CMDQ_EVENT_CAM_CAM_SUBC_TG_INT1 ~ 4: 441 ~ 444 */
+#define CMDQ_EVENT_CAM_CAM_SUBC_TG_INT(n)					(441 + (n) - 1)
+#define CMDQ_EVENT_CAM_RAW_O_SOF_SUBA						445
+#define CMDQ_EVENT_CAM_RAW_O_SOF_SUBB						446
+#define CMDQ_EVENT_CAM_RAW_O_SOF_SUBC						447
+#define CMDQ_EVENT_CAM_TFMR_RAW_O_SOF_SUBA					448
+#define CMDQ_EVENT_CAM_TFMR_RAW_O_SOF_SUBB					449
+#define CMDQ_EVENT_CAM_TFMR_RAW_O_SOF_SUBC					450
+#define CMDQ_EVENT_CAM_RAW_SEL_SOF_UISP						451
+#define CMDQ_EVENT_CAM_CAM_SUBA_RING_BUFFER_OVERFLOW_INT_IN			452
+#define CMDQ_EVENT_CAM_CAM_SUBB_RING_BUFFER_OVERFLOW_INT_IN			453
+#define CMDQ_EVENT_CAM_CAM_SUBC_RING_BUFFER_OVERFLOW_INT_IN			454
+#define CMDQ_EVENT_CAM_CAM_EVENT_71						455
+#define CMDQ_EVENT_CAM_ADL_WR_FRAME_DONE					456
+#define CMDQ_EVENT_CAM_ADL_RD_FRAME_DONE					457
+#define CMDQ_EVENT_CAM_QOF_RAWA_POWER_ON_EVENT					458
+#define CMDQ_EVENT_CAM_QOF_RAWB_POWER_ON_EVENT					459
+#define CMDQ_EVENT_CAM_QOF_RAWC_POWER_ON_EVENT					460
+#define CMDQ_EVENT_CAM_QOF_RAWA_POWER_OFF_EVENT					461
+#define CMDQ_EVENT_CAM_QOF_RAWB_POWER_OFF_EVENT					462
+#define CMDQ_EVENT_CAM_QOF_RAWC_POWER_OFF_EVENT					463
+#define CMDQ_EVENT_CAM_QOF_RAWA_SAVE_EVENT					464
+#define CMDQ_EVENT_CAM_QOF_RAWB_SAVE_EVENT					465
+#define CMDQ_EVENT_CAM_QOF_RAWC_SAVE_EVENT					466
+#define CMDQ_EVENT_CAM_QOF_RAWA_RESTORE_EVENT					467
+#define CMDQ_EVENT_CAM_QOF_RAWB_RESTORE_EVENT					468
+#define CMDQ_EVENT_CAM_QOF_RAWC_RESTORE_EVENT					469
+/* CMDQ_EVENT_CAM_QOF_CAM_EVENT0 ~ 11: 470 ~ 481 */
+#define CMDQ_EVENT_CAM_QOF_CAM_EVENT(n)						(470 + (n))
+/* CMDQ_EVENT_CAM_SENINF_CFG_DONE_EVENT0 ~ 11: 482 ~ 495 */
+#define CMDQ_EVENT_CAM_SENINF_CFG_DONE_EVENT(n)					(482 + (n))
+#define CMDQ_EVENT_CAM_CCU0_TO_GCE_NON_SEC_IRQ					496
+#define CMDQ_EVENT_CAM_CCU0_TO_GCE_SEC_IRQ					497
+#define CMDQ_EVENT_CAM_CCU0_TO_GCE_VM_IRQ					498
+#define CMDQ_EVENT_CAM_CCU0_TO_GCE_EXCH_VM_IRQ					499
+#define CMDQ_EVENT_CAM_CCU1_TO_GCE_NON_SEC_IRQ					500
+#define CMDQ_EVENT_CAM_CCU1_TO_GCE_SEC_IRQ					501
+#define CMDQ_EVENT_CAM_CCU1_TO_GCE_VM_IRQ					502
+#define CMDQ_EVENT_CAM_CCU1_TO_GCE_EXCH_VM_IRQ					503
+/* CMDQ_EVENT_CAM_I2C_CH2_EVENT0 ~ 4: 504 ~ 509 */
+#define CMDQ_EVENT_CAM_I2C_CH2_EVENT(n)						(504 + (n))
+#define CMDQ_EVENT_CAM_CAM_EVENT_125						509
+#define CMDQ_EVENT_CAM_CAM_EVENT_126						510
+#define CMDQ_EVENT_CAM_CAM_EVENT_127						511
+
+#define CMDQ_EVENT_SMI_EVENT_MMINFRA_SMI_MMSRAM_COMM_SMIASSER			898
+#define CMDQ_EVENT_SMI_EVENT_MMINFRA_SMI_MDP_COMM_SMIASSER			899
+#define CMDQ_EVENT_SMI_EVENT_MMINFRA_SMI_DISP_COMM_SMIASSER			900
+
+/*
+ * GCE Software Tokens
+ * Apart from the event IDs that are already bound to hardware event signals,
+ * the remaining event IDs can be used as software tokens.
+ * This allows the client driver to name and operate them independently,
+ * and their usage is the same as that of hardware events.
+ */
+/* Begin of GCE0 software token */
+/* Config thread notify trigger thread */
+#define CMDQ_SYNC_TOKEN_CONFIG_DIRTY			640
+/* Trigger thread notify config thread */
+#define CMDQ_SYNC_TOKEN_STREAM_EOF			641
+/* Block Trigger thread until the ESD check finishes */
+#define CMDQ_SYNC_TOKEN_ESD_EOF				642
+#define CMDQ_SYNC_TOKEN_STREAM_BLOCK			643
+/* Check CABC setup finish */
+#define CMDQ_SYNC_TOKEN_CABC_EOF			644
+/* VFP period token for Msync */
+#define CMDQ_SYNC_TOKEN_VFP_PERIOD			645
+/* Software sync token for dual display */
+#define CMDQ_SYNC_TOKEN_CONFIG_DIRTY_1			694
+#define CMDQ_SYNC_TOKEN_STREAM_EOF_1			695
+#define CMDQ_SYNC_TOKEN_ESD_EOF_1			696
+#define CMDQ_SYNC_TOKEN_STREAM_BLOCK_1			697
+#define CMDQ_SYNC_TOKEN_CABC_EOF_1			698
+
+/*
+ * GPR access tokens (for HW register backup)
+ * There are 15 32-bit GPR, form 3 GPR as a set
+ * (64-bit for address, 32-bit for value)
+ *
+ * CMDQ_SYNC_TOKEN_GPR_SET0 ~ 4: 700 ~ 704
+ */
+#define CMDQ_SYNC_TOKEN_GPR_SET(n)			(700 + (n))
+#define CMDQ_SYNC_TOKEN_TE_0				705
+#define CMDQ_SYNC_TOKEN_PREFETCH_TE_0			706
+#define CMDQ_SYNC_TOKEN_VIDLE_POWER_ON			707
+#define CMDQ_SYNC_TOKEN_CHECK_TRIGGER_MERGE		708
+
+/* Resource lock event to control resource in GCE thread */
+#define CMDQ_SYNC_RESOURCE_WROT0			710
+#define CMDQ_SYNC_RESOURCE_WROT1			711
+/* Hardware TRACE software token */
+#define CMDQ_SYNC_TOKEN_HW_TRACE_WAIT			712
+#define CMDQ_SYNC_TOKEN_HW_TRACE_LOCK			713
+/* Software sync token for dual display */
+#define CMDQ_SYNC_TOKEN_CONFIG_DIRTY_3			714
+#define CMDQ_SYNC_TOKEN_STREAM_EOF_3			715
+#define CMDQ_SYNC_TOKEN_ESD_EOF_3			716
+#define CMDQ_SYNC_TOKEN_STREAM_BLOCK_3			717
+#define CMDQ_SYNC_TOKEN_CABC_EOF_3			718
+/* End of GCE0 software token */
+
+/* Begin of GCE1 software token */
+/* CMDQ_SYNC_TOKEN_IMGSYS_POOL0 ~ 300: 512 ~ 812 */
+#define	CMDQ_SYNC_TOKEN_IMGSYS_POOL(n)			(512 + (n))
+/* ISP software token */
+#define CMDQ_SYNC_TOKEN_IMGSYS_WPE_EIS			813
+#define CMDQ_SYNC_TOKEN_IMGSYS_OMC_TNR			814
+#define CMDQ_SYNC_TOKEN_IMGSYS_WPE_LITE			815
+#define CMDQ_SYNC_TOKEN_IMGSYS_TRAW			816
+#define CMDQ_SYNC_TOKEN_IMGSYS_LTRAW			817
+#define CMDQ_SYNC_TOKEN_IMGSYS_XTRAW			818
+#define CMDQ_SYNC_TOKEN_IMGSYS_DIP			819
+#define CMDQ_SYNC_TOKEN_IMGSYS_PQDIP_A			820
+#define CMDQ_SYNC_TOKEN_IMGSYS_PQDIP_B			821
+#define CMDQ_SYNC_TOKEN_IPESYS_ME			822
+#define CMDQ_SYNC_TOKEN_APUSYS_APU			823
+#define CMDQ_SYNC_TOKEN_IMGSYS_VSS_TRAW			824
+#define CMDQ_SYNC_TOKEN_IMGSYS_VSS_LTRAW		825
+#define CMDQ_SYNC_TOKEN_IMGSYS_VSS_XTRAW		826
+#define CMDQ_SYNC_TOKEN_IMGSYS_VSS_DIP			827
+#define CMDQ_SYNC_TOKEN_IMGSYS_OMC_LITE			828
+/* IMG software token for QoS */
+#define CMDQ_SYNC_TOKEN_IMGSYS_QOS_LOCK			829
+/* IMG software token for Qof */
+#define CMDQ_SYNC_TOKEN_DIP_POWER_CTRL			830
+#define CMDQ_SYNC_TOKEN_DIP_TRIG_PWR_ON			831
+#define CMDQ_SYNC_TOKEN_DIP_PWR_ON			832
+#define CMDQ_SYNC_TOKEN_DIP_TRIG_PWR_OFF		833
+#define CMDQ_SYNC_TOKEN_DIP_PWR_OFF			834
+#define CMDQ_SYNC_TOKEN_DIP_PWR_HAND_SHAKE		835
+#define CMDQ_SYNC_TOKEN_TRAW_POWER_CTRL			836
+#define CMDQ_SYNC_TOKEN_TRAW_TRIG_PWR_ON		837
+#define CMDQ_SYNC_TOKEN_TRAW_PWR_ON			838
+#define CMDQ_SYNC_TOKEN_TRAW_TRIG_PWR_OFF		839
+#define CMDQ_SYNC_TOKEN_TRAW_PWR_OFF			840
+#define CMDQ_SYNC_TOKEN_TRAW_PWR_HAND_SHAKE		841
+/* End of GCE1 software token */
+
+/* Begin of common software token */
+/*
+ * Notify normal CMDQ there are some secure task done
+ * MUST NOT CHANGE, this token sync with secure world
+ */
+#define CMDQ_SYNC_SECURE_THR_EOF			940
+/* CMDQ use software token */
+#define CMDQ_SYNC_TOKEN_USER_0				941
+#define CMDQ_SYNC_TOKEN_USER_1				942
+#define CMDQ_SYNC_TOKEN_POLL_MONITOR			943
+#define CMDQ_SYNC_TOKEN_TPR_LOCK			942
+/* TZMP software token */
+#define CMDQ_SYNC_TOKEN_TZMP_DISP_WAIT			943
+#define CMDQ_SYNC_TOKEN_TZMP_DISP_SET			944
+#define CMDQ_SYNC_TOKEN_TZMP_ISP_WAIT			945
+#define CMDQ_SYNC_TOKEN_TZMP_ISP_SET			946
+#define CMDQ_SYNC_TOKEN_TZMP_AIE_WAIT			947
+#define CMDQ_SYNC_TOKEN_TZMP_AIE_SET			948
+#define CMDQ_SYNC_TOKEN_TZMP_ADL_WAIT			949
+#define CMDQ_SYNC_TOKEN_TZMP_ADL_SET			950
+/* PREBUILT software token */
+#define CMDQ_SYNC_TOKEN_PREBUILT_MDP_LOCK		951
+#define CMDQ_SYNC_TOKEN_PREBUILT_MML_LOCK		952
+#define CMDQ_SYNC_TOKEN_PREBUILT_VFMT_LOCK		953
+#define CMDQ_SYNC_TOKEN_PREBUILT_DISP_LOCK		954
+#define CMDQ_SYNC_TOKEN_DISP_VA_START			955
+#define CMDQ_SYNC_TOKEN_DISP_VA_END			956
+
+/*
+ * Event for GPR timer, used in sleep and poll with timeout
+ *
+ * CMDQ_TOKEN_GPR_TIMER_R0~15: 994 ~ 1009
+ */
+#define CMDQ_TOKEN_GPR_TIMER_R(n)			(994 + (n))
+/* End of common software token */
+
+#endif
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 2/8] arm64: dts: mediatek: Add GCE header for MT8196 Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-02-18  9:25   ` CK Hu (胡俊光)
  2025-03-04  9:32   ` AngeloGioacchino Del Regno
  2025-02-18  5:41 ` [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware Jason-JH Lin
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

MT8196 has 3 new hardware configuration compared with the previous SoC,
which correspond to the 3 new driver data:

1. mminfra_offset: For GCE data plane control
   Since GCE has been moved into mminfra, GCE needs to append the
   mminfra offset to the DRAM address when accessing the DRAM.

2. gce_vm: For GCE hardware virtualization
   Currently, the first version of the mt8196 mailbox controller only
   requires setting the VM-related registers to enable the permissions
   of a host VM.

3. dma_mask_bit: For dma address bit control
   In order to avoid the hardware limitations of MT8196 accessing DRAM,
   GCE needs to configure the DMA address to be less than 35 bits.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 drivers/mailbox/mtk-cmdq-mailbox.c       | 90 +++++++++++++++++++++---
 include/linux/mailbox/mtk-cmdq-mailbox.h |  2 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
index d186865b8dce..0abe10a7fef9 100644
--- a/drivers/mailbox/mtk-cmdq-mailbox.c
+++ b/drivers/mailbox/mtk-cmdq-mailbox.c
@@ -43,6 +43,17 @@
 #define GCE_CTRL_BY_SW				GENMASK(2, 0)
 #define GCE_DDR_EN				GENMASK(18, 16)
 
+#define GCE_VM_ID_MAP0			0x5018
+#define GCE_VM_MAP0_ALL_HOST			GENMASK(29, 0)
+#define GCE_VM_ID_MAP1			0x501c
+#define GCE_VM_MAP1_ALL_HOST			GENMASK(29, 0)
+#define GCE_VM_ID_MAP2			0x5020
+#define GCE_VM_MAP2_ALL_HOST			GENMASK(29, 0)
+#define GCE_VM_ID_MAP3			0x5024
+#define GCE_VM_MAP3_ALL_HOST			GENMASK(5, 0)
+#define GCE_VM_CPR_GSIZE		0x50c4
+#define GCE_VM_CPR_GSIZE_HSOT			GENMASK(3, 0)
+
 #define CMDQ_THR_ACTIVE_SLOT_CYCLES	0x3200
 #define CMDQ_THR_ENABLED		0x1
 #define CMDQ_THR_DISABLED		0x0
@@ -87,11 +98,24 @@ struct cmdq {
 struct gce_plat {
 	u32 thread_nr;
 	u8 shift;
+	dma_addr_t mminfra_offset;
 	bool control_by_sw;
 	bool sw_ddr_en;
+	bool gce_vm;
+	u32 dma_mask_bit;
 	u32 gce_num;
 };
 
+static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const struct gce_plat *pdata)
+{
+	return ((addr + pdata->mminfra_offset) >> pdata->shift);
+}
+
+static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const struct gce_plat *pdata)
+{
+	return ((addr << pdata->shift) - pdata->mminfra_offset);
+}
+
 static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool enable)
 {
 	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
@@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan *chan)
 }
 EXPORT_SYMBOL(cmdq_get_shift_pa);
 
+dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
+{
+	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
+
+	return cmdq->pdata->mminfra_offset;
+}
+EXPORT_SYMBOL(cmdq_get_offset_pa);
+
+bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr)
+{
+	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
+
+	if (cmdq->pdata->mminfra_offset == 0)
+		return false;
+
+	/*
+	 * mminfra will recognize the addr that greater than the mminfra_offset
+	 * as a transaction to DRAM.
+	 * So the caller needs to append mminfra_offset for the true case.
+	 */
+	return (addr >= cmdq->pdata->mminfra_offset);
+}
+EXPORT_SYMBOL(cmdq_addr_need_offset);
+
 static int cmdq_thread_suspend(struct cmdq *cmdq, struct cmdq_thread *thread)
 {
 	u32 status;
@@ -143,6 +191,17 @@ static void cmdq_init(struct cmdq *cmdq)
 	u32 gctl_regval = 0;
 
 	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
+
+	if (cmdq->pdata->gce_vm) {
+		/* config cpr size for host vm */
+		writel(GCE_VM_CPR_GSIZE_HSOT, cmdq->base + GCE_VM_CPR_GSIZE);
+		/* config CPR_GSIZE before setting VM_ID_MAP to avoid data leakage */
+		writel(GCE_VM_MAP0_ALL_HOST, cmdq->base + GCE_VM_ID_MAP0);
+		writel(GCE_VM_MAP1_ALL_HOST, cmdq->base + GCE_VM_ID_MAP1);
+		writel(GCE_VM_MAP2_ALL_HOST, cmdq->base + GCE_VM_ID_MAP2);
+		writel(GCE_VM_MAP3_ALL_HOST, cmdq->base + GCE_VM_ID_MAP3);
+	}
+
 	if (cmdq->pdata->control_by_sw)
 		gctl_regval = GCE_CTRL_BY_SW;
 	if (cmdq->pdata->sw_ddr_en)
@@ -199,7 +258,7 @@ static void cmdq_task_insert_into_thread(struct cmdq_task *task)
 				prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
 	prev_task_base[CMDQ_NUM_CMD(prev_task->pkt) - 1] =
 		(u64)CMDQ_JUMP_BY_PA << 32 |
-		(task->pa_base >> task->cmdq->pdata->shift);
+		cmdq_reg_shift_addr(task->pa_base, task->cmdq->pdata);
 	dma_sync_single_for_device(dev, prev_task->pa_base,
 				   prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
 
@@ -264,7 +323,7 @@ static void cmdq_thread_irq_handler(struct cmdq *cmdq,
 	else
 		return;
 
-	curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) << cmdq->pdata->shift;
+	curr_pa = cmdq_reg_shift_addr(readl(thread->base + CMDQ_THR_CURR_ADDR), cmdq->pdata);
 
 	list_for_each_entry_safe(task, tmp, &thread->task_busy_list,
 				 list_entry) {
@@ -416,9 +475,9 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
 		 */
 		WARN_ON(cmdq_thread_reset(cmdq, thread) < 0);
 
-		writel(task->pa_base >> cmdq->pdata->shift,
+		writel(cmdq_reg_shift_addr(task->pa_base, cmdq->pdata),
 		       thread->base + CMDQ_THR_CURR_ADDR);
-		writel((task->pa_base + pkt->cmd_buf_size) >> cmdq->pdata->shift,
+		writel(cmdq_reg_shift_addr(task->pa_base + pkt->cmd_buf_size, cmdq->pdata),
 		       thread->base + CMDQ_THR_END_ADDR);
 
 		writel(thread->priority, thread->base + CMDQ_THR_PRIORITY);
@@ -426,10 +485,10 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
 		writel(CMDQ_THR_ENABLED, thread->base + CMDQ_THR_ENABLE_TASK);
 	} else {
 		WARN_ON(cmdq_thread_suspend(cmdq, thread) < 0);
-		curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) <<
-			cmdq->pdata->shift;
-		end_pa = readl(thread->base + CMDQ_THR_END_ADDR) <<
-			cmdq->pdata->shift;
+		curr_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_CURR_ADDR),
+					       cmdq->pdata);
+		end_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_END_ADDR),
+					      cmdq->pdata);
 		/* check boundary */
 		if (curr_pa == end_pa - CMDQ_INST_SIZE ||
 		    curr_pa == end_pa) {
@@ -663,6 +722,9 @@ static int cmdq_probe(struct platform_device *pdev)
 	if (err)
 		return err;
 
+	if (cmdq->pdata->dma_mask_bit)
+		dma_set_coherent_mask(dev, DMA_BIT_MASK(cmdq->pdata->dma_mask_bit));
+
 	cmdq->mbox.dev = dev;
 	cmdq->mbox.chans = devm_kcalloc(dev, cmdq->pdata->thread_nr,
 					sizeof(*cmdq->mbox.chans), GFP_KERNEL);
@@ -782,6 +844,17 @@ static const struct gce_plat gce_plat_mt8195 = {
 	.gce_num = 2
 };
 
+static const struct gce_plat gce_plat_mt8196 = {
+	.thread_nr = 32,
+	.shift = 3,
+	.mminfra_offset = 0x80000000, /* 2GB */
+	.control_by_sw = true,
+	.sw_ddr_en = true,
+	.gce_vm = true,
+	.dma_mask_bit = 35,
+	.gce_num = 2
+};
+
 static const struct of_device_id cmdq_of_ids[] = {
 	{.compatible = "mediatek,mt6779-gce", .data = (void *)&gce_plat_mt6779},
 	{.compatible = "mediatek,mt8173-gce", .data = (void *)&gce_plat_mt8173},
@@ -790,6 +863,7 @@ static const struct of_device_id cmdq_of_ids[] = {
 	{.compatible = "mediatek,mt8188-gce", .data = (void *)&gce_plat_mt8188},
 	{.compatible = "mediatek,mt8192-gce", .data = (void *)&gce_plat_mt8192},
 	{.compatible = "mediatek,mt8195-gce", .data = (void *)&gce_plat_mt8195},
+	{.compatible = "mediatek,mt8196-gce", .data = (void *)&gce_plat_mt8196},
 	{}
 };
 MODULE_DEVICE_TABLE(of, cmdq_of_ids);
diff --git a/include/linux/mailbox/mtk-cmdq-mailbox.h b/include/linux/mailbox/mtk-cmdq-mailbox.h
index a8f0070c7aa9..79398bf95f8d 100644
--- a/include/linux/mailbox/mtk-cmdq-mailbox.h
+++ b/include/linux/mailbox/mtk-cmdq-mailbox.h
@@ -79,5 +79,7 @@ struct cmdq_pkt {
 };
 
 u8 cmdq_get_shift_pa(struct mbox_chan *chan);
+dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan);
+bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr);
 
 #endif /* __MTK_CMDQ_MAILBOX_H__ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
                   ` (2 preceding siblings ...)
  2025-02-18  5:41 ` [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support " Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-03-04  9:35   ` AngeloGioacchino Del Regno
  2025-02-18  5:41 ` [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address Jason-JH Lin
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

When GCE executes instructions, the corresponding hardware register
can be found through the subsys ID. For hardware that does not support
subsys ID, its subsys ID will be set to invalid value and its physical
address needs to be used to generate GCE instructions.

This commit adds a pa_base parsing flow to the cmdq_client_reg structure
for these unsupported subsys ID hardware.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 drivers/soc/mediatek/mtk-cmdq-helper.c | 18 ++++++++++++++++--
 include/linux/soc/mediatek/mtk-cmdq.h  |  3 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c b/drivers/soc/mediatek/mtk-cmdq-helper.c
index 455221e8de24..aa9853100d78 100644
--- a/drivers/soc/mediatek/mtk-cmdq-helper.c
+++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
@@ -8,6 +8,7 @@
 #include <linux/module.h>
 #include <linux/mailbox_controller.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/soc/mediatek/mtk-cmdq.h>
 
 #define CMDQ_WRITE_ENABLE_MASK	BIT(0)
@@ -60,20 +61,30 @@ int cmdq_dev_get_client_reg(struct device *dev,
 			    struct cmdq_client_reg *client_reg, int idx)
 {
 	struct of_phandle_args spec;
+	struct resource res;
 	int err;
 
 	if (!client_reg)
 		return -ENOENT;
 
+	if (of_address_to_resource(dev->of_node, 0, &res) != 0) {
+		dev_err(dev, "Missing reg in %s node\n", dev->of_node->full_name);
+		return -EINVAL;
+	}
+	client_reg->pa_base = res.start;
+
 	err = of_parse_phandle_with_fixed_args(dev->of_node,
 					       "mediatek,gce-client-reg",
 					       3, idx, &spec);
 	if (err < 0) {
-		dev_warn(dev,
+		dev_dbg(dev,
 			"error %d can't parse gce-client-reg property (%d)",
 			err, idx);
 
-		return err;
+		/* make subsys invalid */
+		client_reg->subsys = CMDQ_SUBSYS_INVALID;
+
+		return 0;
 	}
 
 	client_reg->subsys = (u8)spec.args[0];
@@ -130,6 +141,9 @@ int cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt *pkt, size_t siz
 
 	pkt->buf_size = size;
 
+	/* need to use pkt->cl->chan later to call mbox APIs when generating instruction */
+	pkt->cl = (void *)client;
+
 	dev = client->chan->mbox->dev;
 	dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size,
 				  DMA_TO_DEVICE);
diff --git a/include/linux/soc/mediatek/mtk-cmdq.h b/include/linux/soc/mediatek/mtk-cmdq.h
index 0c3906e8ad19..3578cc9200e9 100644
--- a/include/linux/soc/mediatek/mtk-cmdq.h
+++ b/include/linux/soc/mediatek/mtk-cmdq.h
@@ -23,6 +23,8 @@
 #define CMDQ_THR_SPR_IDX2	(2)
 #define CMDQ_THR_SPR_IDX3	(3)
 
+#define CMDQ_SUBSYS_INVALID	(U8_MAX)
+
 struct cmdq_pkt;
 
 enum cmdq_logic_op {
@@ -52,6 +54,7 @@ struct cmdq_operand {
 
 struct cmdq_client_reg {
 	u8 subsys;
+	phys_addr_t pa_base;
 	u16 offset;
 	u16 size;
 };
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
                   ` (3 preceding siblings ...)
  2025-02-18  5:41 ` [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-03-04  9:37   ` AngeloGioacchino Del Regno
  2025-02-18  5:41 ` [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware Jason-JH Lin
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Since GCE has been moved to mminfra in MT8196, all transactions from
mminfra to DRAM will have their addresses adjusted by subtracting a
mminfra offset.
This information should be handled inside the CMDQ driver, allowing
CMDQ users to call CMDQ APIs as usual.

Therefore, CMDQ driver needs to use the mbox API to get the
mminfra_offset value of the SoC, and then add it to the DRAM address
when generating instructions to ensure GCE accesses the correct DRAM
address.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 drivers/soc/mediatek/mtk-cmdq-helper.c | 35 ++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c b/drivers/soc/mediatek/mtk-cmdq-helper.c
index aa9853100d78..f2853a74af01 100644
--- a/drivers/soc/mediatek/mtk-cmdq-helper.c
+++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
@@ -314,10 +314,22 @@ EXPORT_SYMBOL(cmdq_pkt_write_s_mask_value);
 
 int cmdq_pkt_mem_move(struct cmdq_pkt *pkt, dma_addr_t src_addr, dma_addr_t dst_addr)
 {
+	struct cmdq_client *cl = (struct cmdq_client *)pkt->cl;
 	const u16 high_addr_reg_idx  = CMDQ_THR_SPR_IDX0;
 	const u16 value_reg_idx = CMDQ_THR_SPR_IDX1;
 	int ret;
 
+	if (!cl) {
+		pr_err("%s %d: pkt->cl is NULL!\n", __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	if (cmdq_addr_need_offset(cl->chan, src_addr))
+		src_addr += cmdq_get_offset_pa(cl->chan);
+
+	if (cmdq_addr_need_offset(cl->chan, dst_addr))
+		dst_addr += cmdq_get_offset_pa(cl->chan);
+
 	/* read the value of src_addr into high_addr_reg_idx */
 	ret = cmdq_pkt_assign(pkt, high_addr_reg_idx, CMDQ_ADDR_HIGH(src_addr));
 	if (ret < 0)
@@ -428,10 +440,19 @@ EXPORT_SYMBOL(cmdq_pkt_poll_mask);
 
 int cmdq_pkt_poll_addr(struct cmdq_pkt *pkt, dma_addr_t addr, u32 value, u32 mask)
 {
+	struct cmdq_client *cl = (struct cmdq_client *)pkt->cl;
 	struct cmdq_instruction inst = { {0} };
 	u8 use_mask = 0;
 	int ret;
 
+	if (!cl) {
+		pr_err("%s %d: pkt->cl is NULL!\n", __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	if (cmdq_addr_need_offset(cl->chan, addr))
+		addr += cmdq_get_offset_pa(cl->chan);
+
 	/*
 	 * Append an MASK instruction to set the mask for following POLL instruction
 	 * which enables use_mask bit.
@@ -509,11 +530,21 @@ EXPORT_SYMBOL(cmdq_pkt_assign);
 
 int cmdq_pkt_jump_abs(struct cmdq_pkt *pkt, dma_addr_t addr, u8 shift_pa)
 {
+	struct cmdq_client *cl = (struct cmdq_client *)pkt->cl;
 	struct cmdq_instruction inst = {
 		.op = CMDQ_CODE_JUMP,
-		.offset = CMDQ_JUMP_ABSOLUTE,
-		.value = addr >> shift_pa
+		.offset = CMDQ_JUMP_ABSOLUTE
 	};
+
+	if (!cl) {
+		pr_err("%s %d: pkt->cl is NULL!\n", __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	if (cmdq_addr_need_offset(cl->chan, addr))
+		addr += cmdq_get_offset_pa(cl->chan);
+
+	inst.value = addr >> shift_pa;
 	return cmdq_pkt_append_command(pkt, inst);
 }
 EXPORT_SYMBOL(cmdq_pkt_jump_abs);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
                   ` (4 preceding siblings ...)
  2025-02-18  5:41 ` [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-03-04  9:41   ` AngeloGioacchino Del Regno
  2025-02-18  5:41 ` [PATCH v4 7/8] drm/mediatek: " Jason-JH Lin
  2025-02-18  5:41 ` [PATCH v4 8/8] media: mediatek: mdp3: " Jason-JH Lin
  7 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

To support hardware without subsys IDs on new SoCs, add a programming
flow that checks whether the subsys ID is valid. If the subsys ID is
invalid, the flow will call 2 alternative CMDQ APIs:
cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
functionality.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 drivers/soc/mediatek/mtk-mmsys.c | 14 +++++++++++---
 drivers/soc/mediatek/mtk-mutex.c | 11 +++++++++--
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/soc/mediatek/mtk-mmsys.c b/drivers/soc/mediatek/mtk-mmsys.c
index bb4639ca0b8c..ce949b863b05 100644
--- a/drivers/soc/mediatek/mtk-mmsys.c
+++ b/drivers/soc/mediatek/mtk-mmsys.c
@@ -167,9 +167,17 @@ static void mtk_mmsys_update_bits(struct mtk_mmsys *mmsys, u32 offset, u32 mask,
 	u32 tmp;
 
 	if (mmsys->cmdq_base.size && cmdq_pkt) {
-		ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys->cmdq_base.subsys,
-					  mmsys->cmdq_base.offset + offset, val,
-					  mask);
+		offset += mmsys->cmdq_base.offset;
+		if (mmsys->cmdq_base.subsys != CMDQ_SUBSYS_INVALID) {
+			ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys->cmdq_base.subsys,
+						  offset, val, mask);
+		} else {
+			/* only MMIO access, no need to check mminfro_offset */
+			ret = cmdq_pkt_assign(cmdq_pkt, CMDQ_THR_SPR_IDX0,
+					      CMDQ_ADDR_HIGH(mmsys->cmdq_base.pa_base));
+			ret |= cmdq_pkt_write_s_mask_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
+							   CMDQ_ADDR_LOW(offset), val, mask);
+		}
 		if (ret)
 			pr_debug("CMDQ unavailable: using CPU write\n");
 		else
diff --git a/drivers/soc/mediatek/mtk-mutex.c b/drivers/soc/mediatek/mtk-mutex.c
index 5250c1d702eb..3788b16efbf4 100644
--- a/drivers/soc/mediatek/mtk-mutex.c
+++ b/drivers/soc/mediatek/mtk-mutex.c
@@ -963,6 +963,7 @@ int mtk_mutex_enable_by_cmdq(struct mtk_mutex *mutex, void *pkt)
 	struct mtk_mutex_ctx *mtx = container_of(mutex, struct mtk_mutex_ctx,
 						 mutex[mutex->id]);
 	struct cmdq_pkt *cmdq_pkt = (struct cmdq_pkt *)pkt;
+	dma_addr_t en_addr = mtx->addr + DISP_REG_MUTEX_EN(mutex->id);
 
 	WARN_ON(&mtx->mutex[mutex->id] != mutex);
 
@@ -971,8 +972,14 @@ int mtk_mutex_enable_by_cmdq(struct mtk_mutex *mutex, void *pkt)
 		return -ENODEV;
 	}
 
-	cmdq_pkt_write(cmdq_pkt, mtx->cmdq_reg.subsys,
-		       mtx->addr + DISP_REG_MUTEX_EN(mutex->id), 1);
+	if (mtx->cmdq_reg.subsys != CMDQ_SUBSYS_INVALID) {
+		cmdq_pkt_write(cmdq_pkt, mtx->cmdq_reg.subsys, en_addr, 1);
+	} else {
+		/* only MMIO access, no need to check mminfro_offset */
+		cmdq_pkt_assign(cmdq_pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_HIGH(en_addr));
+		cmdq_pkt_write_s_value(cmdq_pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_LOW(en_addr), 1);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(mtk_mutex_enable_by_cmdq);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 7/8] drm/mediatek: Add programming flow for unsupported subsys ID hardware
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
                   ` (5 preceding siblings ...)
  2025-02-18  5:41 ` [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  2025-03-06  9:47   ` AngeloGioacchino Del Regno
  2025-02-18  5:41 ` [PATCH v4 8/8] media: mediatek: mdp3: " Jason-JH Lin
  7 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media, CK Hu

To support hardware without subsys IDs on new SoCs, add a programming
flow that checks whether the subsys ID is valid. If the subsys ID is
invalid, the flow will call 2 alternative CMDQ APIs:
cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
functionality.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
---
 drivers/gpu/drm/mediatek/mtk_ddp_comp.c | 33 ++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
index edc6417639e6..219d67735a54 100644
--- a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
@@ -66,14 +66,37 @@ struct mtk_ddp_comp_dev {
 	struct cmdq_client_reg cmdq_reg;
 };
 
+#if IS_REACHABLE(CONFIG_MTK_CMDQ)
+static void mtk_ddp_write_cmdq_pkt(struct cmdq_pkt *cmdq_pkt, struct cmdq_client_reg *cmdq_reg,
+				   unsigned int offset, unsigned int value, unsigned int mask)
+{
+	offset += cmdq_reg->offset;
+
+	if (cmdq_reg->subsys != CMDQ_SUBSYS_INVALID) {
+		if (mask == GENMASK(31, 0))
+			cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys, offset, value);
+		else
+			cmdq_pkt_write_mask(cmdq_pkt, cmdq_reg->subsys, offset, value, mask);
+	} else {
+		/* only MMIO access, no need to check mminfro_offset */
+		cmdq_pkt_assign(cmdq_pkt, 0, CMDQ_ADDR_HIGH(cmdq_reg->pa_base));
+		if (mask == GENMASK(31, 0))
+			cmdq_pkt_write_s_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
+					       CMDQ_ADDR_LOW(offset), value);
+		else
+			cmdq_pkt_write_s_mask_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
+						    CMDQ_ADDR_LOW(offset), value, mask);
+	}
+}
+#endif
+
 void mtk_ddp_write(struct cmdq_pkt *cmdq_pkt, unsigned int value,
 		   struct cmdq_client_reg *cmdq_reg, void __iomem *regs,
 		   unsigned int offset)
 {
 #if IS_REACHABLE(CONFIG_MTK_CMDQ)
 	if (cmdq_pkt)
-		cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys,
-			       cmdq_reg->offset + offset, value);
+		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, GENMASK(31, 0));
 	else
 #endif
 		writel(value, regs + offset);
@@ -85,8 +108,7 @@ void mtk_ddp_write_relaxed(struct cmdq_pkt *cmdq_pkt, unsigned int value,
 {
 #if IS_REACHABLE(CONFIG_MTK_CMDQ)
 	if (cmdq_pkt)
-		cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys,
-			       cmdq_reg->offset + offset, value);
+		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, GENMASK(31, 0));
 	else
 #endif
 		writel_relaxed(value, regs + offset);
@@ -98,8 +120,7 @@ void mtk_ddp_write_mask(struct cmdq_pkt *cmdq_pkt, unsigned int value,
 {
 #if IS_REACHABLE(CONFIG_MTK_CMDQ)
 	if (cmdq_pkt) {
-		cmdq_pkt_write_mask(cmdq_pkt, cmdq_reg->subsys,
-				    cmdq_reg->offset + offset, value, mask);
+		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, mask);
 	} else {
 #endif
 		u32 tmp = readl(regs + offset);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 8/8] media: mediatek: mdp3: Add programming flow for unsupported subsys ID hardware
  2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
                   ` (6 preceding siblings ...)
  2025-02-18  5:41 ` [PATCH v4 7/8] drm/mediatek: " Jason-JH Lin
@ 2025-02-18  5:41 ` Jason-JH Lin
  7 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin @ 2025-02-18  5:41 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Jason-JH Lin, Nancy Lin, Singo Chang, Moudy Ho,
	Xavier Chang, Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

To support hardware without subsys IDs on new SoCs, add a programming
flow that checks whether the subsys ID is valid. If the subsys ID is
invalid, the flow will call 2 alternative CMDQ APIs:
cmdq_pkt_assign() and cmdq_pkt_write_s_mask_value() to achieve the
same functionality.

Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
---
 .../platform/mediatek/mdp3/mtk-mdp3-cmdq.c    | 18 ++++-
 .../platform/mediatek/mdp3/mtk-mdp3-comp.h    | 79 ++++++++++++++-----
 2 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-cmdq.c b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-cmdq.c
index e5ccf673e152..0ee3354963db 100644
--- a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-cmdq.c
+++ b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-cmdq.c
@@ -321,7 +321,14 @@ static int mdp_path_config_subfrm(struct mdp_cmdq_cmd *cmd,
 	/* Enable mux settings */
 	for (index = 0; index < ctrl->num_sets; index++) {
 		set = &ctrl->sets[index];
-		cmdq_pkt_write(&cmd->pkt, set->subsys_id, set->reg, set->value);
+		if (set->subsys_id != CMDQ_SUBSYS_INVALID) {
+			cmdq_pkt_write(&cmd->pkt, set->subsys_id, set->reg, set->value);
+		} else {
+			/* only MMIO access, no need to check mminfro_offset */
+			cmdq_pkt_assign(&cmd->pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_HIGH(set->reg));
+			cmdq_pkt_write_s_value(&cmd->pkt, CMDQ_THR_SPR_IDX0,
+					       CMDQ_ADDR_LOW(set->reg), set->value);
+		}
 	}
 	/* Config sub-frame information */
 	for (index = (num_comp - 1); index >= 0; index--) {
@@ -376,7 +383,14 @@ static int mdp_path_config_subfrm(struct mdp_cmdq_cmd *cmd,
 	/* Disable mux settings */
 	for (index = 0; index < ctrl->num_sets; index++) {
 		set = &ctrl->sets[index];
-		cmdq_pkt_write(&cmd->pkt, set->subsys_id, set->reg, 0);
+		if (set->subsys_id != CMDQ_SUBSYS_INVALID) {
+			cmdq_pkt_write(&cmd->pkt, set->subsys_id, set->reg, 0);
+		} else {
+			/* only MMIO access, no need to check mminfro_offset */
+			cmdq_pkt_assign(&cmd->pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_HIGH(set->reg));
+			cmdq_pkt_write_s_value(&cmd->pkt, CMDQ_THR_SPR_IDX0,
+					       CMDQ_ADDR_LOW(set->reg), 0);
+		}
 	}
 
 	return 0;
diff --git a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.h b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.h
index 681906c16419..e20f9d080db9 100644
--- a/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.h
+++ b/drivers/media/platform/mediatek/mdp3/mtk-mdp3-comp.h
@@ -9,17 +9,44 @@
 
 #include "mtk-mdp3-cmdq.h"
 
-#define MM_REG_WRITE_MASK(cmd, id, base, ofst, val, mask)	\
-do {								\
-	typeof(mask) (m) = (mask);				\
-	cmdq_pkt_write_mask(&((cmd)->pkt), id, (base) + (ofst),	\
-			    (val),				\
-		(((m) & (ofst##_MASK)) == (ofst##_MASK)) ?	\
-			(0xffffffff) : (m));			\
+#define MM_REG_WRITE_MASK(cmd, id, base, ofst, val, mask)		\
+do {									\
+	typeof(cmd) (_c) = (cmd);					\
+	typeof(id) (_i) = (id);						\
+	typeof(base) (_b) = (base);					\
+	typeof(ofst) (_o) = (ofst);					\
+	typeof(val) (_v) = (val);					\
+	typeof(mask) (_m) = (mask);					\
+	_m = ((_m & (ofst##_MASK)) == (ofst##_MASK)) ? 0xffffffff : _m;	\
+	if (_i != CMDQ_SUBSYS_INVALID) {				\
+		cmdq_pkt_write_mask(&_c->pkt, _i, _b + _o, _v, _m);	\
+	} else {							\
+		/* only MMIO access, no need to check mminfro_offset */	\
+		cmdq_pkt_assign(&_c->pkt, CMDQ_THR_SPR_IDX0,		\
+				CMDQ_ADDR_HIGH(_b));			\
+		cmdq_pkt_write_s_mask_value(&_c->pkt, CMDQ_THR_SPR_IDX0,\
+					    CMDQ_ADDR_LOW(_b + _o),	\
+					    _v, _m);			\
+	}								\
 } while (0)
 
-#define MM_REG_WRITE(cmd, id, base, ofst, val)			\
-	cmdq_pkt_write(&((cmd)->pkt), id, (base) + (ofst), (val))
+#define MM_REG_WRITE(cmd, id, base, ofst, val)				\
+do {									\
+	typeof(cmd) (_c) = (cmd);					\
+	typeof(id) (_i) = (id);						\
+	typeof(base) (_b) = (base);					\
+	typeof(ofst) (_o) = (ofst);					\
+	typeof(val) (_v) = (val);					\
+	if (_i != CMDQ_SUBSYS_INVALID) {				\
+		cmdq_pkt_write(&_c->pkt, _i, _b + _o, _v);		\
+	} else {							\
+		/* only MMIO access, no need to check mminfro_offset */	\
+		cmdq_pkt_assign(&_c->pkt, CMDQ_THR_SPR_IDX0,		\
+				CMDQ_ADDR_HIGH(_b));			\
+		cmdq_pkt_write_s_value(&_c->pkt, CMDQ_THR_SPR_IDX0,	\
+				       CMDQ_ADDR_LOW(_b + _o), _v);	\
+	}								\
+} while (0)
 
 #define MM_REG_WAIT(cmd, evt)					\
 do {								\
@@ -49,17 +76,33 @@ do {								\
 	cmdq_pkt_set_event(&((c)->pkt), (e));			\
 } while (0)
 
-#define MM_REG_POLL_MASK(cmd, id, base, ofst, val, _mask)	\
-do {								\
-	typeof(_mask) (_m) = (_mask);				\
-	cmdq_pkt_poll_mask(&((cmd)->pkt), id,			\
-		(base) + (ofst), (val),				\
-		(((_m) & (ofst##_MASK)) == (ofst##_MASK)) ?	\
-			(0xffffffff) : (_m));			\
+#define MM_REG_POLL_MASK(cmd, id, base, ofst, val, mask)		\
+do {									\
+	typeof(cmd) (_c) = (cmd);					\
+	typeof(id) (_i) = (id);						\
+	typeof(base) (_b) = (base);					\
+	typeof(ofst) (_o) = (ofst);					\
+	typeof(val) (_v) = (val);					\
+	typeof(mask) (_m) = (mask);					\
+	_m = ((_m & (ofst##_MASK)) == (ofst##_MASK)) ? 0xffffffff : _m;	\
+	if (_i != CMDQ_SUBSYS_INVALID)					\
+		cmdq_pkt_poll_mask(&_c->pkt, _i, _b + _o, _v, _m);	\
+	else /* POLL not support SPR, so use cmdq_pkt_poll_addr() */	\
+		cmdq_pkt_poll_addr(&_c->pkt, _b + _o, _v, _m);		\
 } while (0)
 
-#define MM_REG_POLL(cmd, id, base, ofst, val)			\
-	cmdq_pkt_poll(&((cmd)->pkt), id, (base) + (ofst), (val))
+#define MM_REG_POLL(cmd, id, base, ofst, val)				\
+do {									\
+	typeof(cmd) (_c) = (cmd);					\
+	typeof(id) (_i) = (id);						\
+	typeof(base) (_b) = (base);					\
+	typeof(ofst) (_o) = (ofst);					\
+	typeof(val) (_v) = (val);					\
+	if (_i != CMDQ_SUBSYS_INVALID)					\
+		cmdq_pkt_poll(&_c->pkt, _i, _b + _o, _v);		\
+	else /* POLL not support SPR, so use cmdq_pkt_poll_addr() */	\
+		cmdq_pkt_poll_addr(&_c->pkt, _b + _o, _v, 0xffffffff);	\
+} while (0)
 
 enum mtk_mdp_comp_id {
 	MDP_COMP_NONE = -1,	/* Invalid engine */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox
  2025-02-18  5:41 ` [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox Jason-JH Lin
@ 2025-02-18  8:12   ` Krzysztof Kozlowski
  0 siblings, 0 replies; 31+ messages in thread
From: Krzysztof Kozlowski @ 2025-02-18  8:12 UTC (permalink / raw)
  To: Jason-JH Lin
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jassi Brar,
	Chun-Kuang Hu, AngeloGioacchino Del Regno, Mauro Carvalho Chehab,
	Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

On Tue, Feb 18, 2025 at 01:41:46PM +0800, Jason-JH Lin wrote:
> Add the compatible name and iommus property for MT8196.
> 
> In MT8196, all command buffers allocated and used by the GCE device
> work with IOMMU.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> ---
>  .../devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml     | 4 ++++
>  1 file changed, 4 insertions(+)

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

Best regards,
Krzysztof



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-02-18  5:41 ` [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support " Jason-JH Lin
@ 2025-02-18  9:25   ` CK Hu (胡俊光)
  2025-02-22 10:44     ` Jason-JH Lin (林睿祥)
  2025-03-04  9:32   ` AngeloGioacchino Del Regno
  1 sibling, 1 reply; 31+ messages in thread
From: CK Hu (胡俊光) @ 2025-02-18  9:25 UTC (permalink / raw)
  To: chunkuang.hu@kernel.org, AngeloGioacchino Del Regno,
	robh@kernel.org, Jason-JH Lin (林睿祥),
	krzk+dt@kernel.org, jassisinghbrar@gmail.com, mchehab@kernel.org,
	conor+dt@kernel.org
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Tue, 2025-02-18 at 13:41 +0800, Jason-JH Lin wrote:
> MT8196 has 3 new hardware configuration compared with the previous SoC,
> which correspond to the 3 new driver data:
> 
> 1. mminfra_offset: For GCE data plane control
>    Since GCE has been moved into mminfra, GCE needs to append the
>    mminfra offset to the DRAM address when accessing the DRAM.

It seems that GCE has iova and mminfra would mapping the iova to physical address.
Maybe let GCE be a iommu device or add iommus property in device node, and use dma_map_xxx() to get iova of GCE.
iommus property point to mminfra device (maybe another name) and mminfra device would process the mapping of iova and physical address.

> 
> 2. gce_vm: For GCE hardware virtualization
>    Currently, the first version of the mt8196 mailbox controller only
>    requires setting the VM-related registers to enable the permissions
>    of a host VM.

What's this? I know this patch would not implement the full VM function,
but describe more about what this is. Why need to enable permission?

Regards,
CK

> 
> 3. dma_mask_bit: For dma address bit control
>    In order to avoid the hardware limitations of MT8196 accessing DRAM,
>    GCE needs to configure the DMA address to be less than 35 bits.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> ---
>  drivers/mailbox/mtk-cmdq-mailbox.c       | 90 +++++++++++++++++++++---
>  include/linux/mailbox/mtk-cmdq-mailbox.h |  2 +
>  2 files changed, 84 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
> index d186865b8dce..0abe10a7fef9 100644
> --- a/drivers/mailbox/mtk-cmdq-mailbox.c
> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> @@ -43,6 +43,17 @@
>  #define GCE_CTRL_BY_SW				GENMASK(2, 0)
>  #define GCE_DDR_EN				GENMASK(18, 16)
>  
> +#define GCE_VM_ID_MAP0			0x5018
> +#define GCE_VM_MAP0_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP1			0x501c
> +#define GCE_VM_MAP1_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP2			0x5020
> +#define GCE_VM_MAP2_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP3			0x5024
> +#define GCE_VM_MAP3_ALL_HOST			GENMASK(5, 0)
> +#define GCE_VM_CPR_GSIZE		0x50c4
> +#define GCE_VM_CPR_GSIZE_HSOT			GENMASK(3, 0)
> +
>  #define CMDQ_THR_ACTIVE_SLOT_CYCLES	0x3200
>  #define CMDQ_THR_ENABLED		0x1
>  #define CMDQ_THR_DISABLED		0x0
> @@ -87,11 +98,24 @@ struct cmdq {
>  struct gce_plat {
>  	u32 thread_nr;
>  	u8 shift;
> +	dma_addr_t mminfra_offset;
>  	bool control_by_sw;
>  	bool sw_ddr_en;
> +	bool gce_vm;
> +	u32 dma_mask_bit;
>  	u32 gce_num;
>  };
>  
> +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const struct gce_plat *pdata)
> +{
> +	return ((addr + pdata->mminfra_offset) >> pdata->shift);
> +}
> +
> +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const struct gce_plat *pdata)
> +{
> +	return ((addr << pdata->shift) - pdata->mminfra_offset);
> +}
> +
>  static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool enable)
>  {
>  	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
> @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan *chan)
>  }
>  EXPORT_SYMBOL(cmdq_get_shift_pa);
>  
> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
> +{
> +	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
> +
> +	return cmdq->pdata->mminfra_offset;
> +}
> +EXPORT_SYMBOL(cmdq_get_offset_pa);
> +
> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr)
> +{
> +	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
> +
> +	if (cmdq->pdata->mminfra_offset == 0)
> +		return false;
> +
> +	/*
> +	 * mminfra will recognize the addr that greater than the mminfra_offset
> +	 * as a transaction to DRAM.
> +	 * So the caller needs to append mminfra_offset for the true case.
> +	 */
> +	return (addr >= cmdq->pdata->mminfra_offset);
> +}
> +EXPORT_SYMBOL(cmdq_addr_need_offset);
> +
>  static int cmdq_thread_suspend(struct cmdq *cmdq, struct cmdq_thread *thread)
>  {
>  	u32 status;
> @@ -143,6 +191,17 @@ static void cmdq_init(struct cmdq *cmdq)
>  	u32 gctl_regval = 0;
>  
>  	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
> +
> +	if (cmdq->pdata->gce_vm) {
> +		/* config cpr size for host vm */
> +		writel(GCE_VM_CPR_GSIZE_HSOT, cmdq->base + GCE_VM_CPR_GSIZE);
> +		/* config CPR_GSIZE before setting VM_ID_MAP to avoid data leakage */
> +		writel(GCE_VM_MAP0_ALL_HOST, cmdq->base + GCE_VM_ID_MAP0);
> +		writel(GCE_VM_MAP1_ALL_HOST, cmdq->base + GCE_VM_ID_MAP1);
> +		writel(GCE_VM_MAP2_ALL_HOST, cmdq->base + GCE_VM_ID_MAP2);
> +		writel(GCE_VM_MAP3_ALL_HOST, cmdq->base + GCE_VM_ID_MAP3);
> +	}
> +
>  	if (cmdq->pdata->control_by_sw)
>  		gctl_regval = GCE_CTRL_BY_SW;
>  	if (cmdq->pdata->sw_ddr_en)
> @@ -199,7 +258,7 @@ static void cmdq_task_insert_into_thread(struct cmdq_task *task)
>  				prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
>  	prev_task_base[CMDQ_NUM_CMD(prev_task->pkt) - 1] =
>  		(u64)CMDQ_JUMP_BY_PA << 32 |
> -		(task->pa_base >> task->cmdq->pdata->shift);
> +		cmdq_reg_shift_addr(task->pa_base, task->cmdq->pdata);
>  	dma_sync_single_for_device(dev, prev_task->pa_base,
>  				   prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
>  
> @@ -264,7 +323,7 @@ static void cmdq_thread_irq_handler(struct cmdq *cmdq,
>  	else
>  		return;
>  
> -	curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) << cmdq->pdata->shift;
> +	curr_pa = cmdq_reg_shift_addr(readl(thread->base + CMDQ_THR_CURR_ADDR), cmdq->pdata);
>  
>  	list_for_each_entry_safe(task, tmp, &thread->task_busy_list,
>  				 list_entry) {
> @@ -416,9 +475,9 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
>  		 */
>  		WARN_ON(cmdq_thread_reset(cmdq, thread) < 0);
>  
> -		writel(task->pa_base >> cmdq->pdata->shift,
> +		writel(cmdq_reg_shift_addr(task->pa_base, cmdq->pdata),
>  		       thread->base + CMDQ_THR_CURR_ADDR);
> -		writel((task->pa_base + pkt->cmd_buf_size) >> cmdq->pdata->shift,
> +		writel(cmdq_reg_shift_addr(task->pa_base + pkt->cmd_buf_size, cmdq->pdata),
>  		       thread->base + CMDQ_THR_END_ADDR);
>  
>  		writel(thread->priority, thread->base + CMDQ_THR_PRIORITY);
> @@ -426,10 +485,10 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
>  		writel(CMDQ_THR_ENABLED, thread->base + CMDQ_THR_ENABLE_TASK);
>  	} else {
>  		WARN_ON(cmdq_thread_suspend(cmdq, thread) < 0);
> -		curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) <<
> -			cmdq->pdata->shift;
> -		end_pa = readl(thread->base + CMDQ_THR_END_ADDR) <<
> -			cmdq->pdata->shift;
> +		curr_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_CURR_ADDR),
> +					       cmdq->pdata);
> +		end_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_END_ADDR),
> +					      cmdq->pdata);
>  		/* check boundary */
>  		if (curr_pa == end_pa - CMDQ_INST_SIZE ||
>  		    curr_pa == end_pa) {
> @@ -663,6 +722,9 @@ static int cmdq_probe(struct platform_device *pdev)
>  	if (err)
>  		return err;
>  
> +	if (cmdq->pdata->dma_mask_bit)
> +		dma_set_coherent_mask(dev, DMA_BIT_MASK(cmdq->pdata->dma_mask_bit));
> +
>  	cmdq->mbox.dev = dev;
>  	cmdq->mbox.chans = devm_kcalloc(dev, cmdq->pdata->thread_nr,
>  					sizeof(*cmdq->mbox.chans), GFP_KERNEL);
> @@ -782,6 +844,17 @@ static const struct gce_plat gce_plat_mt8195 = {
>  	.gce_num = 2
>  };
>  
> +static const struct gce_plat gce_plat_mt8196 = {
> +	.thread_nr = 32,
> +	.shift = 3,
> +	.mminfra_offset = 0x80000000, /* 2GB */
> +	.control_by_sw = true,
> +	.sw_ddr_en = true,
> +	.gce_vm = true,
> +	.dma_mask_bit = 35,
> +	.gce_num = 2
> +};
> +
>  static const struct of_device_id cmdq_of_ids[] = {
>  	{.compatible = "mediatek,mt6779-gce", .data = (void *)&gce_plat_mt6779},
>  	{.compatible = "mediatek,mt8173-gce", .data = (void *)&gce_plat_mt8173},
> @@ -790,6 +863,7 @@ static const struct of_device_id cmdq_of_ids[] = {
>  	{.compatible = "mediatek,mt8188-gce", .data = (void *)&gce_plat_mt8188},
>  	{.compatible = "mediatek,mt8192-gce", .data = (void *)&gce_plat_mt8192},
>  	{.compatible = "mediatek,mt8195-gce", .data = (void *)&gce_plat_mt8195},
> +	{.compatible = "mediatek,mt8196-gce", .data = (void *)&gce_plat_mt8196},
>  	{}
>  };
>  MODULE_DEVICE_TABLE(of, cmdq_of_ids);
> diff --git a/include/linux/mailbox/mtk-cmdq-mailbox.h b/include/linux/mailbox/mtk-cmdq-mailbox.h
> index a8f0070c7aa9..79398bf95f8d 100644
> --- a/include/linux/mailbox/mtk-cmdq-mailbox.h
> +++ b/include/linux/mailbox/mtk-cmdq-mailbox.h
> @@ -79,5 +79,7 @@ struct cmdq_pkt {
>  };
>  
>  u8 cmdq_get_shift_pa(struct mbox_chan *chan);
> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan);
> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr);
>  
>  #endif /* __MTK_CMDQ_MAILBOX_H__ */


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-02-18  9:25   ` CK Hu (胡俊光)
@ 2025-02-22 10:44     ` Jason-JH Lin (林睿祥)
  0 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-02-22 10:44 UTC (permalink / raw)
  To: chunkuang.hu@kernel.org, AngeloGioacchino Del Regno,
	robh@kernel.org, krzk+dt@kernel.org, jassisinghbrar@gmail.com,
	mchehab@kernel.org, conor+dt@kernel.org,
	CK Hu (胡俊光)
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-mediatek@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	Moudy Ho (何宗原), devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, fshao@chromium.org,
	Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Hi CK,

Thanks for the reviews.

On Tue, 2025-02-18 at 09:25 +0000, CK Hu (胡俊光) wrote:
> On Tue, 2025-02-18 at 13:41 +0800, Jason-JH Lin wrote:
> > MT8196 has 3 new hardware configuration compared with the previous
> > SoC,
> > which correspond to the 3 new driver data:
> > 
> > 1. mminfra_offset: For GCE data plane control
> >    Since GCE has been moved into mminfra, GCE needs to append the
> >    mminfra offset to the DRAM address when accessing the DRAM.
> 
> It seems that GCE has iova and mminfra would mapping the iova to
> physical address.
> Maybe let GCE be a iommu device or add iommus property in device
> node, and use dma_map_xxx() to get iova of GCE.
> iommus property point to mminfra device (maybe another name) and
> mminfra device would process the mapping of iova and physical
> address.

The GCE in the 8196 is using IOVA already.

The main reason of adding the mminfra_offset 0x8000_0000(2G) is to
solve the address conflicting problem:
Due to MMIO, if the GCE needs to access a hardware register at
0x1000_0000, but the SMMU is also mapping a DRAM block at 0x1000_0000,
the GCE will not know whether it should write to the hardware register
or the DRAM.
Therefore, a rule was set in the MMINFRA circuit during the HW design:
transactions with addresses greater than 2G are considered data paths,
while those with addresses less than 2G are considered config paths.
Additionally, on the data path, addresses are remapped by subtracting
2G, allowing the SMMU to still map DRAM addresses less than 2G.
However, the software needs to add 2G to the DRAM address that
the GCE needs to access to ensure that MMINFRA will follow the data
path.

Since the MMINFRA remap subtracting 2G is done in the hardware circuit
and cannot be configured by software, the address +2G adjustment is
implemented in the CMDQ driver.

> 
> > 
> > 2. gce_vm: For GCE hardware virtualization
> >    Currently, the first version of the mt8196 mailbox controller
> > only
> >    requires setting the VM-related registers to enable the
> > permissions
> >    of a host VM.
> 
> What's this? I know this patch would not implement the full VM
> function,
> but describe more about what this is. Why need to enable permission?
> 

OK I'll add the commit message below in the next version:

For the 8196, it is necessary to configure access permissions for
specific GCE threads for different VMs in order to properly access the
GCE thread registers.
Currently, since only the host VM is being used, it is required to
enable access permissions for all GCE threads for the host VM.

VM_MAP0 is for threads 0-9, VM_MAP1 is for threads 10-19, VM_MAP2 is
for threads 20-29, and VM_MAP3 is for threads 30-31. Each thread has a
3-bit configuration, and setting all bits to 1 means the thread is
configured for the host VM.

Regards,
Jason-JH Lin

> Regards,
> CK
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-02-18  5:41 ` [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support " Jason-JH Lin
  2025-02-18  9:25   ` CK Hu (胡俊光)
@ 2025-03-04  9:32   ` AngeloGioacchino Del Regno
  2025-03-05  8:36     ` Jason-JH Lin (林睿祥)
  1 sibling, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-04  9:32 UTC (permalink / raw)
  To: Jason-JH Lin, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jassi Brar, Chun-Kuang Hu, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> MT8196 has 3 new hardware configuration compared with the previous SoC,
> which correspond to the 3 new driver data:
> 
> 1. mminfra_offset: For GCE data plane control
>     Since GCE has been moved into mminfra, GCE needs to append the
>     mminfra offset to the DRAM address when accessing the DRAM.
> 
> 2. gce_vm: For GCE hardware virtualization
>     Currently, the first version of the mt8196 mailbox controller only
>     requires setting the VM-related registers to enable the permissions
>     of a host VM.

I think that the GCE VM changes should go to a different commit, as that
looks like being something not critical for basic functionality of the
MMINFRA GCE.

I really like seeing support for that, but please split the basic stuff
from the extra functionality :-)

> 
> 3. dma_mask_bit: For dma address bit control
>     In order to avoid the hardware limitations of MT8196 accessing DRAM,
>     GCE needs to configure the DMA address to be less than 35 bits.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> ---
>   drivers/mailbox/mtk-cmdq-mailbox.c       | 90 +++++++++++++++++++++---
>   include/linux/mailbox/mtk-cmdq-mailbox.h |  2 +
>   2 files changed, 84 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
> index d186865b8dce..0abe10a7fef9 100644
> --- a/drivers/mailbox/mtk-cmdq-mailbox.c
> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> @@ -43,6 +43,17 @@
>   #define GCE_CTRL_BY_SW				GENMASK(2, 0)
>   #define GCE_DDR_EN				GENMASK(18, 16)
>   
> +#define GCE_VM_ID_MAP0			0x5018
> +#define GCE_VM_MAP0_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP1			0x501c
> +#define GCE_VM_MAP1_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP2			0x5020
> +#define GCE_VM_MAP2_ALL_HOST			GENMASK(29, 0)
> +#define GCE_VM_ID_MAP3			0x5024
> +#define GCE_VM_MAP3_ALL_HOST			GENMASK(5, 0)
> +#define GCE_VM_CPR_GSIZE		0x50c4
> +#define GCE_VM_CPR_GSIZE_HSOT			GENMASK(3, 0)

typo: GSIZE_HOST....

...but also, if you could add some brief description of what the VMIDs are used for
and what the GSIZE is... that'd be very much appreciated from whoever is reading
this.

The GCE stuff isn't even properly described in datasheets - I do (probably!)
understand what those are for, but asking people to get years of experience on
MediaTek to understand what's going on would be a bit rude, wouldn't it? :-D

> +
>   #define CMDQ_THR_ACTIVE_SLOT_CYCLES	0x3200
>   #define CMDQ_THR_ENABLED		0x1
>   #define CMDQ_THR_DISABLED		0x0
> @@ -87,11 +98,24 @@ struct cmdq {
>   struct gce_plat {
>   	u32 thread_nr;
>   	u8 shift;
> +	dma_addr_t mminfra_offset;

It looks like this is exactly the DRAM's iostart... at least, I can see that in the
downstream devicetree that's where it starts.

	memory: memory@80000000 {
		device_type = "memory";
		reg = <0 0x80000000 0 0x40000000>;
	};

It doesn't really look like being a coincidence, but, for the sake of asking:
is this just a coincidence? :-)

>   	bool control_by_sw;
>   	bool sw_ddr_en;
> +	bool gce_vm;
> +	u32 dma_mask_bit;
>   	u32 gce_num;
>   };
>   
> +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const struct gce_plat *pdata)
> +{
> +	return ((addr + pdata->mminfra_offset) >> pdata->shift);
> +}
> +
> +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const struct gce_plat *pdata)
> +{
> +	return ((addr << pdata->shift) - pdata->mminfra_offset);
> +}

I'm not sure that you really need those two functions... probably it's simply
cleaner and easier to just write that single line every time... and I'm
saying that especially for how you're using those functions, with some readl()
passed directly as param, decreasing human readability by "a whole lot" :-)

> +
>   static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool enable)
>   {
>   	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
> @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan *chan)
>   }
>   EXPORT_SYMBOL(cmdq_get_shift_pa);
>   
> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
> +{
> +	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
> +
> +	return cmdq->pdata->mminfra_offset;
> +}
> +EXPORT_SYMBOL(cmdq_get_offset_pa);

I think I remember this get_offset_pa from the old times, then CK removed it (and I
was really happy about that disappearing), or am I confusing this with something
else?

(of course, this wasn't used for mminfra, but for something else!)

> +
> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr)
> +{
> +	struct cmdq *cmdq = container_of(chan->mbox, struct cmdq, mbox);
> +
> +	if (cmdq->pdata->mminfra_offset == 0)
> +		return false;
> +
> +	/*
> +	 * mminfra will recognize the addr that greater than the mminfra_offset
> +	 * as a transaction to DRAM.
> +	 * So the caller needs to append mminfra_offset for the true case.
> +	 */
> +	return (addr >= cmdq->pdata->mminfra_offset);


/**
  * cmdq_is_mminfra_gce() - Brief description
  * @args.....
  *
  * The MMINFRA GCE will recognize an address greater than DRAM iostart as a
  * DRAM transaction instead of ....xyz
  *
  * In order for callers to perform (xyz) transactions through the CMDQ, those
  * need to know if they are using a GCE located in MMINFRA.
  */
bool cmdq_is_mminfra_gce(...)
{
	return cmdq->pdata->mminfra_offset &&
	       (addr >= cmdq->pdata->mminfra_offset)

> +}
> +EXPORT_SYMBOL(cmdq_addr_need_offset);
> +

...but then, is there really no way of just handling the GCE being in MMINFRA
transparently from the callers? Do the callers really *need* to know that they're
using a new GCE?!

Another way of saying: can't we just handle the address translation in here instead
of instructing each and every driver about how to communicate with the new GCE?!


Cheers,
Angelo

>   static int cmdq_thread_suspend(struct cmdq *cmdq, struct cmdq_thread *thread)
>   {
>   	u32 status;
> @@ -143,6 +191,17 @@ static void cmdq_init(struct cmdq *cmdq)
>   	u32 gctl_regval = 0;
>   
>   	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
> +
> +	if (cmdq->pdata->gce_vm) {
> +		/* config cpr size for host vm */
> +		writel(GCE_VM_CPR_GSIZE_HSOT, cmdq->base + GCE_VM_CPR_GSIZE);
> +		/* config CPR_GSIZE before setting VM_ID_MAP to avoid data leakage */
> +		writel(GCE_VM_MAP0_ALL_HOST, cmdq->base + GCE_VM_ID_MAP0);
> +		writel(GCE_VM_MAP1_ALL_HOST, cmdq->base + GCE_VM_ID_MAP1);
> +		writel(GCE_VM_MAP2_ALL_HOST, cmdq->base + GCE_VM_ID_MAP2);
> +		writel(GCE_VM_MAP3_ALL_HOST, cmdq->base + GCE_VM_ID_MAP3);
> +	}
> +
>   	if (cmdq->pdata->control_by_sw)
>   		gctl_regval = GCE_CTRL_BY_SW;
>   	if (cmdq->pdata->sw_ddr_en)
> @@ -199,7 +258,7 @@ static void cmdq_task_insert_into_thread(struct cmdq_task *task)
>   				prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
>   	prev_task_base[CMDQ_NUM_CMD(prev_task->pkt) - 1] =
>   		(u64)CMDQ_JUMP_BY_PA << 32 |
> -		(task->pa_base >> task->cmdq->pdata->shift);
> +		cmdq_reg_shift_addr(task->pa_base, task->cmdq->pdata);
>   	dma_sync_single_for_device(dev, prev_task->pa_base,
>   				   prev_task->pkt->cmd_buf_size, DMA_TO_DEVICE);
>   
> @@ -264,7 +323,7 @@ static void cmdq_thread_irq_handler(struct cmdq *cmdq,
>   	else
>   		return;
>   
> -	curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) << cmdq->pdata->shift;
> +	curr_pa = cmdq_reg_shift_addr(readl(thread->base + CMDQ_THR_CURR_ADDR), cmdq->pdata);
>   
>   	list_for_each_entry_safe(task, tmp, &thread->task_busy_list,
>   				 list_entry) {
> @@ -416,9 +475,9 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
>   		 */
>   		WARN_ON(cmdq_thread_reset(cmdq, thread) < 0);
>   
> -		writel(task->pa_base >> cmdq->pdata->shift,
> +		writel(cmdq_reg_shift_addr(task->pa_base, cmdq->pdata),
>   		       thread->base + CMDQ_THR_CURR_ADDR);
> -		writel((task->pa_base + pkt->cmd_buf_size) >> cmdq->pdata->shift,
> +		writel(cmdq_reg_shift_addr(task->pa_base + pkt->cmd_buf_size, cmdq->pdata),
>   		       thread->base + CMDQ_THR_END_ADDR);
>   
>   		writel(thread->priority, thread->base + CMDQ_THR_PRIORITY);
> @@ -426,10 +485,10 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
>   		writel(CMDQ_THR_ENABLED, thread->base + CMDQ_THR_ENABLE_TASK);
>   	} else {
>   		WARN_ON(cmdq_thread_suspend(cmdq, thread) < 0);
> -		curr_pa = readl(thread->base + CMDQ_THR_CURR_ADDR) <<
> -			cmdq->pdata->shift;
> -		end_pa = readl(thread->base + CMDQ_THR_END_ADDR) <<
> -			cmdq->pdata->shift;
> +		curr_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_CURR_ADDR),
> +					       cmdq->pdata);
> +		end_pa = cmdq_reg_revert_addr(readl(thread->base + CMDQ_THR_END_ADDR),
> +					      cmdq->pdata);
>   		/* check boundary */
>   		if (curr_pa == end_pa - CMDQ_INST_SIZE ||
>   		    curr_pa == end_pa) {
> @@ -663,6 +722,9 @@ static int cmdq_probe(struct platform_device *pdev)
>   	if (err)
>   		return err;
>   
> +	if (cmdq->pdata->dma_mask_bit)
> +		dma_set_coherent_mask(dev, DMA_BIT_MASK(cmdq->pdata->dma_mask_bit));
> +
>   	cmdq->mbox.dev = dev;
>   	cmdq->mbox.chans = devm_kcalloc(dev, cmdq->pdata->thread_nr,
>   					sizeof(*cmdq->mbox.chans), GFP_KERNEL);
> @@ -782,6 +844,17 @@ static const struct gce_plat gce_plat_mt8195 = {
>   	.gce_num = 2
>   };
>   
> +static const struct gce_plat gce_plat_mt8196 = {
> +	.thread_nr = 32,
> +	.shift = 3,
> +	.mminfra_offset = 0x80000000, /* 2GB */
> +	.control_by_sw = true,
> +	.sw_ddr_en = true,
> +	.gce_vm = true,
> +	.dma_mask_bit = 35,
> +	.gce_num = 2
> +};
> +
>   static const struct of_device_id cmdq_of_ids[] = {
>   	{.compatible = "mediatek,mt6779-gce", .data = (void *)&gce_plat_mt6779},
>   	{.compatible = "mediatek,mt8173-gce", .data = (void *)&gce_plat_mt8173},
> @@ -790,6 +863,7 @@ static const struct of_device_id cmdq_of_ids[] = {
>   	{.compatible = "mediatek,mt8188-gce", .data = (void *)&gce_plat_mt8188},
>   	{.compatible = "mediatek,mt8192-gce", .data = (void *)&gce_plat_mt8192},
>   	{.compatible = "mediatek,mt8195-gce", .data = (void *)&gce_plat_mt8195},
> +	{.compatible = "mediatek,mt8196-gce", .data = (void *)&gce_plat_mt8196},
>   	{}
>   };
>   MODULE_DEVICE_TABLE(of, cmdq_of_ids);
> diff --git a/include/linux/mailbox/mtk-cmdq-mailbox.h b/include/linux/mailbox/mtk-cmdq-mailbox.h
> index a8f0070c7aa9..79398bf95f8d 100644
> --- a/include/linux/mailbox/mtk-cmdq-mailbox.h
> +++ b/include/linux/mailbox/mtk-cmdq-mailbox.h
> @@ -79,5 +79,7 @@ struct cmdq_pkt {
>   };
>   
>   u8 cmdq_get_shift_pa(struct mbox_chan *chan);
> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan);
> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t addr);
>   
>   #endif /* __MTK_CMDQ_MAILBOX_H__ */



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-02-18  5:41 ` [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware Jason-JH Lin
@ 2025-03-04  9:35   ` AngeloGioacchino Del Regno
  2025-03-05  9:46     ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-04  9:35 UTC (permalink / raw)
  To: Jason-JH Lin, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jassi Brar, Chun-Kuang Hu, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> When GCE executes instructions, the corresponding hardware register
> can be found through the subsys ID. For hardware that does not support
> subsys ID, its subsys ID will be set to invalid value and its physical
> address needs to be used to generate GCE instructions.
> 
> This commit adds a pa_base parsing flow to the cmdq_client_reg structure
> for these unsupported subsys ID hardware.
> 

Does this work only for the MMINFRA located GCEs, or does this work also for
the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?

In order to actually review and decide, I do need to know :-)

Thanks,
Angelo




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address
  2025-02-18  5:41 ` [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address Jason-JH Lin
@ 2025-03-04  9:37   ` AngeloGioacchino Del Regno
  2025-03-05 16:26     ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-04  9:37 UTC (permalink / raw)
  To: Jason-JH Lin, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jassi Brar, Chun-Kuang Hu, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> Since GCE has been moved to mminfra in MT8196, all transactions from
> mminfra to DRAM will have their addresses adjusted by subtracting a
> mminfra offset.
> This information should be handled inside the CMDQ driver, allowing
> CMDQ users to call CMDQ APIs as usual.
> 
> Therefore, CMDQ driver needs to use the mbox API to get the
> mminfra_offset value of the SoC, and then add it to the DRAM address
> when generating instructions to ensure GCE accesses the correct DRAM
> address.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> ---
>   drivers/soc/mediatek/mtk-cmdq-helper.c | 35 ++++++++++++++++++++++++--
>   1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c b/drivers/soc/mediatek/mtk-cmdq-helper.c
> index aa9853100d78..f2853a74af01 100644
> --- a/drivers/soc/mediatek/mtk-cmdq-helper.c
> +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
> @@ -314,10 +314,22 @@ EXPORT_SYMBOL(cmdq_pkt_write_s_mask_value);
>   
>   int cmdq_pkt_mem_move(struct cmdq_pkt *pkt, dma_addr_t src_addr, dma_addr_t dst_addr)
>   {
> +	struct cmdq_client *cl = (struct cmdq_client *)pkt->cl;
>   	const u16 high_addr_reg_idx  = CMDQ_THR_SPR_IDX0;
>   	const u16 value_reg_idx = CMDQ_THR_SPR_IDX1;
>   	int ret;
>   
> +	if (!cl) {
> +		pr_err("%s %d: pkt->cl is NULL!\n", __func__, __LINE__);
> +		return -EINVAL;
> +	}
> +
> +	if (cmdq_addr_need_offset(cl->chan, src_addr))
> +		src_addr += cmdq_get_offset_pa(cl->chan);

If the offset is just DRAM IOSTART, you could manage that differently in the cmdq
helper as well as the cmdq mailbox... :-)

> +
> +	if (cmdq_addr_need_offset(cl->chan, dst_addr))
> +		dst_addr += cmdq_get_offset_pa(cl->chan);
> +
Cheers,
Angelo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware
  2025-02-18  5:41 ` [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware Jason-JH Lin
@ 2025-03-04  9:41   ` AngeloGioacchino Del Regno
  2025-03-05 16:12     ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-04  9:41 UTC (permalink / raw)
  To: Jason-JH Lin, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jassi Brar, Chun-Kuang Hu, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media

Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> To support hardware without subsys IDs on new SoCs, add a programming
> flow that checks whether the subsys ID is valid. If the subsys ID is
> invalid, the flow will call 2 alternative CMDQ APIs:
> cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
> functionality.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> ---
>   drivers/soc/mediatek/mtk-mmsys.c | 14 +++++++++++---
>   drivers/soc/mediatek/mtk-mutex.c | 11 +++++++++--
>   2 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-mmsys.c b/drivers/soc/mediatek/mtk-mmsys.c
> index bb4639ca0b8c..ce949b863b05 100644
> --- a/drivers/soc/mediatek/mtk-mmsys.c
> +++ b/drivers/soc/mediatek/mtk-mmsys.c
> @@ -167,9 +167,17 @@ static void mtk_mmsys_update_bits(struct mtk_mmsys *mmsys, u32 offset, u32 mask,
>   	u32 tmp;
>   
>   	if (mmsys->cmdq_base.size && cmdq_pkt) {
> -		ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys->cmdq_base.subsys,
> -					  mmsys->cmdq_base.offset + offset, val,
> -					  mask);
> +		offset += mmsys->cmdq_base.offset;
> +		if (mmsys->cmdq_base.subsys != CMDQ_SUBSYS_INVALID) {

You're still anyway passing the subsys to cmdq_pkt_write_mask(), right?!
Why don't you just handle this in cmdq_pkt_write_mask() then? ;-)

I can see this pattern being repeated over and over in both drm/mediatek and MDP3
drivers, and it's not necessary to duplicate this many times when you can write it
just once.

Would've also been faster for you to implement... :-D

Cheers,
Angelo

> +			ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys->cmdq_base.subsys,
> +						  offset, val, mask);
> +		} else {
> +			/* only MMIO access, no need to check mminfro_offset */
> +			ret = cmdq_pkt_assign(cmdq_pkt, CMDQ_THR_SPR_IDX0,
> +					      CMDQ_ADDR_HIGH(mmsys->cmdq_base.pa_base));
> +			ret |= cmdq_pkt_write_s_mask_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
> +							   CMDQ_ADDR_LOW(offset), val, mask);
> +		}
>   		if (ret)
>   			pr_debug("CMDQ unavailable: using CPU write\n");
>   		else
> diff --git a/drivers/soc/mediatek/mtk-mutex.c b/drivers/soc/mediatek/mtk-mutex.c
> index 5250c1d702eb..3788b16efbf4 100644
> --- a/drivers/soc/mediatek/mtk-mutex.c
> +++ b/drivers/soc/mediatek/mtk-mutex.c
> @@ -963,6 +963,7 @@ int mtk_mutex_enable_by_cmdq(struct mtk_mutex *mutex, void *pkt)
>   	struct mtk_mutex_ctx *mtx = container_of(mutex, struct mtk_mutex_ctx,
>   						 mutex[mutex->id]);
>   	struct cmdq_pkt *cmdq_pkt = (struct cmdq_pkt *)pkt;
> +	dma_addr_t en_addr = mtx->addr + DISP_REG_MUTEX_EN(mutex->id);
>   
>   	WARN_ON(&mtx->mutex[mutex->id] != mutex);
>   
> @@ -971,8 +972,14 @@ int mtk_mutex_enable_by_cmdq(struct mtk_mutex *mutex, void *pkt)
>   		return -ENODEV;
>   	}
>   
> -	cmdq_pkt_write(cmdq_pkt, mtx->cmdq_reg.subsys,
> -		       mtx->addr + DISP_REG_MUTEX_EN(mutex->id), 1);
> +	if (mtx->cmdq_reg.subsys != CMDQ_SUBSYS_INVALID) {
> +		cmdq_pkt_write(cmdq_pkt, mtx->cmdq_reg.subsys, en_addr, 1);
> +	} else {
> +		/* only MMIO access, no need to check mminfro_offset */
> +		cmdq_pkt_assign(cmdq_pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_HIGH(en_addr));
> +		cmdq_pkt_write_s_value(cmdq_pkt, CMDQ_THR_SPR_IDX0, CMDQ_ADDR_LOW(en_addr), 1);
> +	}
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL_GPL(mtk_mutex_enable_by_cmdq);




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-03-04  9:32   ` AngeloGioacchino Del Regno
@ 2025-03-05  8:36     ` Jason-JH Lin (林睿祥)
  2025-03-05 12:05       ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-05  8:36 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Hi Angelo,

Thanks for the reviews.

On Tue, 2025-03-04 at 10:32 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > MT8196 has 3 new hardware configuration compared with the previous
> > SoC,
> > which correspond to the 3 new driver data:
> > 
> > 1. mminfra_offset: For GCE data plane control
> >     Since GCE has been moved into mminfra, GCE needs to append the
> >     mminfra offset to the DRAM address when accessing the DRAM.
> > 
> > 2. gce_vm: For GCE hardware virtualization
> >     Currently, the first version of the mt8196 mailbox controller
> > only
> >     requires setting the VM-related registers to enable the
> > permissions
> >     of a host VM.
> 
> I think that the GCE VM changes should go to a different commit, as
> that
> looks like being something not critical for basic functionality of
> the
> MMINFRA GCE.
> 
> I really like seeing support for that, but please split the basic
> stuff
> from the extra functionality :-)
> 

The VM configuration is the basic configuration for MT8196, so I put it
together with other configurations in one patch.
But I can understand you want the new function as a independent patch.
So I will split the VM related part, mminfra_offset part and dma_mask
part to 3 single pathes. Then add them as a driver data for MT8196 in
this patch.

> > 
> > 3. dma_mask_bit: For dma address bit control
> >     In order to avoid the hardware limitations of MT8196 accessing
> > DRAM,
> >     GCE needs to configure the DMA address to be less than 35 bits.
> > 
> > Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> > ---
> >   drivers/mailbox/mtk-cmdq-mailbox.c       | 90
> > +++++++++++++++++++++---
> >   include/linux/mailbox/mtk-cmdq-mailbox.h |  2 +
> >   2 files changed, 84 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c
> > b/drivers/mailbox/mtk-cmdq-mailbox.c
> > index d186865b8dce..0abe10a7fef9 100644
> > --- a/drivers/mailbox/mtk-cmdq-mailbox.c
> > +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> > @@ -43,6 +43,17 @@
> >   #define GCE_CTRL_BY_SW                              GENMASK(2, 0)
> >   #define GCE_DDR_EN                          GENMASK(18, 16)
> > 
> > +#define GCE_VM_ID_MAP0                       0x5018
> > +#define GCE_VM_MAP0_ALL_HOST                 GENMASK(29, 0)
> > +#define GCE_VM_ID_MAP1                       0x501c
> > +#define GCE_VM_MAP1_ALL_HOST                 GENMASK(29, 0)
> > +#define GCE_VM_ID_MAP2                       0x5020
> > +#define GCE_VM_MAP2_ALL_HOST                 GENMASK(29, 0)
> > +#define GCE_VM_ID_MAP3                       0x5024
> > +#define GCE_VM_MAP3_ALL_HOST                 GENMASK(5, 0)
> > +#define GCE_VM_CPR_GSIZE             0x50c4
> > +#define GCE_VM_CPR_GSIZE_HSOT                        GENMASK(3, 0)
> 
> typo: GSIZE_HOST....
> 

Thanks, I'll fix it.

> ...but also, if you could add some brief description of what the
> VMIDs are used for
> and what the GSIZE is... that'd be very much appreciated from whoever
> is reading
> this.
> 
VMID_MAP configuration is in the previous reply mail for CK.
CPR_GSIZE is the setting for allocating the CPR SRAM size to each VM.

> The GCE stuff isn't even properly described in datasheets - I do
> (probably!)
> understand what those are for, but asking people to get years of
> experience on
> MediaTek to understand what's going on would be a bit rude, wouldn't
> it? :-D
> 

I agree with you :-)
I'll put them in the VM patch and add some brief description for them.

> > +
> >   #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
> >   #define CMDQ_THR_ENABLED            0x1
> >   #define CMDQ_THR_DISABLED           0x0
> > @@ -87,11 +98,24 @@ struct cmdq {
> >   struct gce_plat {
> >       u32 thread_nr;
> >       u8 shift;
> > +     dma_addr_t mminfra_offset;
> 
> It looks like this is exactly the DRAM's iostart... at least, I can
> see that in the
> downstream devicetree that's where it starts.
> 
>         memory: memory@80000000 {
>                 device_type = "memory";
>                 reg = <0 0x80000000 0 0x40000000>;
>         };
> 
> It doesn't really look like being a coincidence, but, for the sake of
> asking:
> is this just a coincidence? :-)
> 

As the confirmation with the hardware designer in previous reply mail
for CK:
https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@mediatek.com/#26258463

Since the MMINFRA remap subtracting 2G is done in the hardware circuit
and cannot be configured by software, the address +2G adjustment is
necessary to implement in the CMDQ driver.

So that might not be a coincidence.
But even if DRAM start address changes, this mminfra_offset is still
subtracting 2G, so I think it is a better choice to define it as the
driver data for MT8196.

> >       bool control_by_sw;
> >       bool sw_ddr_en;
> > +     bool gce_vm;
> > +     u32 dma_mask_bit;
> >       u32 gce_num;
> >   };
> > 
> > +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const
> > struct gce_plat *pdata)
> > +{
> > +     return ((addr + pdata->mminfra_offset) >> pdata->shift);
> > +}
> > +
> > +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const
> > struct gce_plat *pdata)
> > +{
> > +     return ((addr << pdata->shift) - pdata->mminfra_offset);
> > +}
> 
> I'm not sure that you really need those two functions... probably
> it's simply
> cleaner and easier to just write that single line every time... and
> I'm
> saying that especially for how you're using those functions, with
> some readl()
> passed directly as param, decreasing human readability by "a whole
> lot" :-)
> 

The reason why I use API wrapper instead of writing it directly in
readl() is to avoid missing the shift or mminfra_offset conversion in
some places.
This problem is not easy to debug, and I have encountered it at least
twice...

I think the advantage of using function is that it can be uniformly
modified to all places that need to handle DRAM address conversion.
What do you think? :-)

> > +
> >   static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool enable)
> >   {
> >       WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
> > @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan *chan)
> >   }
> >   EXPORT_SYMBOL(cmdq_get_shift_pa);
> > 
> > +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
> > +{
> > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > mbox);
> > +
> > +     return cmdq->pdata->mminfra_offset;
> > +}
> > +EXPORT_SYMBOL(cmdq_get_offset_pa);
> 
> I think I remember this get_offset_pa from the old times, then CK
> removed it (and I
> was really happy about that disappearing), or am I confusing this
> with something
> else?
> 
> (of course, this wasn't used for mminfra, but for something else!)
> 

I can't find any remove history in mtk-cmdq-mailbox.c.

Maybe you mean the patch in this series?
https://lore.kernel.org/all/171213938049.123698.15573779837703602591.b4-ty@collabora.com/

> > +
> > +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t
> > addr)
> > +{
> > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > mbox);
> > +
> > +     if (cmdq->pdata->mminfra_offset == 0)
> > +             return false;
> > +
> > +     /*
> > +      * mminfra will recognize the addr that greater than the
> > mminfra_offset
> > +      * as a transaction to DRAM.
> > +      * So the caller needs to append mminfra_offset for the true
> > case.
> > +      */
> > +     return (addr >= cmdq->pdata->mminfra_offset);
> 
> 
> /**
>   * cmdq_is_mminfra_gce() - Brief description
>   * @args.....
>   *
>   * The MMINFRA GCE will recognize an address greater than DRAM
> iostart as a
>   * DRAM transaction instead of ....xyz
>   *
>   * In order for callers to perform (xyz) transactions through the
> CMDQ, those
>   * need to know if they are using a GCE located in MMINFRA.
>   */
> bool cmdq_is_mminfra_gce(...)
> {
>         return cmdq->pdata->mminfra_offset &&
>                (addr >= cmdq->pdata->mminfra_offset)
> 
> > +}
> > +EXPORT_SYMBOL(cmdq_addr_need_offset);
> > +
> 

OK, I'll modify the API like this.

> ...but then, is there really no way of just handling the GCE being in
> MMINFRA
> transparently from the callers? Do the callers really *need* to know
> that they're
> using a new GCE?!
> 

Since the address subtracting is done in MMINFRA hardware, I think GCE
users really need to handle it in driver.

> Another way of saying: can't we just handle the address translation
> in here instead
> of instructing each and every driver about how to communicate with
> the new GCE?!
> 

The DRAM address may not only be the command buffer to GCE, but also
the working buffer provided by CMDQ users and being a part of GCE
instruction, so we need to handle the address translation in CMDQ
helper driver for the instruction generation.
E.g. ISP drivers may use GCE to write a hardware settings to a DRAM as
backup buffer. The GCE write instruction will be:
WRITE the value of ISP register to DRAM address + mminfra_offset.

But most of the CMDQ users only need to use GCE to write hardware
register, so I only keep the translation in cmdq_pkt_mem_move(),
cmdq_pkt_poll_addr() and cmdq_pkt_jump_abs() at the latest series.

Regards,
Jason-JH lin

> 
> Cheers,
> Angelo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-03-04  9:35   ` AngeloGioacchino Del Regno
@ 2025-03-05  9:46     ` Jason-JH Lin (林睿祥)
  2025-03-05 11:03       ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-05  9:46 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Tue, 2025-03-04 at 10:35 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > When GCE executes instructions, the corresponding hardware register
> > can be found through the subsys ID. For hardware that does not
> > support
> > subsys ID, its subsys ID will be set to invalid value and its
> > physical
> > address needs to be used to generate GCE instructions.
> > 
> > This commit adds a pa_base parsing flow to the cmdq_client_reg
> > structure
> > for these unsupported subsys ID hardware.
> > 
> 
> Does this work only for the MMINFRA located GCEs, or does this work
> also for
> the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?
> 
> In order to actually review and decide, I do need to know :-)
> 

Yes, it's for the SoCs without subsys ID, it's not related to the
MMINFRA.

This can also work on MT8173/83/92/95 // MT6795/6893/etc.
You can remove the `mediatek,gce-client-reg` properties in their dtsi
and cherry-pick this series to verify it. :-)

Regards,
Jason-JH Lin

> Thanks,
> Angelo
> 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-03-05  9:46     ` Jason-JH Lin (林睿祥)
@ 2025-03-05 11:03       ` AngeloGioacchino Del Regno
  2025-03-05 15:58         ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-05 11:03 UTC (permalink / raw)
  To: Jason-JH Lin (林睿祥), robh@kernel.org,
	krzk+dt@kernel.org, conor+dt@kernel.org, mchehab@kernel.org,
	chunkuang.hu@kernel.org, jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Il 05/03/25 10:46, Jason-JH Lin (林睿祥) ha scritto:
> On Tue, 2025-03-04 at 10:35 +0100, AngeloGioacchino Del Regno wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>>
>>
>> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
>>> When GCE executes instructions, the corresponding hardware register
>>> can be found through the subsys ID. For hardware that does not
>>> support
>>> subsys ID, its subsys ID will be set to invalid value and its
>>> physical
>>> address needs to be used to generate GCE instructions.
>>>
>>> This commit adds a pa_base parsing flow to the cmdq_client_reg
>>> structure
>>> for these unsupported subsys ID hardware.
>>>
>>
>> Does this work only for the MMINFRA located GCEs, or does this work
>> also for
>> the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?
>>
>> In order to actually review and decide, I do need to know :-)
>>
> 
> Yes, it's for the SoCs without subsys ID, it's not related to the
> MMINFRA.
> 
> This can also work on MT8173/83/92/95 // MT6795/6893/etc.
> You can remove the `mediatek,gce-client-reg` properties in their dtsi
> and cherry-pick this series to verify it. :-)
> 

This is curious - and that brings more questions to the table (for curiosity
more than anything else at this point).

Since this is a way to make use of the CMDQ for address ranges that are not tied
to any subsys id (hence no gce-client-reg and just physical address parsing for
generating instructions), do you know what are the performance implications of
using this, instead of subsys IDs on SoCs that do support them?

Being clear: if we were to migrate a SoC like MT8195 to using this globally
instead of using subsys ids, would the performance be degraded?
And if yes, do you know by how much?

What you're proposing almost looks like being too good to be true - and makes
me wonder, at this point, why the subsys id was used in the first place :-)

Cheers!
Angelo

> Regards,
> Jason-JH Lin
> 
>> Thanks,
>> Angelo
>>
>>
> 





^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-03-05  8:36     ` Jason-JH Lin (林睿祥)
@ 2025-03-05 12:05       ` AngeloGioacchino Del Regno
  2025-03-06 11:00         ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-05 12:05 UTC (permalink / raw)
  To: Jason-JH Lin (林睿祥), chunkuang.hu@kernel.org,
	Moudy Ho (何宗原)
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org, robh@kernel.org, krzk+dt@kernel.org,
	conor+dt@kernel.org, mchehab@kernel.org, jassisinghbrar@gmail.com

Il 05/03/25 09:36, Jason-JH Lin (林睿祥) ha scritto:
> Hi Angelo,
> 
> Thanks for the reviews.
> 
> On Tue, 2025-03-04 at 10:32 +0100, AngeloGioacchino Del Regno wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>>
>>
>> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
>>> MT8196 has 3 new hardware configuration compared with the previous
>>> SoC,
>>> which correspond to the 3 new driver data:
>>>
>>> 1. mminfra_offset: For GCE data plane control
>>>      Since GCE has been moved into mminfra, GCE needs to append the
>>>      mminfra offset to the DRAM address when accessing the DRAM.
>>>
>>> 2. gce_vm: For GCE hardware virtualization
>>>      Currently, the first version of the mt8196 mailbox controller
>>> only
>>>      requires setting the VM-related registers to enable the
>>> permissions
>>>      of a host VM.
>>
>> I think that the GCE VM changes should go to a different commit, as
>> that
>> looks like being something not critical for basic functionality of
>> the
>> MMINFRA GCE.
>>
>> I really like seeing support for that, but please split the basic
>> stuff
>> from the extra functionality :-)
>>
> 
> The VM configuration is the basic configuration for MT8196, so I put it
> together with other configurations in one patch.
> But I can understand you want the new function as a independent patch.
> So I will split the VM related part, mminfra_offset part and dma_mask
> part to 3 single pathes. Then add them as a driver data for MT8196 in
> this patch.
> 

Yeah, thanks, the log simply gets more readable like that.

>>>
>>> 3. dma_mask_bit: For dma address bit control
>>>      In order to avoid the hardware limitations of MT8196 accessing
>>> DRAM,
>>>      GCE needs to configure the DMA address to be less than 35 bits.
>>>
>>> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
>>> ---
>>>    drivers/mailbox/mtk-cmdq-mailbox.c       | 90
>>> +++++++++++++++++++++---
>>>    include/linux/mailbox/mtk-cmdq-mailbox.h |  2 +
>>>    2 files changed, 84 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c
>>> b/drivers/mailbox/mtk-cmdq-mailbox.c
>>> index d186865b8dce..0abe10a7fef9 100644
>>> --- a/drivers/mailbox/mtk-cmdq-mailbox.c
>>> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
>>> @@ -43,6 +43,17 @@
>>>    #define GCE_CTRL_BY_SW                              GENMASK(2, 0)
>>>    #define GCE_DDR_EN                          GENMASK(18, 16)
>>>
>>> +#define GCE_VM_ID_MAP0                       0x5018
>>> +#define GCE_VM_MAP0_ALL_HOST                 GENMASK(29, 0)
>>> +#define GCE_VM_ID_MAP1                       0x501c
>>> +#define GCE_VM_MAP1_ALL_HOST                 GENMASK(29, 0)
>>> +#define GCE_VM_ID_MAP2                       0x5020
>>> +#define GCE_VM_MAP2_ALL_HOST                 GENMASK(29, 0)
>>> +#define GCE_VM_ID_MAP3                       0x5024
>>> +#define GCE_VM_MAP3_ALL_HOST                 GENMASK(5, 0)
>>> +#define GCE_VM_CPR_GSIZE             0x50c4
>>> +#define GCE_VM_CPR_GSIZE_HSOT                        GENMASK(3, 0)
>>
>> typo: GSIZE_HOST....
>>
> 
> Thanks, I'll fix it.
> 
>> ...but also, if you could add some brief description of what the
>> VMIDs are used for
>> and what the GSIZE is... that'd be very much appreciated from whoever
>> is reading
>> this.
>>
> VMID_MAP configuration is in the previous reply mail for CK.

Oh, sorry, too many emails - sometimes I lose some :-)

> CPR_GSIZE is the setting for allocating the CPR SRAM size to each VM.

Would be awesome if you could then clarify the comment that you have later in
the code here, from...

/* config cpr size for host vm */

to

/* Set the amount of CPR SRAM to allocate to each VM */

...that could be a way of more properly describing what the writel there is doing.

> 
>> The GCE stuff isn't even properly described in datasheets - I do
>> (probably!)
>> understand what those are for, but asking people to get years of
>> experience on
>> MediaTek to understand what's going on would be a bit rude, wouldn't
>> it? :-D
>>
> 
> I agree with you :-)
> I'll put them in the VM patch and add some brief description for them.
> 

Thanks, much appreciated!

>>> +
>>>    #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
>>>    #define CMDQ_THR_ENABLED            0x1
>>>    #define CMDQ_THR_DISABLED           0x0
>>> @@ -87,11 +98,24 @@ struct cmdq {
>>>    struct gce_plat {
>>>        u32 thread_nr;
>>>        u8 shift;
>>> +     dma_addr_t mminfra_offset;
>>
>> It looks like this is exactly the DRAM's iostart... at least, I can
>> see that in the
>> downstream devicetree that's where it starts.
>>
>>          memory: memory@80000000 {
>>                  device_type = "memory";
>>                  reg = <0 0x80000000 0 0x40000000>;
>>          };
>>
>> It doesn't really look like being a coincidence, but, for the sake of
>> asking:
>> is this just a coincidence? :-)
>>
> 
> As the confirmation with the hardware designer in previous reply mail
> for CK:
> https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@mediatek.com/#26258463
> 

That explanation was simply wonderful.

> Since the MMINFRA remap subtracting 2G is done in the hardware circuit
> and cannot be configured by software, the address +2G adjustment is
> necessary to implement in the CMDQ driver.
> 
> So that might not be a coincidence.
> But even if DRAM start address changes, this mminfra_offset is still
> subtracting 2G, so I think it is a better choice to define it as the
> driver data for MT8196.
> 

....so, this makes me think the following:

1. The DRAM start address cannot *ever* be less than 2G, because otherwise the
    MMINFRA HW would have a hole in the usable address range;
    1a. If the start address changes to less than 2G, then also the IOMMU would
        get limitations, not only the mminfra..!
    2b. This makes it very very very unlikely for the start address to be changed
        to less than 0x80000000

2. If the DRAM start address changes to be ABOVE 2G (so more than 0x80000000),
    there would be no point for MMINFRA to start a "config path" write (or read)
    in the SMMU DRAM block, would it? ;-)

I get it - if the DRAM moves up, MMINFRA is still at 2G because that's hard baked
into the hardware, but I foresee that it'll be unlikely to see a platform changing
the DRAM start address arbitrarily, getting out-of-sync with MMINFRA.

I propose to just get the address from the memory node for now, and to add a nice
comment in the code that explains that "In at least MT8196, the MMINFRA hardware
subtracts xyz etc etc" (and that explanation from the previous email is again
wonderful and shall not be lost: either use that in the comment, or add it to
the commit description, because it's really that good).

Should a new SoC appear in the future requiring an offset from the DRAM start
address, we will think about how to make that work in the best possible way: in
that case we could either reference something else to get the right address or
we can just change this driver to just use the 2G offset statically for all.

What I'm trying to do here is to reduce the amount of changes that we'd need for
adding new SoCs: since that 2G MMINFRA offset -> 2G DRAM start is not a coincidence
I think that, should the DRAM start vary on new SoCs, the MMINFRA offset will
follow the trend and vary with it.

So what I think is:
1. If I'm right, adding a new SoC (with different MMINFRA + DRAM offset) will be
    as easy as adding a compatible string in the bindings, no effort in changing
    this driver with new pdata offsets;
2. If I'm wrong, adding a new SoC means adding compat string and adding pdata and
    one variable in the cmdq struct.

Where N.2 is what we would do anyway if we don't go with my proposed solution...

All this is just to give you my considerations about this topic - you're left
completely free to disagree with me.
If you disagree, I will trust your judgement, no problem here.

>>>        bool control_by_sw;
>>>        bool sw_ddr_en;
>>> +     bool gce_vm;
>>> +     u32 dma_mask_bit;
>>>        u32 gce_num;
>>>    };
>>>
>>> +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const
>>> struct gce_plat *pdata)
>>> +{
>>> +     return ((addr + pdata->mminfra_offset) >> pdata->shift);
>>> +}
>>> +
>>> +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const
>>> struct gce_plat *pdata)
>>> +{
>>> +     return ((addr << pdata->shift) - pdata->mminfra_offset);
>>> +}
>>
>> I'm not sure that you really need those two functions... probably
>> it's simply
>> cleaner and easier to just write that single line every time... and
>> I'm
>> saying that especially for how you're using those functions, with
>> some readl()
>> passed directly as param, decreasing human readability by "a whole
>> lot" :-)
>>
> 
> The reason why I use API wrapper instead of writing it directly in
> readl() is to avoid missing the shift or mminfra_offset conversion in
> some places.
> This problem is not easy to debug, and I have encountered it at least
> twice...
> 
> I think the advantage of using function is that it can be uniformly
> modified to all places that need to handle DRAM address conversion.
> What do you think? :-)
> 

Eh, if you put it like that... it makes sense, so.. yeah, okay :-)

Still, please cleanup those instances of

`cmdq_reg_revert_addr(readl(something), pdata)`

those might be hard to read, so please just do something like:

regval = readl(something);
curr_pa = cmdq_revert_addr(regval, pdata);

...reword to your own liking, of course.

>>> +
>>>    static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool enable)
>>>    {
>>>        WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
>>> @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan *chan)
>>>    }
>>>    EXPORT_SYMBOL(cmdq_get_shift_pa);
>>>
>>> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
>>> +{
>>> +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
>>> mbox);
>>> +
>>> +     return cmdq->pdata->mminfra_offset;
>>> +}
>>> +EXPORT_SYMBOL(cmdq_get_offset_pa);
>>
>> I think I remember this get_offset_pa from the old times, then CK
>> removed it (and I
>> was really happy about that disappearing), or am I confusing this
>> with something
>> else?
>>
>> (of course, this wasn't used for mminfra, but for something else!)
>>
> 
> I can't find any remove history in mtk-cmdq-mailbox.c.
> 
> Maybe you mean the patch in this series?
> https://lore.kernel.org/all/171213938049.123698.15573779837703602591.b4-ty@collabora.com/
> 

Uhm, I think I may have confused something here, but yes I was remembering the
patch series that you pointed out, definitely.

At the end, that series is doing something else, so nevermind, was just confusion.

>>> +
>>> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t
>>> addr)
>>> +{
>>> +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
>>> mbox);
>>> +
>>> +     if (cmdq->pdata->mminfra_offset == 0)
>>> +             return false;
>>> +
>>> +     /*
>>> +      * mminfra will recognize the addr that greater than the
>>> mminfra_offset
>>> +      * as a transaction to DRAM.
>>> +      * So the caller needs to append mminfra_offset for the true
>>> case.
>>> +      */
>>> +     return (addr >= cmdq->pdata->mminfra_offset);
>>
>>
>> /**
>>    * cmdq_is_mminfra_gce() - Brief description
>>    * @args.....
>>    *
>>    * The MMINFRA GCE will recognize an address greater than DRAM
>> iostart as a
>>    * DRAM transaction instead of ....xyz
>>    *
>>    * In order for callers to perform (xyz) transactions through the
>> CMDQ, those
>>    * need to know if they are using a GCE located in MMINFRA.
>>    */
>> bool cmdq_is_mminfra_gce(...)
>> {
>>          return cmdq->pdata->mminfra_offset &&
>>                 (addr >= cmdq->pdata->mminfra_offset)
>>
>>> +}
>>> +EXPORT_SYMBOL(cmdq_addr_need_offset);
>>> +
>>
> 
> OK, I'll modify the API like this.
> 
>> ...but then, is there really no way of just handling the GCE being in
>> MMINFRA
>> transparently from the callers? Do the callers really *need* to know
>> that they're
>> using a new GCE?!
>>
> 
> Since the address subtracting is done in MMINFRA hardware, I think GCE
> users really need to handle it in driver.
> 

Since the users of this infrastructure are multimedia related (disp/MDP3),
I'd also like to get an opinion from MediaTek engineers familiar with that.

CK, Moudy, any opinion on that, please?

>> Another way of saying: can't we just handle the address translation
>> in here instead
>> of instructing each and every driver about how to communicate with
>> the new GCE?!
>>
> 
> The DRAM address may not only be the command buffer to GCE, but also
> the working buffer provided by CMDQ users and being a part of GCE
> instruction, so we need to handle the address translation in CMDQ
> helper driver for the instruction generation.
> E.g. ISP drivers may use GCE to write a hardware settings to a DRAM as
> backup buffer. The GCE write instruction will be:
> WRITE the value of ISP register to DRAM address + mminfra_offset.
> 
> But most of the CMDQ users only need to use GCE to write hardware
> register, so I only keep the translation in cmdq_pkt_mem_move(),
> cmdq_pkt_poll_addr() and cmdq_pkt_jump_abs() at the latest series.

Yeah you're choosing the best of both worlds in that case, I do agree, but
still - if there's a way to avoid drivers to have different handling for
mminfra vs no-mminfra, that'd still be preferred.

Having the handling for something *centralized* somewhere, instead of it
being sparse here and there, would make maintenance way easier...

...and that's why I'm asking for CK and Moudy's opinion, nothing else :-)

Cheers!
Angelo



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-03-05 11:03       ` AngeloGioacchino Del Regno
@ 2025-03-05 15:58         ` Jason-JH Lin (林睿祥)
  2025-03-05 17:40           ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-05 15:58 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-mediatek@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	Moudy Ho (何宗原), devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, fshao@chromium.org,
	Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Wed, 2025-03-05 at 12:03 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 05/03/25 10:46, Jason-JH Lin (林睿祥) ha scritto:
> > On Tue, 2025-03-04 at 10:35 +0100, AngeloGioacchino Del Regno
> > wrote:
> > > 
> > > External email : Please do not click links or open attachments
> > > until
> > > you have verified the sender or the content.
> > > 
> > > 
> > > Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > > > When GCE executes instructions, the corresponding hardware
> > > > register
> > > > can be found through the subsys ID. For hardware that does not
> > > > support
> > > > subsys ID, its subsys ID will be set to invalid value and its
> > > > physical
> > > > address needs to be used to generate GCE instructions.
> > > > 
> > > > This commit adds a pa_base parsing flow to the cmdq_client_reg
> > > > structure
> > > > for these unsupported subsys ID hardware.
> > > > 
> > > 
> > > Does this work only for the MMINFRA located GCEs, or does this
> > > work
> > > also for
> > > the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?
> > > 
> > > In order to actually review and decide, I do need to know :-)
> > > 
> > 
> > Yes, it's for the SoCs without subsys ID, it's not related to the
> > MMINFRA.
> > 
> > This can also work on MT8173/83/92/95 // MT6795/6893/etc.
> > You can remove the `mediatek,gce-client-reg` properties in their
> > dtsi
> > and cherry-pick this series to verify it. :-)
> > 
> 
> This is curious - and that brings more questions to the table (for
> curiosity
> more than anything else at this point).
> 
> Since this is a way to make use of the CMDQ for address ranges that
> are not tied
> to any subsys id (hence no gce-client-reg and just physical address
> parsing for
> generating instructions), do you know what are the performance
> implications of
> using this, instead of subsys IDs on SoCs that do support them?
> 

The main advantage of using subsys ID is to reduce the number of
instruction.
Without subsy ID, you will need one more `ASSIGN` instruction to assign
the high bytes of the physical address.

E,g. In mt8195-gce.h: #define SUBSYS_1c00XXXX 3

If you want GCE to write the value 0x0000000f to 0x1c00_002c.

With subsys ID, you can use only one instruction to achieve it:
1. WRITE value: 0x000000f to subsys: 0x3 + offset: 0x0002c
- OP code: WRTIE = 0x90
- subsys ID: 0x1c00XXXX = 0x03
- offset: 0x002c
- value: 0x0000000f

Without subsys ID, you will need 2 instructions to achieve it:
1. ASSIGN address high bytes: 0x1c00 to GCE temp register: SPR0
- OP code: LOGIC = 0xa0
- arg_type: register, value, value = (0x8)
- sub OP: ASSIGN = 0x0
- register index to store the assign value: SPR0 = 0x0
- value to assign: 0x1c00
2. WRITE value: 0x0000000f to temp register: SPR0 + offset:0x002c
- OP code: WRITE = 0x90
- sub OP(temp register index): SPR0 = 0x0
- offset for temp register: 0x002c
- value: 0x0000000f

> Being clear: if we were to migrate a SoC like MT8195 to using this
> globally
> instead of using subsys ids, would the performance be degraded?
> And if yes, do you know by how much?
> 

E,g. If the inst number with subsys ID is N.
1. If CMDQ is implement like this, then inst number will be (N * 2):
assign SPR0 = 0x1c00
write A to SPR0 + offset: 0x2c 
assign SPR0 = 0x1c00
write B to SPR0 + offset: 0x3c 
assign SPR0 = 0x1c00
write C to SPR0 + offset: 0x4c 
...

2. If CMDQ is implement like this, the inst number will be (N + 1 * n):
assign SPR0 = 0x1c00
write A to SPR0 + offset: 0x2c 
write B to SPR0 + offset: 0x3c
write C to SPR0 + offset: 0x4c
...

When the same cmd buffer changes the base address for n times:
assign SPR0 = 0x1c00
write A to SPR0 + offset: 0x2c
assign SPR0 = 0x1c01
write B to SPR0 + offset: 0x2c
assign SPR0 = 0x1c02
write C to SPR0 + offset: 0x2c
assign SPR0 = 0x1c00
write D to SPR0 + offset: 0x3c
...

So you can imagine the performance will increase, but maybe not too
much if we use it in the right way...
Except the old SoC that didn't support SPR and CPR. The reason will be
addressed in the next paragraph.

> What you're proposing almost looks like being too good to be true -
> and makes
> me wonder, at this point, why the subsys id was used in the first
> place :-)
> 

That's because of the old GCE version in the old SoC only support GPR,
it didn't support SPR and CPR.

GPR:
All 32 GCE threads share the same GPR0~GPR15, GPR will be affected by
other GCE threads if they use it at the same time.

SPR:
Each GCE thread has 4 SPR, SPR won't be affected by another GCE thread.

CPR:
All 32 GCE threads share the same CPR, there are over 1000 CPR can be
used. It need to be managed properly to avoid the resource conflicting.

Due to the GPR resource restriction in the old GCE version, the usage
of subsys ID can avoid GPR conflicting issues when multiple GCE threads
are using GPR to physical assign high bytes all the time.


I have simplified some complicate instruction rules, so the description
above may not be 100% matched to the CMDQ helper driver code.
But I think the main concept is correct.
Hope these explanation can help well :-)

Regards,
Jason-JH Lin

> Cheers!
> Angelo
> 
> > Regards,
> > Jason-JH Lin
> > 
> > > Thanks,
> > > Angelo
> > > 
> > > 
> > 
> 
> 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware
  2025-03-04  9:41   ` AngeloGioacchino Del Regno
@ 2025-03-05 16:12     ` Jason-JH Lin (林睿祥)
  2025-03-05 18:08       ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-05 16:12 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Tue, 2025-03-04 at 10:41 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > To support hardware without subsys IDs on new SoCs, add a
> > programming
> > flow that checks whether the subsys ID is valid. If the subsys ID
> > is
> > invalid, the flow will call 2 alternative CMDQ APIs:
> > cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
> > functionality.
> > 
> > Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> > ---
> >   drivers/soc/mediatek/mtk-mmsys.c | 14 +++++++++++---
> >   drivers/soc/mediatek/mtk-mutex.c | 11 +++++++++--
> >   2 files changed, 20 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-mmsys.c
> > b/drivers/soc/mediatek/mtk-mmsys.c
> > index bb4639ca0b8c..ce949b863b05 100644
> > --- a/drivers/soc/mediatek/mtk-mmsys.c
> > +++ b/drivers/soc/mediatek/mtk-mmsys.c
> > @@ -167,9 +167,17 @@ static void mtk_mmsys_update_bits(struct
> > mtk_mmsys *mmsys, u32 offset, u32 mask,
> >       u32 tmp;
> > 
> >       if (mmsys->cmdq_base.size && cmdq_pkt) {
> > -             ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys-
> > >cmdq_base.subsys,
> > -                                       mmsys->cmdq_base.offset +
> > offset, val,
> > -                                       mask);
> > +             offset += mmsys->cmdq_base.offset;
> > +             if (mmsys->cmdq_base.subsys != CMDQ_SUBSYS_INVALID) {
> 
> You're still anyway passing the subsys to cmdq_pkt_write_mask(),
> right?!
> Why don't you just handle this in cmdq_pkt_write_mask() then? ;-)
> 
> I can see this pattern being repeated over and over in both
> drm/mediatek and MDP3
> drivers, and it's not necessary to duplicate this many times when you
> can write it
> just once.
> 
> Would've also been faster for you to implement... :-D
> 

I think did it in the series V1:
https://patchwork.kernel.org/project/linux-mediatek/patch/20241121042602.32730-5-jason-jh.lin@mediatek.com/

Because it'll need to passing the base_pa and that will need to change
the interface for original APIs.

And CK think that's not a necessary to change the APIs. It can be done
by cmdq_pkt_assign() + cmdq_pkt_write_s_mask_value() in the client
drivers. Then you can see this pattern in everywhere. :-)

Regards,
Jason-JH Lin

> Cheers,
> Angelo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address
  2025-03-04  9:37   ` AngeloGioacchino Del Regno
@ 2025-03-05 16:26     ` Jason-JH Lin (林睿祥)
  0 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-05 16:26 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Tue, 2025-03-04 at 10:37 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > Since GCE has been moved to mminfra in MT8196, all transactions
> > from
> > mminfra to DRAM will have their addresses adjusted by subtracting a
> > mminfra offset.
> > This information should be handled inside the CMDQ driver, allowing
> > CMDQ users to call CMDQ APIs as usual.
> > 
> > Therefore, CMDQ driver needs to use the mbox API to get the
> > mminfra_offset value of the SoC, and then add it to the DRAM
> > address
> > when generating instructions to ensure GCE accesses the correct
> > DRAM
> > address.
> > 
> > Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> > ---
> >   drivers/soc/mediatek/mtk-cmdq-helper.c | 35
> > ++++++++++++++++++++++++--
> >   1 file changed, 33 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c
> > b/drivers/soc/mediatek/mtk-cmdq-helper.c
> > index aa9853100d78..f2853a74af01 100644
> > --- a/drivers/soc/mediatek/mtk-cmdq-helper.c
> > +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
> > @@ -314,10 +314,22 @@ EXPORT_SYMBOL(cmdq_pkt_write_s_mask_value);
> > 
> >   int cmdq_pkt_mem_move(struct cmdq_pkt *pkt, dma_addr_t src_addr,
> > dma_addr_t dst_addr)
> >   {
> > +     struct cmdq_client *cl = (struct cmdq_client *)pkt->cl;
> >       const u16 high_addr_reg_idx  = CMDQ_THR_SPR_IDX0;
> >       const u16 value_reg_idx = CMDQ_THR_SPR_IDX1;
> >       int ret;
> > 
> > +     if (!cl) {
> > +             pr_err("%s %d: pkt->cl is NULL!\n", __func__,
> > __LINE__);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (cmdq_addr_need_offset(cl->chan, src_addr))
> > +             src_addr += cmdq_get_offset_pa(cl->chan);
> 
> If the offset is just DRAM IOSTART, you could manage that differently
> in the cmdq
> helper as well as the cmdq mailbox... :-)
> 

The offset_pa is not DRAM IOSTART, it is the MMINFRA subtracting
offset.

CMDQ helper is used to generate the instruction to the command buffer.
Since this offset_pa is added for the PA put into the instruction, I
think adding the offset_pa here is more suitable than CMDQ mailbox.
Does that make sense? :-)

Regards,
Jason-JH Lin

> > +
> > +     if (cmdq_addr_need_offset(cl->chan, dst_addr))
> > +             dst_addr += cmdq_get_offset_pa(cl->chan);
> > +
> Cheers,
> Angelo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware
  2025-03-05 15:58         ` Jason-JH Lin (林睿祥)
@ 2025-03-05 17:40           ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-05 17:40 UTC (permalink / raw)
  To: Jason-JH Lin (林睿祥), robh@kernel.org,
	krzk+dt@kernel.org, conor+dt@kernel.org, mchehab@kernel.org,
	chunkuang.hu@kernel.org, jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-mediatek@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	Moudy Ho (何宗原), devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, fshao@chromium.org,
	Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Il 05/03/25 16:58, Jason-JH Lin (林睿祥) ha scritto:
> On Wed, 2025-03-05 at 12:03 +0100, AngeloGioacchino Del Regno wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>>
>>
>> Il 05/03/25 10:46, Jason-JH Lin (林睿祥) ha scritto:
>>> On Tue, 2025-03-04 at 10:35 +0100, AngeloGioacchino Del Regno
>>> wrote:
>>>>
>>>> External email : Please do not click links or open attachments
>>>> until
>>>> you have verified the sender or the content.
>>>>
>>>>
>>>> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
>>>>> When GCE executes instructions, the corresponding hardware
>>>>> register
>>>>> can be found through the subsys ID. For hardware that does not
>>>>> support
>>>>> subsys ID, its subsys ID will be set to invalid value and its
>>>>> physical
>>>>> address needs to be used to generate GCE instructions.
>>>>>
>>>>> This commit adds a pa_base parsing flow to the cmdq_client_reg
>>>>> structure
>>>>> for these unsupported subsys ID hardware.
>>>>>
>>>>
>>>> Does this work only for the MMINFRA located GCEs, or does this
>>>> work
>>>> also for
>>>> the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?
>>>>
>>>> In order to actually review and decide, I do need to know :-)
>>>>
>>>
>>> Yes, it's for the SoCs without subsys ID, it's not related to the
>>> MMINFRA.
>>>
>>> This can also work on MT8173/83/92/95 // MT6795/6893/etc.
>>> You can remove the `mediatek,gce-client-reg` properties in their
>>> dtsi
>>> and cherry-pick this series to verify it. :-)
>>>
>>
>> This is curious - and that brings more questions to the table (for
>> curiosity
>> more than anything else at this point).
>>
>> Since this is a way to make use of the CMDQ for address ranges that
>> are not tied
>> to any subsys id (hence no gce-client-reg and just physical address
>> parsing for
>> generating instructions), do you know what are the performance
>> implications of
>> using this, instead of subsys IDs on SoCs that do support them?
>>
> 
> The main advantage of using subsys ID is to reduce the number of
> instruction.
> Without subsy ID, you will need one more `ASSIGN` instruction to assign
> the high bytes of the physical address.
> 
> E,g. In mt8195-gce.h: #define SUBSYS_1c00XXXX 3
> 
> If you want GCE to write the value 0x0000000f to 0x1c00_002c.
> 
> With subsys ID, you can use only one instruction to achieve it:
> 1. WRITE value: 0x000000f to subsys: 0x3 + offset: 0x0002c
> - OP code: WRTIE = 0x90
> - subsys ID: 0x1c00XXXX = 0x03
> - offset: 0x002c
> - value: 0x0000000f
> 
> Without subsys ID, you will need 2 instructions to achieve it:
> 1. ASSIGN address high bytes: 0x1c00 to GCE temp register: SPR0
> - OP code: LOGIC = 0xa0
> - arg_type: register, value, value = (0x8)
> - sub OP: ASSIGN = 0x0
> - register index to store the assign value: SPR0 = 0x0
> - value to assign: 0x1c00
> 2. WRITE value: 0x0000000f to temp register: SPR0 + offset:0x002c
> - OP code: WRITE = 0x90
> - sub OP(temp register index): SPR0 = 0x0
> - offset for temp register: 0x002c
> - value: 0x0000000f
> 
>> Being clear: if we were to migrate a SoC like MT8195 to using this
>> globally
>> instead of using subsys ids, would the performance be degraded?
>> And if yes, do you know by how much?
>>
> 
> E,g. If the inst number with subsys ID is N.
> 1. If CMDQ is implement like this, then inst number will be (N * 2):
> assign SPR0 = 0x1c00
> write A to SPR0 + offset: 0x2c
> assign SPR0 = 0x1c00
> write B to SPR0 + offset: 0x3c
> assign SPR0 = 0x1c00
> write C to SPR0 + offset: 0x4c
> ...
> 
> 2. If CMDQ is implement like this, the inst number will be (N + 1 * n):
> assign SPR0 = 0x1c00
> write A to SPR0 + offset: 0x2c
> write B to SPR0 + offset: 0x3c
> write C to SPR0 + offset: 0x4c
> ...
> 
> When the same cmd buffer changes the base address for n times:
> assign SPR0 = 0x1c00
> write A to SPR0 + offset: 0x2c
> assign SPR0 = 0x1c01
> write B to SPR0 + offset: 0x2c
> assign SPR0 = 0x1c02
> write C to SPR0 + offset: 0x2c
> assign SPR0 = 0x1c00
> write D to SPR0 + offset: 0x3c
> ...
> 
> So you can imagine the performance will increase, but maybe not too
> much if we use it in the right way...
> Except the old SoC that didn't support SPR and CPR. The reason will be
> addressed in the next paragraph.
> 
>> What you're proposing almost looks like being too good to be true -
>> and makes
>> me wonder, at this point, why the subsys id was used in the first
>> place :-)
>>
> 
> That's because of the old GCE version in the old SoC only support GPR,
> it didn't support SPR and CPR.
> 
> GPR:
> All 32 GCE threads share the same GPR0~GPR15, GPR will be affected by
> other GCE threads if they use it at the same time.
> 
> SPR:
> Each GCE thread has 4 SPR, SPR won't be affected by another GCE thread.
> 
> CPR:
> All 32 GCE threads share the same CPR, there are over 1000 CPR can be
> used. It need to be managed properly to avoid the resource conflicting.
> 
> Due to the GPR resource restriction in the old GCE version, the usage
> of subsys ID can avoid GPR conflicting issues when multiple GCE threads
> are using GPR to physical assign high bytes all the time.
> 
> 
> I have simplified some complicate instruction rules, so the description
> above may not be 100% matched to the CMDQ helper driver code.
> But I think the main concept is correct.
> Hope these explanation can help well :-)
> 

Jason, that's simply wonderful. Thanks a lot for this precious description.

Yes, this has clarified even more than I was asking for, and besides,
yeah I know that there are some rules to follow, some of which I know,
some of which I imagine - and describing all of that would need lots
and lots of text - again, I know, and no worries about that! :-D

Thanks again!
Angelo

> Regards,
> Jason-JH Lin
> 
>> Cheers!
>> Angelo
>>
>>> Regards,
>>> Jason-JH Lin
>>>
>>>> Thanks,
>>>> Angelo
>>>>
>>>>
>>>
>>
>>
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware
  2025-03-05 16:12     ` Jason-JH Lin (林睿祥)
@ 2025-03-05 18:08       ` AngeloGioacchino Del Regno
  2025-03-06 11:05         ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-05 18:08 UTC (permalink / raw)
  To: Jason-JH Lin (林睿祥), robh@kernel.org,
	krzk+dt@kernel.org, conor+dt@kernel.org, mchehab@kernel.org,
	chunkuang.hu@kernel.org, jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Il 05/03/25 17:12, Jason-JH Lin (林睿祥) ha scritto:
> On Tue, 2025-03-04 at 10:41 +0100, AngeloGioacchino Del Regno wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>>
>>
>> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
>>> To support hardware without subsys IDs on new SoCs, add a
>>> programming
>>> flow that checks whether the subsys ID is valid. If the subsys ID
>>> is
>>> invalid, the flow will call 2 alternative CMDQ APIs:
>>> cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
>>> functionality.
>>>
>>> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
>>> ---
>>>    drivers/soc/mediatek/mtk-mmsys.c | 14 +++++++++++---
>>>    drivers/soc/mediatek/mtk-mutex.c | 11 +++++++++--
>>>    2 files changed, 20 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/soc/mediatek/mtk-mmsys.c
>>> b/drivers/soc/mediatek/mtk-mmsys.c
>>> index bb4639ca0b8c..ce949b863b05 100644
>>> --- a/drivers/soc/mediatek/mtk-mmsys.c
>>> +++ b/drivers/soc/mediatek/mtk-mmsys.c
>>> @@ -167,9 +167,17 @@ static void mtk_mmsys_update_bits(struct
>>> mtk_mmsys *mmsys, u32 offset, u32 mask,
>>>        u32 tmp;
>>>
>>>        if (mmsys->cmdq_base.size && cmdq_pkt) {
>>> -             ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys-
>>>> cmdq_base.subsys,
>>> -                                       mmsys->cmdq_base.offset +
>>> offset, val,
>>> -                                       mask);
>>> +             offset += mmsys->cmdq_base.offset;
>>> +             if (mmsys->cmdq_base.subsys != CMDQ_SUBSYS_INVALID) {
>>
>> You're still anyway passing the subsys to cmdq_pkt_write_mask(),
>> right?!
>> Why don't you just handle this in cmdq_pkt_write_mask() then? ;-)
>>
>> I can see this pattern being repeated over and over in both
>> drm/mediatek and MDP3
>> drivers, and it's not necessary to duplicate this many times when you
>> can write it
>> just once.
>>
>> Would've also been faster for you to implement... :-D
>>
> 
> I think did it in the series V1:
> https://patchwork.kernel.org/project/linux-mediatek/patch/20241121042602.32730-5-jason-jh.lin@mediatek.com/
> 
> Because it'll need to passing the base_pa and that will need to change
> the interface for original APIs.
> 
> And CK think that's not a necessary to change the APIs. It can be done
> by cmdq_pkt_assign() + cmdq_pkt_write_s_mask_value() in the client
> drivers. Then you can see this pattern in everywhere. :-)
> 

Using likely(x) and unlikely(x) should be avoided, really, unless it's something
that is really really really really ... really ... rea.... likely or unlikely :-)

Btw. Changing the APIs is a bit difficult, but I disagree with CK about not
"inventing" a new API for the unsupported-subsys flow.

It's true, it is not *strictly* needed to add a function, but it's good for any
kind of future maintainability - as I explained, it's easier then to fix a problem
if there's one.... and well, I can see that you agree with me, because effectively
you did it the first time :-)

CK mentioned using cmdq_pkt_write() *or* cmdq_pkt_assignwrite/cmdq_pkt_write_pa()
(however you wanna call it, it's fine for me), in drivers that know that there
always is or there always isn't a subsys ID: that's a good suggestion, as this can
be eventually done with assigning a function pointer, so, no conditionals at each
operation.

My point of view, finally, is:
  - This is just another way of doing cmdq_pkt_write()
    - This, at the end of the day, does exactly what cmdq_pkt_write() is doing,
      except it's doing it with two instructions instead of one;
  - The same thing can be done in two different ways (depending on SoC)
    - This same thing should have a function that does it.

A function that does it can be

int cmdq_pkt_write_pa(struct cmdq_pkt *pkt, u8 subsys /*unused*/, u32 pa_base, u16 
offset, u32 value)
{
	err = cmdq_pkt_assign(pkt, 0, CMDQ_ADDR_HIGH(pa_base));
	if (err < 0)
		return err;

	return cmdq_pkt_write_s_value( .... etc)
}

int cmdq_pkt_write() <--- unchanged, scheduled for removal after all drivers migrated

int cmdq_pkt_write_subsys(struct cmdq_pkt *pkt, u8 subsys, u32 pa_base /*unused*/, 
u16 offset, u32 value)
{
	/* This function will get the contents of cmdq_pkt_write once removed,
            but, in the meanwhile, to avoid duplication we just call that: */

	return cmdq_pkt_write(pkt, subsys, offset, value);
}

- Are we adding one more function parameter? Yes
- Is this impacting performance overall? Not really

After all, we're living in an ARMv8 (actually, ARMv9 for new ones) world, so
one more function param won't hurt anyone.

I think that's the best of both worlds, and makes everyone happy.
Are you happy with that? :-)

Cheers,
Angelo



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/8] drm/mediatek: Add programming flow for unsupported subsys ID hardware
  2025-02-18  5:41 ` [PATCH v4 7/8] drm/mediatek: " Jason-JH Lin
@ 2025-03-06  9:47   ` AngeloGioacchino Del Regno
  2025-03-06 11:10     ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-06  9:47 UTC (permalink / raw)
  To: Jason-JH Lin, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jassi Brar, Chun-Kuang Hu, Mauro Carvalho Chehab
  Cc: Matthias Brugger, Nancy Lin, Singo Chang, Moudy Ho, Xavier Chang,
	Xiandong Wang, Sirius Wang, Fei Shao, Pin-yen Lin,
	Project_Global_Chrome_Upstream_Group, devicetree, linux-kernel,
	dri-devel, linux-mediatek, linux-arm-kernel, linux-media, CK Hu

Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> To support hardware without subsys IDs on new SoCs, add a programming
> flow that checks whether the subsys ID is valid. If the subsys ID is
> invalid, the flow will call 2 alternative CMDQ APIs:
> cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
> functionality.
> 
> Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> Reviewed-by: CK Hu <ck.hu@mediatek.com>
> ---
>   drivers/gpu/drm/mediatek/mtk_ddp_comp.c | 33 ++++++++++++++++++++-----
>   1 file changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> index edc6417639e6..219d67735a54 100644
> --- a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> @@ -66,14 +66,37 @@ struct mtk_ddp_comp_dev {
>   	struct cmdq_client_reg cmdq_reg;
>   };
>   
> +#if IS_REACHABLE(CONFIG_MTK_CMDQ)
> +static void mtk_ddp_write_cmdq_pkt(struct cmdq_pkt *cmdq_pkt, struct cmdq_client_reg *cmdq_reg,
> +				   unsigned int offset, unsigned int value, unsigned int mask)
> +{
> +	offset += cmdq_reg->offset;
> +
> +	if (cmdq_reg->subsys != CMDQ_SUBSYS_INVALID) {
> +		if (mask == GENMASK(31, 0))
> +			cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys, offset, value);
> +		else
> +			cmdq_pkt_write_mask(cmdq_pkt, cmdq_reg->subsys, offset, value, mask);

Sorry but this is pointless, really...

Function cmdq_pkt_write_mask() in mtk-cmdq-helper is doing:

	if (mask != GENMASK(31, 0)) {
		err = cmdq_pkt_mask(pkt, mask);
		if (err < 0)
			return err;

		offset_mask |= CMDQ_WRITE_ENABLE_MASK;
	}
	return cmdq_pkt_write(pkt, subsys, offset_mask, value);

here you're doing the exact inverse check.

At this point you can just do:

static int mtk_ddp_write_cmdq_pkt(struct cmdq_pkt *cmdq_pkt, struct cmdq_client_reg 
*cmdq_reg,
				  u16 offset, u32 value, u32 mask)
{
	u16 gce_offset = cmdq_reg->offset + offset;

	if (cmdq_reg->subsys != CMDQ_SUBSYS_INVALID)
		return cmdq_pkt_write_mask(pkt, cmdq_reg->subsys, gce_offset, value, mask);

	return cmdq_pkt_write_mask_pa(cmdq_pkt, cmdq_reg->pa_base, gce_offset, value, mask);
}

Cheers,
Angelo

> +	} else {
> +		/* only MMIO access, no need to check mminfro_offset */
> +		cmdq_pkt_assign(cmdq_pkt, 0, CMDQ_ADDR_HIGH(cmdq_reg->pa_base));
> +		if (mask == GENMASK(31, 0))
> +			cmdq_pkt_write_s_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
> +					       CMDQ_ADDR_LOW(offset), value);
> +		else
> +			cmdq_pkt_write_s_mask_value(cmdq_pkt, CMDQ_THR_SPR_IDX0,
> +						    CMDQ_ADDR_LOW(offset), value, mask);
> +	}
> +}
> +#endif
> +
>   void mtk_ddp_write(struct cmdq_pkt *cmdq_pkt, unsigned int value,
>   		   struct cmdq_client_reg *cmdq_reg, void __iomem *regs,
>   		   unsigned int offset)
>   {
>   #if IS_REACHABLE(CONFIG_MTK_CMDQ)
>   	if (cmdq_pkt)
> -		cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys,
> -			       cmdq_reg->offset + offset, value);
> +		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, GENMASK(31, 0));
>   	else
>   #endif
>   		writel(value, regs + offset);
> @@ -85,8 +108,7 @@ void mtk_ddp_write_relaxed(struct cmdq_pkt *cmdq_pkt, unsigned int value,
>   {
>   #if IS_REACHABLE(CONFIG_MTK_CMDQ)
>   	if (cmdq_pkt)
> -		cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys,
> -			       cmdq_reg->offset + offset, value);
> +		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, GENMASK(31, 0));
>   	else
>   #endif
>   		writel_relaxed(value, regs + offset);
> @@ -98,8 +120,7 @@ void mtk_ddp_write_mask(struct cmdq_pkt *cmdq_pkt, unsigned int value,
>   {
>   #if IS_REACHABLE(CONFIG_MTK_CMDQ)
>   	if (cmdq_pkt) {
> -		cmdq_pkt_write_mask(cmdq_pkt, cmdq_reg->subsys,
> -				    cmdq_reg->offset + offset, value, mask);
> +		mtk_ddp_write_cmdq_pkt(cmdq_pkt, cmdq_reg, offset, value, mask);
>   	} else {
>   #endif
>   		u32 tmp = readl(regs + offset);



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-03-05 12:05       ` AngeloGioacchino Del Regno
@ 2025-03-06 11:00         ` Jason-JH Lin (林睿祥)
  2025-03-06 13:08           ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-06 11:00 UTC (permalink / raw)
  To: AngeloGioacchino Del Regno, chunkuang.hu@kernel.org,
	Moudy Ho (何宗原)
  Cc: conor+dt@kernel.org, robh@kernel.org, linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, krzk+dt@kernel.org, jassisinghbrar@gmail.com,
	Singo Chang (張興國), mchehab@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

[snip]
> 
> > CPR_GSIZE is the setting for allocating the CPR SRAM size to each
> > VM.
> 
> Would be awesome if you could then clarify the comment that you have
> later in
> the code here, from...
> 
> /* config cpr size for host vm */
> 
> to
> 
> /* Set the amount of CPR SRAM to allocate to each VM */
> 
> ...that could be a way of more properly describing what the writel
> there is doing.
> 

OK, I'll change it.

> > 
> > > The GCE stuff isn't even properly described in datasheets - I do
> > > (probably!)
> > > understand what those are for, but asking people to get years of
> > > experience on
> > > MediaTek to understand what's going on would be a bit rude,
> > > wouldn't
> > > it? :-D
> > > 
> > 
> > I agree with you :-)
> > I'll put them in the VM patch and add some brief description for
> > them.
> > 
> 
> Thanks, much appreciated!
> 
> > > > +
> > > >    #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
> > > >    #define CMDQ_THR_ENABLED            0x1
> > > >    #define CMDQ_THR_DISABLED           0x0
> > > > @@ -87,11 +98,24 @@ struct cmdq {
> > > >    struct gce_plat {
> > > >        u32 thread_nr;
> > > >        u8 shift;
> > > > +     dma_addr_t mminfra_offset;
> > > 
> > > It looks like this is exactly the DRAM's iostart... at least, I
> > > can
> > > see that in the
> > > downstream devicetree that's where it starts.
> > > 
> > >          memory: memory@80000000 {
> > >                  device_type = "memory";
> > >                  reg = <0 0x80000000 0 0x40000000>;
> > >          };
> > > 
> > > It doesn't really look like being a coincidence, but, for the
> > > sake of
> > > asking:
> > > is this just a coincidence? :-)
> > > 
> > 
> > As the confirmation with the hardware designer in previous reply
> > mail
> > for CK:
> > https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@mediatek.com/*26258463
> > 
> 
> That explanation was simply wonderful.
> 
> > Since the MMINFRA remap subtracting 2G is done in the hardware
> > circuit
> > and cannot be configured by software, the address +2G adjustment is
> > necessary to implement in the CMDQ driver.
> > 
> > So that might not be a coincidence.
> > But even if DRAM start address changes, this mminfra_offset is
> > still
> > subtracting 2G, so I think it is a better choice to define it as
> > the
> > driver data for MT8196.
> > 
> 
> ....so, this makes me think the following:
> 
> 1. The DRAM start address cannot *ever* be less than 2G, because
> otherwise the
>     MMINFRA HW would have a hole in the usable address range;
>     1a. If the start address changes to less than 2G, then also the
> IOMMU would
>         get limitations, not only the mminfra..!
>     2b. This makes it very very very unlikely for the start address
> to be changed
>         to less than 0x80000000
> 
> 2. If the DRAM start address changes to be ABOVE 2G (so more than
> 0x80000000),
>     there would be no point for MMINFRA to start a "config path"
> write (or read)
>     in the SMMU DRAM block, would it? ;-)
> 

GCE is using IOMMU in MT8196, so all the address put into the GCE
instruction or GCE register for GCE access should be IOVA.

The DRAM start address is 2G(PA=0x80000000, IOVA=0x0) currently, so
when GCE want to access the IOVA=0x0, it will need to +2G into the
instruction, then the MMINFRA will see it as data path(IOVA > 2G) and
subtract 2G for that IOVA, so GCE can finally access the IOVA=0x0.

I'm not sure if I've misunderstood what you mean by ABOVE 2G. :-)
If DRAM start address is changed to 3G(PA=0xc0000000) the IOVA is still
0x0, so GCE still need to + 2G to make MMINFRA go to the data path.

But if you mean PA=0x80000000 and IOVA start address is 3G(0xc0000000),
then MMINFRA will go to the data path without GCE +2G.
However, MMINFRA will -2G when going to the data path and that will
cause GCE access the wrong IOVA.
So GCE still need to +2G no matter IOVA start address is already can
make MMINFRA go to the data path(IOVA > 2G).

We have already complained to our hardware designer that MMINFRA -2G
con not be changed, which will make software operation very
troublesome.
So in the next few generations of SoC will change this MMINFRA -2G to
software configurable. Then we can just make IOVA start address to 2G
without adding the mminfra_offset to the IOVA for GCE.

> I get it - if the DRAM moves up, MMINFRA is still at 2G because
> that's hard baked
> into the hardware, but I foresee that it'll be unlikely to see a
> platform changing
> the DRAM start address arbitrarily, getting out-of-sync with MMINFRA.
> 
> I propose to just get the address from the memory node for now, and
> to add a nice
> comment in the code that explains that "In at least MT8196, the
> MMINFRA hardware
> subtracts xyz etc etc" (and that explanation from the previous email
> is again
> wonderful and shall not be lost: either use that in the comment, or
> add it to
> the commit description, because it's really that good).
> 
> Should a new SoC appear in the future requiring an offset from the
> DRAM start
> address, we will think about how to make that work in the best
> possible way: in
> that case we could either reference something else to get the right
> address or
> we can just change this driver to just use the 2G offset statically
> for all.
> 
> What I'm trying to do here is to reduce the amount of changes that
> we'd need for
> adding new SoCs: since that 2G MMINFRA offset -> 2G DRAM start is not
> a coincidence
> I think that, should the DRAM start vary on new SoCs, the MMINFRA
> offset will
> follow the trend and vary with it.
> 
> So what I think is:
> 1. If I'm right, adding a new SoC (with different MMINFRA + DRAM
> offset) will be
>     as easy as adding a compatible string in the bindings, no effort
> in changing
>     this driver with new pdata offsets;
> 2. If I'm wrong, adding a new SoC means adding compat string and
> adding pdata and
>     one variable in the cmdq struct.
> 
> Where N.2 is what we would do anyway if we don't go with my proposed
> solution...
> 
> All this is just to give you my considerations about this topic -
> you're left
> completely free to disagree with me.
> If you disagree, I will trust your judgement, no problem here.
> 

Yes, I think your are right. No matter the IOVA start address changing,
MMINFRA will still -2G(the start address of DRAM PA).
Do you mean we can get the mminfra_offset from the start address of
memory in DTS, rather than defining it in pdata?

> > > >        bool control_by_sw;
> > > >        bool sw_ddr_en;
> > > > +     bool gce_vm;
> > > > +     u32 dma_mask_bit;
> > > >        u32 gce_num;
> > > >    };
> > > > 
> > > > +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const
> > > > struct gce_plat *pdata)
> > > > +{
> > > > +     return ((addr + pdata->mminfra_offset) >> pdata->shift);
> > > > +}
> > > > +
> > > > +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const
> > > > struct gce_plat *pdata)
> > > > +{
> > > > +     return ((addr << pdata->shift) - pdata->mminfra_offset);
> > > > +}
> > > 
> > > I'm not sure that you really need those two functions... probably
> > > it's simply
> > > cleaner and easier to just write that single line every time...
> > > and
> > > I'm
> > > saying that especially for how you're using those functions, with
> > > some readl()
> > > passed directly as param, decreasing human readability by "a
> > > whole
> > > lot" :-)
> > > 
> > 
> > The reason why I use API wrapper instead of writing it directly in
> > readl() is to avoid missing the shift or mminfra_offset conversion
> > in
> > some places.
> > This problem is not easy to debug, and I have encountered it at
> > least
> > twice...
> > 
> > I think the advantage of using function is that it can be uniformly
> > modified to all places that need to handle DRAM address conversion.
> > What do you think? :-)
> > 
> 
> Eh, if you put it like that... it makes sense, so.. yeah, okay :-)
> 
> Still, please cleanup those instances of
> 
> `cmdq_reg_revert_addr(readl(something), pdata)`
> 
> those might be hard to read, so please just do something like:
> 
> regval = readl(something);
> curr_pa = cmdq_revert_addr(regval, pdata);
> 
> ...reword to your own liking, of course.
> 

OK, I'll refine that. Thanks.

> > > > +
> > > >    static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool
> > > > enable)
> > > >    {
> > > >        WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq-
> > > > >clocks));
> > > > @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan
> > > > *chan)
> > > >    }
> > > >    EXPORT_SYMBOL(cmdq_get_shift_pa);
> > > > 
> > > > +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
> > > > +{
> > > > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > > > mbox);
> > > > +
> > > > +     return cmdq->pdata->mminfra_offset;
> > > > +}
> > > > +EXPORT_SYMBOL(cmdq_get_offset_pa);
> > > 
> > > I think I remember this get_offset_pa from the old times, then CK
> > > removed it (and I
> > > was really happy about that disappearing), or am I confusing this
> > > with something
> > > else?
> > > 
> > > (of course, this wasn't used for mminfra, but for something
> > > else!)
> > > 
> > 
> > I can't find any remove history in mtk-cmdq-mailbox.c.
> > 
> > Maybe you mean the patch in this series?
> > https://lore.kernel.org/all/171213938049.123698.15573779837703602591.b4-ty@collabora.com/
> > 
> 
> Uhm, I think I may have confused something here, but yes I was
> remembering the
> patch series that you pointed out, definitely.
> 
> At the end, that series is doing something else, so nevermind, was
> just confusion.
> 

OK, no problem.

> > > > +
> > > > +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t
> > > > addr)
> > > > +{
> > > > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > > > mbox);
> > > > +
> > > > +     if (cmdq->pdata->mminfra_offset == 0)
> > > > +             return false;
> > > > +
> > > > +     /*
> > > > +      * mminfra will recognize the addr that greater than the
> > > > mminfra_offset
> > > > +      * as a transaction to DRAM.
> > > > +      * So the caller needs to append mminfra_offset for the
> > > > true
> > > > case.
> > > > +      */
> > > > +     return (addr >= cmdq->pdata->mminfra_offset);
> > > 
> > > 
> > > /**
> > >    * cmdq_is_mminfra_gce() - Brief description
> > >    * @args.....
> > >    *
> > >    * The MMINFRA GCE will recognize an address greater than DRAM
> > > iostart as a
> > >    * DRAM transaction instead of ....xyz
> > >    *
> > >    * In order for callers to perform (xyz) transactions through
> > > the
> > > CMDQ, those
> > >    * need to know if they are using a GCE located in MMINFRA.
> > >    */
> > > bool cmdq_is_mminfra_gce(...)
> > > {
> > >          return cmdq->pdata->mminfra_offset &&
> > >                 (addr >= cmdq->pdata->mminfra_offset)
> > > 
> > > > +}
> > > > +EXPORT_SYMBOL(cmdq_addr_need_offset);
> > > > +
> > > 
> > 
> > OK, I'll modify the API like this.
> > 
> > > ...but then, is there really no way of just handling the GCE
> > > being in
> > > MMINFRA
> > > transparently from the callers? Do the callers really *need* to
> > > know
> > > that they're
> > > using a new GCE?!
> > > 
> > 
> > Since the address subtracting is done in MMINFRA hardware, I think
> > GCE
> > users really need to handle it in driver.
> > 
> 
> Since the users of this infrastructure are multimedia related
> (disp/MDP3),
> I'd also like to get an opinion from MediaTek engineers familiar with
> that.
> 
> CK, Moudy, any opinion on that, please?
> 
> > > Another way of saying: can't we just handle the address
> > > translation
> > > in here instead
> > > of instructing each and every driver about how to communicate
> > > with
> > > the new GCE?!
> > > 
> > 
> > The DRAM address may not only be the command buffer to GCE, but
> > also
> > the working buffer provided by CMDQ users and being a part of GCE
> > instruction, so we need to handle the address translation in CMDQ
> > helper driver for the instruction generation.
> > E.g. ISP drivers may use GCE to write a hardware settings to a DRAM
> > as
> > backup buffer. The GCE write instruction will be:
> > WRITE the value of ISP register to DRAM address + mminfra_offset.
> > 
> > But most of the CMDQ users only need to use GCE to write hardware
> > register, so I only keep the translation in cmdq_pkt_mem_move(),
> > cmdq_pkt_poll_addr() and cmdq_pkt_jump_abs() at the latest series.
> 
> Yeah you're choosing the best of both worlds in that case, I do
> agree, but
> still - if there's a way to avoid drivers to have different handling
> for
> mminfra vs no-mminfra, that'd still be preferred.
> 
> Having the handling for something *centralized* somewhere, instead of
> it
> being sparse here and there, would make maintenance way easier...
> 
> ...and that's why I'm asking for CK and Moudy's opinion, nothing else
> :-)
> 

Yes, I totally agree with you. Thanks for the asking!

Regards,
Jason-JH.Lin 

> Cheers!
> Angelo
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware
  2025-03-05 18:08       ` AngeloGioacchino Del Regno
@ 2025-03-06 11:05         ` Jason-JH Lin (林睿祥)
  0 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-06 11:05 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-mediatek@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	Moudy Ho (何宗原), devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, fshao@chromium.org,
	Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Wed, 2025-03-05 at 19:08 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 05/03/25 17:12, Jason-JH Lin (林睿祥) ha scritto:
> > On Tue, 2025-03-04 at 10:41 +0100, AngeloGioacchino Del Regno
> > wrote:
> > > 
> > > External email : Please do not click links or open attachments
> > > until
> > > you have verified the sender or the content.
> > > 
> > > 
> > > Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > > > To support hardware without subsys IDs on new SoCs, add a
> > > > programming
> > > > flow that checks whether the subsys ID is valid. If the subsys
> > > > ID
> > > > is
> > > > invalid, the flow will call 2 alternative CMDQ APIs:
> > > > cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the
> > > > same
> > > > functionality.
> > > > 
> > > > Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> > > > ---
> > > >    drivers/soc/mediatek/mtk-mmsys.c | 14 +++++++++++---
> > > >    drivers/soc/mediatek/mtk-mutex.c | 11 +++++++++--
> > > >    2 files changed, 20 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/soc/mediatek/mtk-mmsys.c
> > > > b/drivers/soc/mediatek/mtk-mmsys.c
> > > > index bb4639ca0b8c..ce949b863b05 100644
> > > > --- a/drivers/soc/mediatek/mtk-mmsys.c
> > > > +++ b/drivers/soc/mediatek/mtk-mmsys.c
> > > > @@ -167,9 +167,17 @@ static void mtk_mmsys_update_bits(struct
> > > > mtk_mmsys *mmsys, u32 offset, u32 mask,
> > > >        u32 tmp;
> > > > 
> > > >        if (mmsys->cmdq_base.size && cmdq_pkt) {
> > > > -             ret = cmdq_pkt_write_mask(cmdq_pkt, mmsys-
> > > > > cmdq_base.subsys,
> > > > -                                       mmsys->cmdq_base.offset
> > > > +
> > > > offset, val,
> > > > -                                       mask);
> > > > +             offset += mmsys->cmdq_base.offset;
> > > > +             if (mmsys->cmdq_base.subsys !=
> > > > CMDQ_SUBSYS_INVALID) {
> > > 
> > > You're still anyway passing the subsys to cmdq_pkt_write_mask(),
> > > right?!
> > > Why don't you just handle this in cmdq_pkt_write_mask() then? ;-)
> > > 
> > > I can see this pattern being repeated over and over in both
> > > drm/mediatek and MDP3
> > > drivers, and it's not necessary to duplicate this many times when
> > > you
> > > can write it
> > > just once.
> > > 
> > > Would've also been faster for you to implement... :-D
> > > 
> > 
> > I think did it in the series V1:
> > https://patchwork.kernel.org/project/linux-mediatek/patch/20241121042602.32730-5-jason-jh.lin@mediatek.com
> > 
> > Because it'll need to passing the base_pa and that will need to
> > change
> > the interface for original APIs.
> > 
> > And CK think that's not a necessary to change the APIs. It can be
> > done
> > by cmdq_pkt_assign() + cmdq_pkt_write_s_mask_value() in the client
> > drivers. Then you can see this pattern in everywhere. :-)
> > 
> 
> Using likely(x) and unlikely(x) should be avoided, really, unless
> it's something
> that is really really really really ... really ... rea.... likely or
> unlikely :-)
> 
> Btw. Changing the APIs is a bit difficult, but I disagree with CK
> about not
> "inventing" a new API for the unsupported-subsys flow.
> 
> It's true, it is not *strictly* needed to add a function, but it's
> good for any
> kind of future maintainability - as I explained, it's easier then to
> fix a problem
> if there's one.... and well, I can see that you agree with me,
> because effectively
> you did it the first time :-)
> 
> CK mentioned using cmdq_pkt_write() *or*
> cmdq_pkt_assignwrite/cmdq_pkt_write_pa()
> (however you wanna call it, it's fine for me), in drivers that know
> that there
> always is or there always isn't a subsys ID: that's a good
> suggestion, as this can
> be eventually done with assigning a function pointer, so, no
> conditionals at each
> operation.
> 
> My point of view, finally, is:
>   - This is just another way of doing cmdq_pkt_write()
>     - This, at the end of the day, does exactly what cmdq_pkt_write()
> is doing,
>       except it's doing it with two instructions instead of one;
>   - The same thing can be done in two different ways (depending on
> SoC)
>     - This same thing should have a function that does it.
> 
> A function that does it can be
> 
> int cmdq_pkt_write_pa(struct cmdq_pkt *pkt, u8 subsys /*unused*/, u32
> pa_base, u16
> offset, u32 value)
> {
>         err = cmdq_pkt_assign(pkt, 0, CMDQ_ADDR_HIGH(pa_base));
>         if (err < 0)
>                 return err;
> 
>         return cmdq_pkt_write_s_value( .... etc)
> }
> 
> int cmdq_pkt_write() <--- unchanged, scheduled for removal after all
> drivers migrated
> 
> int cmdq_pkt_write_subsys(struct cmdq_pkt *pkt, u8 subsys, u32
> pa_base /*unused*/,
> u16 offset, u32 value)
> {
>         /* This function will get the contents of cmdq_pkt_write once
> removed,
>             but, in the meanwhile, to avoid duplication we just call
> that: */
> 
>         return cmdq_pkt_write(pkt, subsys, offset, value);
> }
> 
> - Are we adding one more function parameter? Yes
> - Is this impacting performance overall? Not really
> 
> After all, we're living in an ARMv8 (actually, ARMv9 for new ones)
> world, so
> one more function param won't hurt anyone.
> 
> I think that's the best of both worlds, and makes everyone happy.
> Are you happy with that? :-)
> 

Yes, I am happy with that. :-)
And thanks for your detail coding.

I'll change it in the next version.

regards,
Jason-JH Lin

> Cheers,
> Angelo
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/8] drm/mediatek: Add programming flow for unsupported subsys ID hardware
  2025-03-06  9:47   ` AngeloGioacchino Del Regno
@ 2025-03-06 11:10     ` Jason-JH Lin (林睿祥)
  0 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-06 11:10 UTC (permalink / raw)
  To: robh@kernel.org, krzk+dt@kernel.org, AngeloGioacchino Del Regno,
	conor+dt@kernel.org, mchehab@kernel.org, chunkuang.hu@kernel.org,
	jassisinghbrar@gmail.com
  Cc: CK Hu (胡俊光), linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Moudy Ho (何宗原),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

On Thu, 2025-03-06 at 10:47 +0100, AngeloGioacchino Del Regno wrote:
> 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Il 18/02/25 06:41, Jason-JH Lin ha scritto:
> > To support hardware without subsys IDs on new SoCs, add a
> > programming
> > flow that checks whether the subsys ID is valid. If the subsys ID
> > is
> > invalid, the flow will call 2 alternative CMDQ APIs:
> > cmdq_pkt_assign() and cmdq_pkt_write_s_value() to achieve the same
> > functionality.
> > 
> > Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
> > Reviewed-by: CK Hu <ck.hu@mediatek.com>
> > ---
> >   drivers/gpu/drm/mediatek/mtk_ddp_comp.c | 33
> > ++++++++++++++++++++-----
> >   1 file changed, 27 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> > b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> > index edc6417639e6..219d67735a54 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
> > @@ -66,14 +66,37 @@ struct mtk_ddp_comp_dev {
> >       struct cmdq_client_reg cmdq_reg;
> >   };
> > 
> > +#if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > +static void mtk_ddp_write_cmdq_pkt(struct cmdq_pkt *cmdq_pkt,
> > struct cmdq_client_reg *cmdq_reg,
> > +                                unsigned int offset, unsigned int
> > value, unsigned int mask)
> > +{
> > +     offset += cmdq_reg->offset;
> > +
> > +     if (cmdq_reg->subsys != CMDQ_SUBSYS_INVALID) {
> > +             if (mask == GENMASK(31, 0))
> > +                     cmdq_pkt_write(cmdq_pkt, cmdq_reg->subsys,
> > offset, value);
> > +             else
> > +                     cmdq_pkt_write_mask(cmdq_pkt, cmdq_reg-
> > >subsys, offset, value, mask);
> 
> Sorry but this is pointless, really...
> 
> Function cmdq_pkt_write_mask() in mtk-cmdq-helper is doing:
> 
>         if (mask != GENMASK(31, 0)) {
>                 err = cmdq_pkt_mask(pkt, mask);
>                 if (err < 0)
>                         return err;
> 
>                 offset_mask |= CMDQ_WRITE_ENABLE_MASK;
>         }
>         return cmdq_pkt_write(pkt, subsys, offset_mask, value);
> 
> here you're doing the exact inverse check.
> 

Oh, I didn't notice that it was redundant. Thanks.

> At this point you can just do:
> 
> static int mtk_ddp_write_cmdq_pkt(struct cmdq_pkt *cmdq_pkt, struct
> cmdq_client_reg
> *cmdq_reg,
>                                   u16 offset, u32 value, u32 mask)
> {
>         u16 gce_offset = cmdq_reg->offset + offset;
> 
>         if (cmdq_reg->subsys != CMDQ_SUBSYS_INVALID)
>                 return cmdq_pkt_write_mask(pkt, cmdq_reg->subsys,
> gce_offset, value, mask);
> 
>         return cmdq_pkt_write_mask_pa(cmdq_pkt, cmdq_reg->pa_base,
> gce_offset, value, mask);
> }
> 

I'll change like this, Thanks!

Regards,
Jason-JH Lin

> Cheers,
> Angelo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-03-06 11:00         ` Jason-JH Lin (林睿祥)
@ 2025-03-06 13:08           ` AngeloGioacchino Del Regno
  2025-03-07  1:11             ` Jason-JH Lin (林睿祥)
  0 siblings, 1 reply; 31+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-03-06 13:08 UTC (permalink / raw)
  To: Jason-JH Lin (林睿祥), chunkuang.hu@kernel.org,
	Moudy Ho (何宗原)
  Cc: conor+dt@kernel.org, robh@kernel.org, linux-media@vger.kernel.org,
	Sirius Wang (王皓昱),
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	fshao@chromium.org, krzk+dt@kernel.org, jassisinghbrar@gmail.com,
	Singo Chang (張興國), mchehab@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

Il 06/03/25 12:00, Jason-JH Lin (林睿祥) ha scritto:
> [snip]
>>
>>> CPR_GSIZE is the setting for allocating the CPR SRAM size to each
>>> VM.
>>
>> Would be awesome if you could then clarify the comment that you have
>> later in
>> the code here, from...
>>
>> /* config cpr size for host vm */
>>
>> to
>>
>> /* Set the amount of CPR SRAM to allocate to each VM */
>>
>> ...that could be a way of more properly describing what the writel
>> there is doing.
>>
> 
> OK, I'll change it.
> 
>>>
>>>> The GCE stuff isn't even properly described in datasheets - I do
>>>> (probably!)
>>>> understand what those are for, but asking people to get years of
>>>> experience on
>>>> MediaTek to understand what's going on would be a bit rude,
>>>> wouldn't
>>>> it? :-D
>>>>
>>>
>>> I agree with you :-)
>>> I'll put them in the VM patch and add some brief description for
>>> them.
>>>
>>
>> Thanks, much appreciated!
>>
>>>>> +
>>>>>     #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
>>>>>     #define CMDQ_THR_ENABLED            0x1
>>>>>     #define CMDQ_THR_DISABLED           0x0
>>>>> @@ -87,11 +98,24 @@ struct cmdq {
>>>>>     struct gce_plat {
>>>>>         u32 thread_nr;
>>>>>         u8 shift;
>>>>> +     dma_addr_t mminfra_offset;
>>>>
>>>> It looks like this is exactly the DRAM's iostart... at least, I
>>>> can
>>>> see that in the
>>>> downstream devicetree that's where it starts.
>>>>
>>>>           memory: memory@80000000 {
>>>>                   device_type = "memory";
>>>>                   reg = <0 0x80000000 0 0x40000000>;
>>>>           };
>>>>
>>>> It doesn't really look like being a coincidence, but, for the
>>>> sake of
>>>> asking:
>>>> is this just a coincidence? :-)
>>>>
>>>
>>> As the confirmation with the hardware designer in previous reply
>>> mail
>>> for CK:
>>> https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@mediatek.com/*26258463
>>>
>>
>> That explanation was simply wonderful.
>>
>>> Since the MMINFRA remap subtracting 2G is done in the hardware
>>> circuit
>>> and cannot be configured by software, the address +2G adjustment is
>>> necessary to implement in the CMDQ driver.
>>>
>>> So that might not be a coincidence.
>>> But even if DRAM start address changes, this mminfra_offset is
>>> still
>>> subtracting 2G, so I think it is a better choice to define it as
>>> the
>>> driver data for MT8196.
>>>
>>
>> ....so, this makes me think the following:
>>
>> 1. The DRAM start address cannot *ever* be less than 2G, because
>> otherwise the
>>      MMINFRA HW would have a hole in the usable address range;
>>      1a. If the start address changes to less than 2G, then also the
>> IOMMU would
>>          get limitations, not only the mminfra..!
>>      2b. This makes it very very very unlikely for the start address
>> to be changed
>>          to less than 0x80000000
>>
>> 2. If the DRAM start address changes to be ABOVE 2G (so more than
>> 0x80000000),
>>      there would be no point for MMINFRA to start a "config path"
>> write (or read)
>>      in the SMMU DRAM block, would it? ;-)
>>
> 
> GCE is using IOMMU in MT8196, so all the address put into the GCE
> instruction or GCE register for GCE access should be IOVA.
> 
> The DRAM start address is 2G(PA=0x80000000, IOVA=0x0) currently, so
> when GCE want to access the IOVA=0x0, it will need to +2G into the
> instruction, then the MMINFRA will see it as data path(IOVA > 2G) and
> subtract 2G for that IOVA, so GCE can finally access the IOVA=0x0.
> 
> I'm not sure if I've misunderstood what you mean by ABOVE 2G. :-)
> If DRAM start address is changed to 3G(PA=0xc0000000) the IOVA is still
> 0x0, so GCE still need to + 2G to make MMINFRA go to the data path.
> 
> But if you mean PA=0x80000000 and IOVA start address is 3G(0xc0000000),
> then MMINFRA will go to the data path without GCE +2G.
> However, MMINFRA will -2G when going to the data path and that will
> cause GCE access the wrong IOVA.
> So GCE still need to +2G no matter IOVA start address is already can
> make MMINFRA go to the data path(IOVA > 2G).
> 
> We have already complained to our hardware designer that MMINFRA -2G
> con not be changed, which will make software operation very
> troublesome.
> So in the next few generations of SoC will change this MMINFRA -2G to
> software configurable. Then we can just make IOVA start address to 2G
> without adding the mminfra_offset to the IOVA for GCE.
> 

Okay now I got it, the reality is way worse than I was thinking... eww... :-(

>> I get it - if the DRAM moves up, MMINFRA is still at 2G because
>> that's hard baked
>> into the hardware, but I foresee that it'll be unlikely to see a
>> platform changing
>> the DRAM start address arbitrarily, getting out-of-sync with MMINFRA.
>>
>> I propose to just get the address from the memory node for now, and
>> to add a nice
>> comment in the code that explains that "In at least MT8196, the
>> MMINFRA hardware
>> subtracts xyz etc etc" (and that explanation from the previous email
>> is again
>> wonderful and shall not be lost: either use that in the comment, or
>> add it to
>> the commit description, because it's really that good).
>>
>> Should a new SoC appear in the future requiring an offset from the
>> DRAM start
>> address, we will think about how to make that work in the best
>> possible way: in
>> that case we could either reference something else to get the right
>> address or
>> we can just change this driver to just use the 2G offset statically
>> for all.
>>
>> What I'm trying to do here is to reduce the amount of changes that
>> we'd need for
>> adding new SoCs: since that 2G MMINFRA offset -> 2G DRAM start is not
>> a coincidence
>> I think that, should the DRAM start vary on new SoCs, the MMINFRA
>> offset will
>> follow the trend and vary with it.
>>
>> So what I think is:
>> 1. If I'm right, adding a new SoC (with different MMINFRA + DRAM
>> offset) will be
>>      as easy as adding a compatible string in the bindings, no effort
>> in changing
>>      this driver with new pdata offsets;
>> 2. If I'm wrong, adding a new SoC means adding compat string and
>> adding pdata and
>>      one variable in the cmdq struct.
>>
>> Where N.2 is what we would do anyway if we don't go with my proposed
>> solution...
>>
>> All this is just to give you my considerations about this topic -
>> you're left
>> completely free to disagree with me.
>> If you disagree, I will trust your judgement, no problem here.
>>
> 
> Yes, I think your are right. No matter the IOVA start address changing,
> MMINFRA will still -2G(the start address of DRAM PA).
> Do you mean we can get the mminfra_offset from the start address of
> memory in DTS, rather than defining it in pdata?
> 

After the last explanation... no, it would be wrong to get the start from
memory in DTS, because then this will still need hacks.
I was somehow convinced that the DRAM start address and the MMINFRA offset
were directly related and that it would've been hard to change the DRAM
start address with the MMINFRA offset being -2G, but it's not, so doing it
my way will eventually backfire on us.

So my way is not good, as it's not showing the reality of things.

Just go with your current way, as it's really tied to the hardware and it's
not restricted to that dram start coincidence in the end. That's a pity.

Just instead of writing 0x80000000, use " SZ_2G " instead... and please
add a comment in the code that explains in brief that there's this strange
behavior for which we.. need that, and that's a static subtraction, and is
tied to the MMINFRA hardware, and cannot be changed :-(

btw, being clear..

#include <linux/sizes.h>

.mminfra_offset = SZ_2G,

..that way you don't even need a comment saying /* 2GB */ ....

Cheers!

>>>>>         bool control_by_sw;
>>>>>         bool sw_ddr_en;
>>>>> +     bool gce_vm;
>>>>> +     u32 dma_mask_bit;
>>>>>         u32 gce_num;
>>>>>     };
>>>>>
>>>>> +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const
>>>>> struct gce_plat *pdata)
>>>>> +{
>>>>> +     return ((addr + pdata->mminfra_offset) >> pdata->shift);
>>>>> +}
>>>>> +
>>>>> +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const
>>>>> struct gce_plat *pdata)
>>>>> +{
>>>>> +     return ((addr << pdata->shift) - pdata->mminfra_offset);
>>>>> +}
>>>>
>>>> I'm not sure that you really need those two functions... probably
>>>> it's simply
>>>> cleaner and easier to just write that single line every time...
>>>> and
>>>> I'm
>>>> saying that especially for how you're using those functions, with
>>>> some readl()
>>>> passed directly as param, decreasing human readability by "a
>>>> whole
>>>> lot" :-)
>>>>
>>>
>>> The reason why I use API wrapper instead of writing it directly in
>>> readl() is to avoid missing the shift or mminfra_offset conversion
>>> in
>>> some places.
>>> This problem is not easy to debug, and I have encountered it at
>>> least
>>> twice...
>>>
>>> I think the advantage of using function is that it can be uniformly
>>> modified to all places that need to handle DRAM address conversion.
>>> What do you think? :-)
>>>
>>
>> Eh, if you put it like that... it makes sense, so.. yeah, okay :-)
>>
>> Still, please cleanup those instances of
>>
>> `cmdq_reg_revert_addr(readl(something), pdata)`
>>
>> those might be hard to read, so please just do something like:
>>
>> regval = readl(something);
>> curr_pa = cmdq_revert_addr(regval, pdata);
>>
>> ...reword to your own liking, of course.
>>
> 
> OK, I'll refine that. Thanks.
> 
>>>>> +
>>>>>     static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool
>>>>> enable)
>>>>>     {
>>>>>         WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq-
>>>>>> clocks));
>>>>> @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan
>>>>> *chan)
>>>>>     }
>>>>>     EXPORT_SYMBOL(cmdq_get_shift_pa);
>>>>>
>>>>> +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
>>>>> +{
>>>>> +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
>>>>> mbox);
>>>>> +
>>>>> +     return cmdq->pdata->mminfra_offset;
>>>>> +}
>>>>> +EXPORT_SYMBOL(cmdq_get_offset_pa);
>>>>
>>>> I think I remember this get_offset_pa from the old times, then CK
>>>> removed it (and I
>>>> was really happy about that disappearing), or am I confusing this
>>>> with something
>>>> else?
>>>>
>>>> (of course, this wasn't used for mminfra, but for something
>>>> else!)
>>>>
>>>
>>> I can't find any remove history in mtk-cmdq-mailbox.c.
>>>
>>> Maybe you mean the patch in this series?
>>> https://lore.kernel.org/all/171213938049.123698.15573779837703602591.b4-ty@collabora.com/
>>>
>>
>> Uhm, I think I may have confused something here, but yes I was
>> remembering the
>> patch series that you pointed out, definitely.
>>
>> At the end, that series is doing something else, so nevermind, was
>> just confusion.
>>
> 
> OK, no problem.
> 
>>>>> +
>>>>> +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t
>>>>> addr)
>>>>> +{
>>>>> +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
>>>>> mbox);
>>>>> +
>>>>> +     if (cmdq->pdata->mminfra_offset == 0)
>>>>> +             return false;
>>>>> +
>>>>> +     /*
>>>>> +      * mminfra will recognize the addr that greater than the
>>>>> mminfra_offset
>>>>> +      * as a transaction to DRAM.
>>>>> +      * So the caller needs to append mminfra_offset for the
>>>>> true
>>>>> case.
>>>>> +      */
>>>>> +     return (addr >= cmdq->pdata->mminfra_offset);
>>>>
>>>>
>>>> /**
>>>>     * cmdq_is_mminfra_gce() - Brief description
>>>>     * @args.....
>>>>     *
>>>>     * The MMINFRA GCE will recognize an address greater than DRAM
>>>> iostart as a
>>>>     * DRAM transaction instead of ....xyz
>>>>     *
>>>>     * In order for callers to perform (xyz) transactions through
>>>> the
>>>> CMDQ, those
>>>>     * need to know if they are using a GCE located in MMINFRA.
>>>>     */
>>>> bool cmdq_is_mminfra_gce(...)
>>>> {
>>>>           return cmdq->pdata->mminfra_offset &&
>>>>                  (addr >= cmdq->pdata->mminfra_offset)
>>>>
>>>>> +}
>>>>> +EXPORT_SYMBOL(cmdq_addr_need_offset);
>>>>> +
>>>>
>>>
>>> OK, I'll modify the API like this.
>>>
>>>> ...but then, is there really no way of just handling the GCE
>>>> being in
>>>> MMINFRA
>>>> transparently from the callers? Do the callers really *need* to
>>>> know
>>>> that they're
>>>> using a new GCE?!
>>>>
>>>
>>> Since the address subtracting is done in MMINFRA hardware, I think
>>> GCE
>>> users really need to handle it in driver.
>>>
>>
>> Since the users of this infrastructure are multimedia related
>> (disp/MDP3),
>> I'd also like to get an opinion from MediaTek engineers familiar with
>> that.
>>
>> CK, Moudy, any opinion on that, please?
>>
>>>> Another way of saying: can't we just handle the address
>>>> translation
>>>> in here instead
>>>> of instructing each and every driver about how to communicate
>>>> with
>>>> the new GCE?!
>>>>
>>>
>>> The DRAM address may not only be the command buffer to GCE, but
>>> also
>>> the working buffer provided by CMDQ users and being a part of GCE
>>> instruction, so we need to handle the address translation in CMDQ
>>> helper driver for the instruction generation.
>>> E.g. ISP drivers may use GCE to write a hardware settings to a DRAM
>>> as
>>> backup buffer. The GCE write instruction will be:
>>> WRITE the value of ISP register to DRAM address + mminfra_offset.
>>>
>>> But most of the CMDQ users only need to use GCE to write hardware
>>> register, so I only keep the translation in cmdq_pkt_mem_move(),
>>> cmdq_pkt_poll_addr() and cmdq_pkt_jump_abs() at the latest series.
>>
>> Yeah you're choosing the best of both worlds in that case, I do
>> agree, but
>> still - if there's a way to avoid drivers to have different handling
>> for
>> mminfra vs no-mminfra, that'd still be preferred.
>>
>> Having the handling for something *centralized* somewhere, instead of
>> it
>> being sparse here and there, would make maintenance way easier...
>>
>> ...and that's why I'm asking for CK and Moudy's opinion, nothing else
>> :-)
>>
> 
> Yes, I totally agree with you. Thanks for the asking!
> 
> Regards,
> Jason-JH.Lin
> 
>> Cheers!
>> Angelo
>>
> 





^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196
  2025-03-06 13:08           ` AngeloGioacchino Del Regno
@ 2025-03-07  1:11             ` Jason-JH Lin (林睿祥)
  0 siblings, 0 replies; 31+ messages in thread
From: Jason-JH Lin (林睿祥) @ 2025-03-07  1:11 UTC (permalink / raw)
  To: AngeloGioacchino Del Regno, chunkuang.hu@kernel.org,
	Moudy Ho (何宗原)
  Cc: robh@kernel.org, linux-media@vger.kernel.org,
	Sirius Wang (王皓昱), mchehab@kernel.org,
	Nancy Lin (林欣螢),
	Xiandong Wang (王先冬),
	linux-mediatek@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group, conor+dt@kernel.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	devicetree@vger.kernel.org, fshao@chromium.org,
	krzk+dt@kernel.org, jassisinghbrar@gmail.com,
	Singo Chang (張興國),
	linux-arm-kernel@lists.infradead.org,
	Xavier Chang (張獻文), matthias.bgg@gmail.com,
	treapking@chromium.org

> > 

[snip]

> > > > > > +
> > > > > >     #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
> > > > > >     #define CMDQ_THR_ENABLED            0x1
> > > > > >     #define CMDQ_THR_DISABLED           0x0
> > > > > > @@ -87,11 +98,24 @@ struct cmdq {
> > > > > >     struct gce_plat {
> > > > > >         u32 thread_nr;
> > > > > >         u8 shift;
> > > > > > +     dma_addr_t mminfra_offset;
> > > > > 
> > > > > It looks like this is exactly the DRAM's iostart... at least,
> > > > > I
> > > > > can
> > > > > see that in the
> > > > > downstream devicetree that's where it starts.
> > > > > 
> > > > >           memory: memory@80000000 {
> > > > >                   device_type = "memory";
> > > > >                   reg = <0 0x80000000 0 0x40000000>;
> > > > >           };
> > > > > 
> > > > > It doesn't really look like being a coincidence, but, for the
> > > > > sake of
> > > > > asking:
> > > > > is this just a coincidence? :-)
> > > > > 
> > > > 
> > > > As the confirmation with the hardware designer in previous
> > > > reply
> > > > mail
> > > > for CK:
> > > > https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@mediatek.com/*26258463
> > > > 
> > > 
> > > That explanation was simply wonderful.
> > > 
> > > > Since the MMINFRA remap subtracting 2G is done in the hardware
> > > > circuit
> > > > and cannot be configured by software, the address +2G
> > > > adjustment is
> > > > necessary to implement in the CMDQ driver.
> > > > 
> > > > So that might not be a coincidence.
> > > > But even if DRAM start address changes, this mminfra_offset is
> > > > still
> > > > subtracting 2G, so I think it is a better choice to define it
> > > > as
> > > > the
> > > > driver data for MT8196.
> > > > 
> > > 
> > > ....so, this makes me think the following:
> > > 
> > > 1. The DRAM start address cannot *ever* be less than 2G, because
> > > otherwise the
> > >      MMINFRA HW would have a hole in the usable address range;
> > >      1a. If the start address changes to less than 2G, then also
> > > the
> > > IOMMU would
> > >          get limitations, not only the mminfra..!
> > >      2b. This makes it very very very unlikely for the start
> > > address
> > > to be changed
> > >          to less than 0x80000000
> > > 
> > > 2. If the DRAM start address changes to be ABOVE 2G (so more than
> > > 0x80000000),
> > >      there would be no point for MMINFRA to start a "config path"
> > > write (or read)
> > >      in the SMMU DRAM block, would it? ;-)
> > > 
> > 
> > GCE is using IOMMU in MT8196, so all the address put into the GCE
> > instruction or GCE register for GCE access should be IOVA.
> > 
> > The DRAM start address is 2G(PA=0x80000000, IOVA=0x0) currently, so
> > when GCE want to access the IOVA=0x0, it will need to +2G into the
> > instruction, then the MMINFRA will see it as data path(IOVA > 2G)
> > and
> > subtract 2G for that IOVA, so GCE can finally access the IOVA=0x0.
> > 
> > I'm not sure if I've misunderstood what you mean by ABOVE 2G. :-)
> > If DRAM start address is changed to 3G(PA=0xc0000000) the IOVA is
> > still
> > 0x0, so GCE still need to + 2G to make MMINFRA go to the data path.
> > 
> > But if you mean PA=0x80000000 and IOVA start address is
> > 3G(0xc0000000),
> > then MMINFRA will go to the data path without GCE +2G.
> > However, MMINFRA will -2G when going to the data path and that will
> > cause GCE access the wrong IOVA.
> > So GCE still need to +2G no matter IOVA start address is already
> > can
> > make MMINFRA go to the data path(IOVA > 2G).
> > 
> > We have already complained to our hardware designer that MMINFRA -
> > 2G
> > con not be changed, which will make software operation very
> > troublesome.
> > So in the next few generations of SoC will change this MMINFRA -2G
> > to
> > software configurable. Then we can just make IOVA start address to
> > 2G
> > without adding the mminfra_offset to the IOVA for GCE.
> > 
> 
> Okay now I got it, the reality is way worse than I was thinking...
> eww... :-(
> 
> > > I get it - if the DRAM moves up, MMINFRA is still at 2G because
> > > that's hard baked
> > > into the hardware, but I foresee that it'll be unlikely to see a
> > > platform changing
> > > the DRAM start address arbitrarily, getting out-of-sync with
> > > MMINFRA.
> > > 
> > > I propose to just get the address from the memory node for now,
> > > and
> > > to add a nice
> > > comment in the code that explains that "In at least MT8196, the
> > > MMINFRA hardware
> > > subtracts xyz etc etc" (and that explanation from the previous
> > > email
> > > is again
> > > wonderful and shall not be lost: either use that in the comment,
> > > or
> > > add it to
> > > the commit description, because it's really that good).
> > > 
> > > Should a new SoC appear in the future requiring an offset from
> > > the
> > > DRAM start
> > > address, we will think about how to make that work in the best
> > > possible way: in
> > > that case we could either reference something else to get the
> > > right
> > > address or
> > > we can just change this driver to just use the 2G offset
> > > statically
> > > for all.
> > > 
> > > What I'm trying to do here is to reduce the amount of changes
> > > that
> > > we'd need for
> > > adding new SoCs: since that 2G MMINFRA offset -> 2G DRAM start is
> > > not
> > > a coincidence
> > > I think that, should the DRAM start vary on new SoCs, the MMINFRA
> > > offset will
> > > follow the trend and vary with it.
> > > 
> > > So what I think is:
> > > 1. If I'm right, adding a new SoC (with different MMINFRA + DRAM
> > > offset) will be
> > >      as easy as adding a compatible string in the bindings, no
> > > effort
> > > in changing
> > >      this driver with new pdata offsets;
> > > 2. If I'm wrong, adding a new SoC means adding compat string and
> > > adding pdata and
> > >      one variable in the cmdq struct.
> > > 
> > > Where N.2 is what we would do anyway if we don't go with my
> > > proposed
> > > solution...
> > > 
> > > All this is just to give you my considerations about this topic -
> > > you're left
> > > completely free to disagree with me.
> > > If you disagree, I will trust your judgement, no problem here.
> > > 
> > 
> > Yes, I think your are right. No matter the IOVA start address
> > changing,
> > MMINFRA will still -2G(the start address of DRAM PA).
> > Do you mean we can get the mminfra_offset from the start address of
> > memory in DTS, rather than defining it in pdata?
> > 
> 
> After the last explanation... no, it would be wrong to get the start
> from
> memory in DTS, because then this will still need hacks.
> I was somehow convinced that the DRAM start address and the MMINFRA
> offset
> were directly related and that it would've been hard to change the
> DRAM
> start address with the MMINFRA offset being -2G, but it's not, so
> doing it
> my way will eventually backfire on us.
> 
> So my way is not good, as it's not showing the reality of things.
> 
> Just go with your current way, as it's really tied to the hardware
> and it's
> not restricted to that dram start coincidence in the end. That's a
> pity.
> 
> Just instead of writing 0x80000000, use " SZ_2G " instead... and
> please
> add a comment in the code that explains in brief that there's this
> strange
> behavior for which we.. need that, and that's a static subtraction,
> and is
> tied to the MMINFRA hardware, and cannot be changed :-(
> 
> btw, being clear..
> 
> #include <linux/sizes.h>
> 
> .mminfra_offset = SZ_2G,
> 
> ..that way you don't even need a comment saying /* 2GB */ ....
> 
> Cheers!

Got it. I'll modify it.
Thanks for the confirmation.

Regards,
Jason-JH Lin

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-03-07  1:32 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-18  5:41 [PATCH v4 0/8] Add GCE support for MT8196 Jason-JH Lin
2025-02-18  5:41 ` [PATCH v4 1/8] dt-bindings: mailbox: mediatek: Add support for MT8196 GCE mailbox Jason-JH Lin
2025-02-18  8:12   ` Krzysztof Kozlowski
2025-02-18  5:41 ` [PATCH v4 2/8] arm64: dts: mediatek: Add GCE header for MT8196 Jason-JH Lin
2025-02-18  5:41 ` [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support " Jason-JH Lin
2025-02-18  9:25   ` CK Hu (胡俊光)
2025-02-22 10:44     ` Jason-JH Lin (林睿祥)
2025-03-04  9:32   ` AngeloGioacchino Del Regno
2025-03-05  8:36     ` Jason-JH Lin (林睿祥)
2025-03-05 12:05       ` AngeloGioacchino Del Regno
2025-03-06 11:00         ` Jason-JH Lin (林睿祥)
2025-03-06 13:08           ` AngeloGioacchino Del Regno
2025-03-07  1:11             ` Jason-JH Lin (林睿祥)
2025-02-18  5:41 ` [PATCH v4 4/8] soc: mediatek: mtk-cmdq: Add pa_base parsing for unsupported subsys ID hardware Jason-JH Lin
2025-03-04  9:35   ` AngeloGioacchino Del Regno
2025-03-05  9:46     ` Jason-JH Lin (林睿祥)
2025-03-05 11:03       ` AngeloGioacchino Del Regno
2025-03-05 15:58         ` Jason-JH Lin (林睿祥)
2025-03-05 17:40           ` AngeloGioacchino Del Regno
2025-02-18  5:41 ` [PATCH v4 5/8] soc: mediatek: mtk-cmdq: Add mminfra_offset compatibility for DRAM address Jason-JH Lin
2025-03-04  9:37   ` AngeloGioacchino Del Regno
2025-03-05 16:26     ` Jason-JH Lin (林睿祥)
2025-02-18  5:41 ` [PATCH v4 6/8] soc: mediatek: Add programming flow for unsupported subsys ID hardware Jason-JH Lin
2025-03-04  9:41   ` AngeloGioacchino Del Regno
2025-03-05 16:12     ` Jason-JH Lin (林睿祥)
2025-03-05 18:08       ` AngeloGioacchino Del Regno
2025-03-06 11:05         ` Jason-JH Lin (林睿祥)
2025-02-18  5:41 ` [PATCH v4 7/8] drm/mediatek: " Jason-JH Lin
2025-03-06  9:47   ` AngeloGioacchino Del Regno
2025-03-06 11:10     ` Jason-JH Lin (林睿祥)
2025-02-18  5:41 ` [PATCH v4 8/8] media: mediatek: mdp3: " Jason-JH Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox