[PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement

devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement
       [not found] <CGME20250303143634eucas1p269281f72bdc4d764edd54b9427f68787@eucas1p2.samsung.com>
@ 2025-03-03 14:36 ` Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
                     ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-03 14:36 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

This is a subset of a larger patch series enabling the Imagination BXM-4-64 GPU
on the LicheePi 4A board, which is powered by the T-HEAD TH1520 SoC. While the
full series includes power-domain, reset, and firmware changes, this part
focuses solely on the clock subsystem needed for the GPU and other VO (video
output) blocks. By merging these clock patches independently, we prepare the
groundwork for future GPU integration via the `drm/imagination` driver.

The T-HEAD TH1520 SoC features multiple clock controllers. Initially, only the
AP clock controller was supported upstream. The patches below add support for
the VO (video output) clock controller, which manages GPU-related gates, HDMI,
and other multimedia clocks. Additionally, they introduce a mechanism to
provide no-op operations for the GPU's "mem" clock gate (documented as
“Reserved” in the hardware manual) and coordinate the GPU CLKGEN reset in the
clock driver.

Bigger series cover letter:

https://lore.kernel.org/all/20250219140239.1378758-1-m.wilczynski@samsung.com/

Michal Wilczynski (4):
  dt-bindings: clock: thead: Add TH1520 VO clock controller
  clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC
  clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro
  clk: thead: Add GPU clock gate control with CLKGEN reset support

 .../bindings/clock/thead,th1520-clk-ap.yaml   |  33 +-
 drivers/clk/thead/clk-th1520-ap.c             | 298 ++++++++++++++++--
 .../dt-bindings/clock/thead,th1520-clk-ap.h   |  34 ++
 3 files changed, 334 insertions(+), 31 deletions(-)

-- 
2.34.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
@ 2025-03-03 14:36   ` Michal Wilczynski
  2025-03-03 16:24     ` Conor Dooley
  2025-03-03 17:41     ` Krzysztof Kozlowski
  2025-03-03 14:36   ` [PATCH v1 2/4] clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC Michal Wilczynski
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-03 14:36 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

Add device tree bindings for the TH1520 Video Output (VO) subsystem
clock controller. The VO sub-system manages clock gates for multimedia
components including HDMI, MIPI, and GPU.

Document the VIDEO_PLL requirements for the VO clock controller, which
receives its input from the AP clock controller. The VIDEO_PLL is a
Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
with maximum FOUTVCO of 2376 MHz.

Add a mandatory reset property for the TH1520 VO clock controller that
handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
which is required for proper GPU clock operation.

The reset property is only required for the "thead,th1520-clk-vo"
compatible, as it specifically handles the GPU-related clocks.

This binding complements the existing AP sub-system clock controller
which manages CPU, DPU, GMAC and TEE PLLs.

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
---
 .../bindings/clock/thead,th1520-clk-ap.yaml   | 33 ++++++++++++++++--
 .../dt-bindings/clock/thead,th1520-clk-ap.h   | 34 +++++++++++++++++++
 2 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/thead,th1520-clk-ap.yaml b/Documentation/devicetree/bindings/clock/thead,th1520-clk-ap.yaml
index 0129bd0ba4b3..6ea8202718d0 100644
--- a/Documentation/devicetree/bindings/clock/thead,th1520-clk-ap.yaml
+++ b/Documentation/devicetree/bindings/clock/thead,th1520-clk-ap.yaml
@@ -8,7 +8,8 @@ title: T-HEAD TH1520 AP sub-system clock controller
 
 description: |
   The T-HEAD TH1520 AP sub-system clock controller configures the
-  CPU, DPU, GMAC and TEE PLLs.
+  CPU, DPU, GMAC and TEE PLLs. Additionally the VO subsystem configures
+  the clock gates for the HDMI, MIPI and the GPU.
 
   SoC reference manual
   https://openbeagle.org/beaglev-ahead/beaglev-ahead/-/blob/main/docs/TH1520%20System%20User%20Manual.pdf
@@ -20,14 +21,30 @@ maintainers:
 
 properties:
   compatible:
-    const: thead,th1520-clk-ap
+    enum:
+      - thead,th1520-clk-ap
+      - thead,th1520-clk-vo
 
   reg:
     maxItems: 1
 
   clocks:
     items:
-      - description: main oscillator (24MHz)
+      - description: |
+          One input clock:
+          - For "thead,th1520-clk-ap": the clock input must be the 24 MHz
+            main oscillator.
+          - For "thead,th1520-clk-vo": the clock input must be the VIDEO_PLL,
+            which is configured by the AP clock controller. According to the
+            TH1520 manual, VIDEO_PLL is a Silicon Creations Sigma-Delta PLL
+            (integer PLL) typically running at 792 MHz (FOUTPOSTDIV), with
+            a maximum FOUTVCO of 2376 MHz.
+
+  resets:
+    maxItems: 1
+    description:
+      Required for "thead,th1520-clk-vo". This reset line controls the
+      GPU CLKGEN reset which is required for proper GPU clock operation.
 
   "#clock-cells":
     const: 1
@@ -40,6 +57,16 @@ required:
   - clocks
   - "#clock-cells"
 
+allOf:
+  - if:
+      properties:
+        compatible:
+          contains:
+            const: thead,th1520-clk-vo
+    then:
+      required:
+        - resets
+
 additionalProperties: false
 
 examples:
diff --git a/include/dt-bindings/clock/thead,th1520-clk-ap.h b/include/dt-bindings/clock/thead,th1520-clk-ap.h
index a199784b3512..09a9aa7b3ab1 100644
--- a/include/dt-bindings/clock/thead,th1520-clk-ap.h
+++ b/include/dt-bindings/clock/thead,th1520-clk-ap.h
@@ -93,4 +93,38 @@
 #define CLK_SRAM3		83
 #define CLK_PLL_GMAC_100M	84
 #define CLK_UART_SCLK		85
+
+/* VO clocks */
+#define CLK_AXI4_VO_ACLK		0
+#define CLK_GPU_MEM			1
+#define CLK_GPU_CORE			2
+#define CLK_GPU_CFG_ACLK		3
+#define CLK_DPU_PIXELCLK0		4
+#define CLK_DPU_PIXELCLK1		5
+#define CLK_DPU_HCLK			6
+#define CLK_DPU_ACLK			7
+#define CLK_DPU_CCLK			8
+#define CLK_HDMI_SFR			9
+#define CLK_HDMI_PCLK			10
+#define CLK_HDMI_CEC			11
+#define CLK_MIPI_DSI0_PCLK		12
+#define CLK_MIPI_DSI1_PCLK		13
+#define CLK_MIPI_DSI0_CFG		14
+#define CLK_MIPI_DSI1_CFG		15
+#define CLK_MIPI_DSI0_REFCLK		16
+#define CLK_MIPI_DSI1_REFCLK		17
+#define CLK_HDMI_I2S			18
+#define CLK_X2H_DPU1_ACLK		19
+#define CLK_X2H_DPU_ACLK		20
+#define CLK_AXI4_VO_PCLK		21
+#define CLK_IOPMP_VOSYS_DPU_PCLK	22
+#define CLK_IOPMP_VOSYS_DPU1_PCLK	23
+#define CLK_IOPMP_VOSYS_GPU_PCLK	24
+#define CLK_IOPMP_DPU1_ACLK		25
+#define CLK_IOPMP_DPU_ACLK		26
+#define CLK_IOPMP_GPU_ACLK		27
+#define CLK_MIPIDSI0_PIXCLK		28
+#define CLK_MIPIDSI1_PIXCLK		29
+#define CLK_HDMI_PIXCLK			30
+
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 2/4] clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC
  2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
@ 2025-03-03 14:36   ` Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 3/4] clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro Michal Wilczynski
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-03 14:36 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

The T-Head TH1520 SoC integrates a variety of clocks for its subsystems,
including the Application Processor (AP) and the Video Output (VO) [1].
Up until now, the T-Head clock driver only supported AP clocks.

This commit extends the driver to provide clock functionality for the VO
subsystem. At this stage, the focus is on implementing the VO clock
gates, as these are currently the most relevant and required components
for enabling and disabling the VO subsystem functionality.  Future
enhancements may introduce additional VO-related clocks as necessary.

Link: https://openbeagle.org/beaglev-ahead/beaglev-ahead/-/blob/main/docs/TH1520%20System%20User%20Manual.pdf [1]

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
---
 drivers/clk/thead/clk-th1520-ap.c | 197 +++++++++++++++++++++++++-----
 1 file changed, 169 insertions(+), 28 deletions(-)

diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
index 4c9555fc6184..57972589f120 100644
--- a/drivers/clk/thead/clk-th1520-ap.c
+++ b/drivers/clk/thead/clk-th1520-ap.c
@@ -847,6 +847,67 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
 static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, BIT(2), 0);
 static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, BIT(1), 0);
 
+static CCU_GATE(CLK_AXI4_VO_ACLK, axi4_vo_aclk, "axi4-vo-aclk",
+		video_pll_clk_pd, 0x0, BIT(0), 0);
+static CCU_GATE(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", video_pll_clk_pd,
+		0x0, BIT(3), 0);
+static CCU_GATE(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
+		video_pll_clk_pd, 0x0, BIT(4), 0);
+static CCU_GATE(CLK_DPU_PIXELCLK0, dpu0_pixelclk, "dpu0-pixelclk",
+		video_pll_clk_pd, 0x0, BIT(5), 0);
+static CCU_GATE(CLK_DPU_PIXELCLK1, dpu1_pixelclk, "dpu1-pixelclk",
+		video_pll_clk_pd, 0x0, BIT(6), 0);
+static CCU_GATE(CLK_DPU_HCLK, dpu_hclk, "dpu-hclk", video_pll_clk_pd, 0x0,
+		BIT(7), 0);
+static CCU_GATE(CLK_DPU_ACLK, dpu_aclk, "dpu-aclk", video_pll_clk_pd, 0x0,
+		BIT(8), 0);
+static CCU_GATE(CLK_DPU_CCLK, dpu_cclk, "dpu-cclk", video_pll_clk_pd, 0x0,
+		BIT(9), 0);
+static CCU_GATE(CLK_HDMI_SFR, hdmi_sfr_clk, "hdmi-sfr-clk", video_pll_clk_pd,
+		0x0, BIT(10), 0);
+static CCU_GATE(CLK_HDMI_PCLK, hdmi_pclk, "hdmi-pclk", video_pll_clk_pd, 0x0,
+		BIT(11), 0);
+static CCU_GATE(CLK_HDMI_CEC, hdmi_cec_clk, "hdmi-cec-clk", video_pll_clk_pd,
+		0x0, BIT(12), 0);
+static CCU_GATE(CLK_MIPI_DSI0_PCLK, mipi_dsi0_pclk, "mipi-dsi0-pclk",
+		video_pll_clk_pd, 0x0, BIT(13), 0);
+static CCU_GATE(CLK_MIPI_DSI1_PCLK, mipi_dsi1_pclk, "mipi-dsi1-pclk",
+		video_pll_clk_pd, 0x0, BIT(14), 0);
+static CCU_GATE(CLK_MIPI_DSI0_CFG, mipi_dsi0_cfg_clk, "mipi-dsi0-cfg-clk",
+		video_pll_clk_pd, 0x0, BIT(15), 0);
+static CCU_GATE(CLK_MIPI_DSI1_CFG, mipi_dsi1_cfg_clk, "mipi-dsi1-cfg-clk",
+		video_pll_clk_pd, 0x0, BIT(16), 0);
+static CCU_GATE(CLK_MIPI_DSI0_REFCLK, mipi_dsi0_refclk, "mipi-dsi0-refclk",
+		video_pll_clk_pd, 0x0, BIT(17), 0);
+static CCU_GATE(CLK_MIPI_DSI1_REFCLK, mipi_dsi1_refclk, "mipi-dsi1-refclk",
+		video_pll_clk_pd, 0x0, BIT(18), 0);
+static CCU_GATE(CLK_HDMI_I2S, hdmi_i2s_clk, "hdmi-i2s-clk", video_pll_clk_pd,
+		0x0, BIT(19), 0);
+static CCU_GATE(CLK_X2H_DPU1_ACLK, x2h_dpu1_aclk, "x2h-dpu1-aclk",
+		video_pll_clk_pd, 0x0, BIT(20), 0);
+static CCU_GATE(CLK_X2H_DPU_ACLK, x2h_dpu_aclk, "x2h-dpu-aclk",
+		video_pll_clk_pd, 0x0, BIT(21), 0);
+static CCU_GATE(CLK_AXI4_VO_PCLK, axi4_vo_pclk, "axi4-vo-pclk",
+		video_pll_clk_pd, 0x0, BIT(22), 0);
+static CCU_GATE(CLK_IOPMP_VOSYS_DPU_PCLK, iopmp_vosys_dpu_pclk,
+		"iopmp-vosys-dpu-pclk", video_pll_clk_pd, 0x0, BIT(23), 0);
+static CCU_GATE(CLK_IOPMP_VOSYS_DPU1_PCLK, iopmp_vosys_dpu1_pclk,
+		"iopmp-vosys-dpu1-pclk", video_pll_clk_pd, 0x0, BIT(24), 0);
+static CCU_GATE(CLK_IOPMP_VOSYS_GPU_PCLK, iopmp_vosys_gpu_pclk,
+		"iopmp-vosys-gpu-pclk", video_pll_clk_pd, 0x0, BIT(25), 0);
+static CCU_GATE(CLK_IOPMP_DPU1_ACLK, iopmp_dpu1_aclk, "iopmp-dpu1-aclk",
+		video_pll_clk_pd, 0x0, BIT(27), 0);
+static CCU_GATE(CLK_IOPMP_DPU_ACLK, iopmp_dpu_aclk, "iopmp-dpu-aclk",
+		video_pll_clk_pd, 0x0, BIT(28), 0);
+static CCU_GATE(CLK_IOPMP_GPU_ACLK, iopmp_gpu_aclk, "iopmp-gpu-aclk",
+		video_pll_clk_pd, 0x0, BIT(29), 0);
+static CCU_GATE(CLK_MIPIDSI0_PIXCLK, mipi_dsi0_pixclk, "mipi-dsi0-pixclk",
+		video_pll_clk_pd, 0x0, BIT(30), 0);
+static CCU_GATE(CLK_MIPIDSI1_PIXCLK, mipi_dsi1_pixclk, "mipi-dsi1-pixclk",
+		video_pll_clk_pd, 0x0, BIT(31), 0);
+static CCU_GATE(CLK_HDMI_PIXCLK, hdmi_pixclk, "hdmi-pixclk", video_pll_clk_pd,
+		0x4, BIT(0), 0);
+
 static CLK_FIXED_FACTOR_HW(gmac_pll_clk_100m, "gmac-pll-clk-100m",
 			   &gmac_pll_clk.common.hw, 10, 1, 0);
 
@@ -963,7 +1024,38 @@ static struct ccu_common *th1520_gate_clks[] = {
 	&sram3_clk.common,
 };
 
-#define NR_CLKS	(CLK_UART_SCLK + 1)
+static struct ccu_common *th1520_vo_gate_clks[] = {
+	&axi4_vo_aclk.common,
+	&gpu_core_clk.common,
+	&gpu_cfg_aclk.common,
+	&dpu0_pixelclk.common,
+	&dpu1_pixelclk.common,
+	&dpu_hclk.common,
+	&dpu_aclk.common,
+	&dpu_cclk.common,
+	&hdmi_sfr_clk.common,
+	&hdmi_pclk.common,
+	&hdmi_cec_clk.common,
+	&mipi_dsi0_pclk.common,
+	&mipi_dsi1_pclk.common,
+	&mipi_dsi0_cfg_clk.common,
+	&mipi_dsi1_cfg_clk.common,
+	&mipi_dsi0_refclk.common,
+	&mipi_dsi1_refclk.common,
+	&hdmi_i2s_clk.common,
+	&x2h_dpu1_aclk.common,
+	&x2h_dpu_aclk.common,
+	&axi4_vo_pclk.common,
+	&iopmp_vosys_dpu_pclk.common,
+	&iopmp_vosys_dpu1_pclk.common,
+	&iopmp_vosys_gpu_pclk.common,
+	&iopmp_dpu1_aclk.common,
+	&iopmp_dpu_aclk.common,
+	&iopmp_gpu_aclk.common,
+	&mipi_dsi0_pixclk.common,
+	&mipi_dsi1_pixclk.common,
+	&hdmi_pixclk.common
+};
 
 static const struct regmap_config th1520_clk_regmap_config = {
 	.reg_bits = 32,
@@ -972,8 +1064,44 @@ static const struct regmap_config th1520_clk_regmap_config = {
 	.fast_io = true,
 };
 
+struct th1520_plat_data {
+	struct ccu_common **th1520_pll_clks;
+	struct ccu_common **th1520_div_clks;
+	struct ccu_common **th1520_mux_clks;
+	struct ccu_common **th1520_gate_clks;
+
+	int nr_clks;
+	int nr_pll_clks;
+	int nr_div_clks;
+	int nr_mux_clks;
+	int nr_gate_clks;
+};
+
+static const struct th1520_plat_data th1520_ap_platdata = {
+	.th1520_pll_clks = th1520_pll_clks,
+	.th1520_div_clks = th1520_div_clks,
+	.th1520_mux_clks = th1520_mux_clks,
+	.th1520_gate_clks = th1520_gate_clks,
+
+	.nr_clks = CLK_UART_SCLK + 1,
+
+	.nr_pll_clks = ARRAY_SIZE(th1520_pll_clks),
+	.nr_div_clks = ARRAY_SIZE(th1520_div_clks),
+	.nr_mux_clks = ARRAY_SIZE(th1520_mux_clks),
+	.nr_gate_clks = ARRAY_SIZE(th1520_gate_clks),
+};
+
+static const struct th1520_plat_data th1520_vo_platdata = {
+	.th1520_gate_clks = th1520_vo_gate_clks,
+
+	.nr_clks = CLK_HDMI_PIXCLK + 1,
+
+	.nr_gate_clks = ARRAY_SIZE(th1520_vo_gate_clks),
+};
+
 static int th1520_clk_probe(struct platform_device *pdev)
 {
+	const struct th1520_plat_data *plat_data;
 	struct device *dev = &pdev->dev;
 	struct clk_hw_onecell_data *priv;
 
@@ -982,11 +1110,17 @@ static int th1520_clk_probe(struct platform_device *pdev)
 	struct clk_hw *hw;
 	int ret, i;
 
-	priv = devm_kzalloc(dev, struct_size(priv, hws, NR_CLKS), GFP_KERNEL);
+	plat_data = device_get_match_data(&pdev->dev);
+	if (!plat_data) {
+		dev_err(&pdev->dev, "Error: No device match data found\n");
+		return -ENODEV;
+	}
+
+	priv = devm_kzalloc(dev, struct_size(priv, hws, plat_data->nr_clks), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
 
-	priv->num = NR_CLKS;
+	priv->num = plat_data->nr_clks;
 
 	base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(base))
@@ -996,35 +1130,35 @@ static int th1520_clk_probe(struct platform_device *pdev)
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
-	for (i = 0; i < ARRAY_SIZE(th1520_pll_clks); i++) {
-		struct ccu_pll *cp = hw_to_ccu_pll(&th1520_pll_clks[i]->hw);
+	for (i = 0; i < plat_data->nr_pll_clks; i++) {
+		struct ccu_pll *cp = hw_to_ccu_pll(&plat_data->th1520_pll_clks[i]->hw);
 
-		th1520_pll_clks[i]->map = map;
+		plat_data->th1520_pll_clks[i]->map = map;
 
-		ret = devm_clk_hw_register(dev, &th1520_pll_clks[i]->hw);
+		ret = devm_clk_hw_register(dev, &plat_data->th1520_pll_clks[i]->hw);
 		if (ret)
 			return ret;
 
 		priv->hws[cp->common.clkid] = &cp->common.hw;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(th1520_div_clks); i++) {
-		struct ccu_div *cd = hw_to_ccu_div(&th1520_div_clks[i]->hw);
+	for (i = 0; i < plat_data->nr_div_clks; i++) {
+		struct ccu_div *cd = hw_to_ccu_div(&plat_data->th1520_div_clks[i]->hw);
 
-		th1520_div_clks[i]->map = map;
+		plat_data->th1520_div_clks[i]->map = map;
 
-		ret = devm_clk_hw_register(dev, &th1520_div_clks[i]->hw);
+		ret = devm_clk_hw_register(dev, &plat_data->th1520_div_clks[i]->hw);
 		if (ret)
 			return ret;
 
 		priv->hws[cd->common.clkid] = &cd->common.hw;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(th1520_mux_clks); i++) {
-		struct ccu_mux *cm = hw_to_ccu_mux(&th1520_mux_clks[i]->hw);
+	for (i = 0; i < plat_data->nr_mux_clks; i++) {
+		struct ccu_mux *cm = hw_to_ccu_mux(&plat_data->th1520_mux_clks[i]->hw);
 		const struct clk_init_data *init = cm->common.hw.init;
 
-		th1520_mux_clks[i]->map = map;
+		plat_data->th1520_mux_clks[i]->map = map;
 		hw = devm_clk_hw_register_mux_parent_data_table(dev,
 								init->name,
 								init->parent_data,
@@ -1040,10 +1174,10 @@ static int th1520_clk_probe(struct platform_device *pdev)
 		priv->hws[cm->common.clkid] = hw;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(th1520_gate_clks); i++) {
-		struct ccu_gate *cg = hw_to_ccu_gate(&th1520_gate_clks[i]->hw);
+	for (i = 0; i < plat_data->nr_gate_clks; i++) {
+		struct ccu_gate *cg = hw_to_ccu_gate(&plat_data->th1520_gate_clks[i]->hw);
 
-		th1520_gate_clks[i]->map = map;
+		plat_data->th1520_gate_clks[i]->map = map;
 
 		hw = devm_clk_hw_register_gate_parent_data(dev,
 							   cg->common.hw.init->name,
@@ -1057,19 +1191,21 @@ static int th1520_clk_probe(struct platform_device *pdev)
 		priv->hws[cg->common.clkid] = hw;
 	}
 
-	ret = devm_clk_hw_register(dev, &osc12m_clk.hw);
-	if (ret)
-		return ret;
-	priv->hws[CLK_OSC12M] = &osc12m_clk.hw;
+	if (plat_data == &th1520_ap_platdata) {
+		ret = devm_clk_hw_register(dev, &osc12m_clk.hw);
+		if (ret)
+			return ret;
+		priv->hws[CLK_OSC12M] = &osc12m_clk.hw;
 
-	ret = devm_clk_hw_register(dev, &gmac_pll_clk_100m.hw);
-	if (ret)
-		return ret;
-	priv->hws[CLK_PLL_GMAC_100M] = &gmac_pll_clk_100m.hw;
+		ret = devm_clk_hw_register(dev, &gmac_pll_clk_100m.hw);
+		if (ret)
+			return ret;
+		priv->hws[CLK_PLL_GMAC_100M] = &gmac_pll_clk_100m.hw;
 
-	ret = devm_clk_hw_register(dev, &emmc_sdio_ref_clk.hw);
-	if (ret)
-		return ret;
+		ret = devm_clk_hw_register(dev, &emmc_sdio_ref_clk.hw);
+		if (ret)
+			return ret;
+	}
 
 	ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, priv);
 	if (ret)
@@ -1081,6 +1217,11 @@ static int th1520_clk_probe(struct platform_device *pdev)
 static const struct of_device_id th1520_clk_match[] = {
 	{
 		.compatible = "thead,th1520-clk-ap",
+		.data = &th1520_ap_platdata,
+	},
+	{
+		.compatible = "thead,th1520-clk-vo",
+		.data = &th1520_vo_platdata,
 	},
 	{ /* sentinel */ },
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 3/4] clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro
  2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 2/4] clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC Michal Wilczynski
@ 2025-03-03 14:36   ` Michal Wilczynski
  2025-03-03 14:36   ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Michal Wilczynski
  2025-03-05  8:18   ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
  4 siblings, 0 replies; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-03 14:36 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

The IMG Rogue GPU requires three clocks: core, sys, and mem [1]. On the
T-HEAD TH1520 SoC, the mem clock gate is marked as "Reserved" in the
hardware manual (section 4.4.2.6.1) [2] and cannot be configured.

Add a new CCU_GATE_CLK_OPS macro that allows specifying custom clock
operations. This enables us to use nop operations for the mem clock,
preventing the driver from attempting to enable/disable this reserved
clock gate.

Link: https://lore.kernel.org/all/2fe3d93f-62ac-4439-ac17-d81137f6410a@imgtec.com [1]
Link: https://openbeagle.org/beaglev-ahead/beaglev-ahead/-/blob/main/docs/TH1520%20System%20User%20Manual.pdf [2]

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
---
 drivers/clk/thead/clk-th1520-ap.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
index 57972589f120..ea96d007aecd 100644
--- a/drivers/clk/thead/clk-th1520-ap.c
+++ b/drivers/clk/thead/clk-th1520-ap.c
@@ -89,6 +89,21 @@ struct ccu_pll {
 		}							\
 	}
 
+#define CCU_GATE_CLK_OPS(_clkid, _struct, _name, _parent, _reg, _gate, _flags,	\
+			 _clk_ops)						\
+	struct ccu_gate _struct = {						\
+		.enable	= _gate,						\
+		.common	= {							\
+			.clkid		= _clkid,				\
+			.cfg0		= _reg,					\
+			.hw.init	= CLK_HW_INIT_PARENTS_DATA(		\
+						_name,				\
+						_parent,			\
+						&_clk_ops,			\
+						_flags),			\
+		}								\
+	}
+
 static inline struct ccu_common *hw_to_ccu_common(struct clk_hw *hw)
 {
 	return container_of(hw, struct ccu_common, hw);
@@ -847,6 +862,11 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
 static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, BIT(2), 0);
 static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, BIT(1), 0);
 
+static const struct clk_ops clk_nop_ops = {};
+
+static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
+			video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
+
 static CCU_GATE(CLK_AXI4_VO_ACLK, axi4_vo_aclk, "axi4-vo-aclk",
 		video_pll_clk_pd, 0x0, BIT(0), 0);
 static CCU_GATE(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", video_pll_clk_pd,
@@ -1205,6 +1225,12 @@ static int th1520_clk_probe(struct platform_device *pdev)
 		ret = devm_clk_hw_register(dev, &emmc_sdio_ref_clk.hw);
 		if (ret)
 			return ret;
+	} else if (plat_data == &th1520_vo_platdata) {
+		ret = devm_clk_hw_register(dev, &gpu_mem_clk.common.hw);
+		if (ret)
+			return ret;
+		gpu_mem_clk.common.map = map;
+		priv->hws[CLK_GPU_MEM] = &gpu_mem_clk.common.hw;
 	}
 
 	ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, priv);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
                     ` (2 preceding siblings ...)
  2025-03-03 14:36   ` [PATCH v1 3/4] clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro Michal Wilczynski
@ 2025-03-03 14:36   ` Michal Wilczynski
  2025-03-05 23:47     ` Stephen Boyd
  2025-03-05  8:18   ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
  4 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-03 14:36 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
clock gate is marked as "Reserved" in hardware, while core and cfg are
configurable. In order for these clock gates to work properly, the
CLKGEN reset must be managed in a specific sequence.

Move the CLKGEN reset handling to the clock driver since it's
fundamentally a clock-related workaround [1]. This ensures that clk_enabled
GPU clocks stay physically enabled without external interference from
the reset driver.  The reset is now deasserted only when both core and
cfg clocks are enabled, and asserted when either of them is disabled.

The mem clock is configured to use nop operations since it cannot be
controlled.

Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
---
 drivers/clk/thead/clk-th1520-ap.c | 87 ++++++++++++++++++++++++++++---
 1 file changed, 81 insertions(+), 6 deletions(-)

diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
index ea96d007aecd..1dfcde867233 100644
--- a/drivers/clk/thead/clk-th1520-ap.c
+++ b/drivers/clk/thead/clk-th1520-ap.c
@@ -12,6 +12,7 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/regmap.h>
+#include <linux/reset.h>
 
 #define TH1520_PLL_POSTDIV2	GENMASK(26, 24)
 #define TH1520_PLL_POSTDIV1	GENMASK(22, 20)
@@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
 static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, BIT(2), 0);
 static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, BIT(1), 0);
 
+static struct reset_control *gpu_reset;
+static DEFINE_SPINLOCK(gpu_reset_lock); /* protect GPU reset sequence */
+
+static void ccu_gpu_clk_disable(struct clk_hw *hw);
+static int ccu_gpu_clk_enable(struct clk_hw *hw);
+
+static const struct clk_ops ccu_gate_gpu_ops = {
+	.disable	= ccu_gpu_clk_disable,
+	.enable		= ccu_gpu_clk_enable
+};
+
 static const struct clk_ops clk_nop_ops = {};
 
 static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
 			video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
+static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
+			video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
+static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
+			video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
+
+static void ccu_gpu_clk_disable(struct clk_hw *hw)
+{
+	struct ccu_gate *cg = hw_to_ccu_gate(hw);
+	unsigned long flags;
+
+	spin_lock_irqsave(&gpu_reset_lock, flags);
+
+	ccu_disable_helper(&cg->common, cg->enable);
+
+	if ((cg == &gpu_core_clk &&
+	     !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
+	    (cg == &gpu_cfg_aclk &&
+	     !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
+		reset_control_assert(gpu_reset);
+
+	spin_unlock_irqrestore(&gpu_reset_lock, flags);
+}
+
+static int ccu_gpu_clk_enable(struct clk_hw *hw)
+{
+	struct ccu_gate *cg = hw_to_ccu_gate(hw);
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&gpu_reset_lock, flags);
+
+	ret = ccu_enable_helper(&cg->common, cg->enable);
+	if (ret) {
+		spin_unlock_irqrestore(&gpu_reset_lock, flags);
+		return ret;
+	}
+
+	if ((cg == &gpu_core_clk &&
+	     clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
+	    (cg == &gpu_cfg_aclk && clk_hw_is_enabled(&gpu_core_clk.common.hw)))
+		ret = reset_control_deassert(gpu_reset);
+
+	spin_unlock_irqrestore(&gpu_reset_lock, flags);
+
+	return ret;
+}
 
 static CCU_GATE(CLK_AXI4_VO_ACLK, axi4_vo_aclk, "axi4-vo-aclk",
 		video_pll_clk_pd, 0x0, BIT(0), 0);
-static CCU_GATE(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", video_pll_clk_pd,
-		0x0, BIT(3), 0);
-static CCU_GATE(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
-		video_pll_clk_pd, 0x0, BIT(4), 0);
 static CCU_GATE(CLK_DPU_PIXELCLK0, dpu0_pixelclk, "dpu0-pixelclk",
 		video_pll_clk_pd, 0x0, BIT(5), 0);
 static CCU_GATE(CLK_DPU_PIXELCLK1, dpu1_pixelclk, "dpu1-pixelclk",
@@ -1046,8 +1100,6 @@ static struct ccu_common *th1520_gate_clks[] = {
 
 static struct ccu_common *th1520_vo_gate_clks[] = {
 	&axi4_vo_aclk.common,
-	&gpu_core_clk.common,
-	&gpu_cfg_aclk.common,
 	&dpu0_pixelclk.common,
 	&dpu1_pixelclk.common,
 	&dpu_hclk.common,
@@ -1150,6 +1202,13 @@ static int th1520_clk_probe(struct platform_device *pdev)
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
+	if (plat_data == &th1520_vo_platdata) {
+		gpu_reset = devm_reset_control_get_exclusive(dev, NULL);
+		if (IS_ERR(gpu_reset))
+			return dev_err_probe(dev, PTR_ERR(gpu_reset),
+					     "GPU reset is required for VO clock controller\n");
+	}
+
 	for (i = 0; i < plat_data->nr_pll_clks; i++) {
 		struct ccu_pll *cp = hw_to_ccu_pll(&plat_data->th1520_pll_clks[i]->hw);
 
@@ -1226,11 +1285,27 @@ static int th1520_clk_probe(struct platform_device *pdev)
 		if (ret)
 			return ret;
 	} else if (plat_data == &th1520_vo_platdata) {
+		/* GPU clocks need to be treated differently, as MEM clock
+		 * is non-configurable, and the reset needs to be de-asserted
+		 * after enabling CORE and CFG clocks.
+		 */
 		ret = devm_clk_hw_register(dev, &gpu_mem_clk.common.hw);
 		if (ret)
 			return ret;
 		gpu_mem_clk.common.map = map;
 		priv->hws[CLK_GPU_MEM] = &gpu_mem_clk.common.hw;
+
+		ret = devm_clk_hw_register(dev, &gpu_core_clk.common.hw);
+		if (ret)
+			return ret;
+		gpu_core_clk.common.map = map;
+		priv->hws[CLK_GPU_CORE] = &gpu_core_clk.common.hw;
+
+		ret = devm_clk_hw_register(dev, &gpu_cfg_aclk.common.hw);
+		if (ret)
+			return ret;
+		gpu_cfg_aclk.common.map = map;
+		priv->hws[CLK_GPU_CFG_ACLK] = &gpu_cfg_aclk.common.hw;
 	}
 
 	ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, priv);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
@ 2025-03-03 16:24     ` Conor Dooley
  2025-03-03 17:41     ` Krzysztof Kozlowski
  1 sibling, 0 replies; 20+ messages in thread
From: Conor Dooley @ 2025-03-03 16:24 UTC (permalink / raw)
  To: Michal Wilczynski
  Cc: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski,
	linux-clk, devicetree, linux-kernel, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 1108 bytes --]

On Mon, Mar 03, 2025 at 03:36:26PM +0100, Michal Wilczynski wrote:
> Add device tree bindings for the TH1520 Video Output (VO) subsystem
> clock controller. The VO sub-system manages clock gates for multimedia
> components including HDMI, MIPI, and GPU.
> 
> Document the VIDEO_PLL requirements for the VO clock controller, which
> receives its input from the AP clock controller. The VIDEO_PLL is a
> Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
> with maximum FOUTVCO of 2376 MHz.
> 
> Add a mandatory reset property for the TH1520 VO clock controller that
> handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
> which is required for proper GPU clock operation.
> 
> The reset property is only required for the "thead,th1520-clk-vo"
> compatible, as it specifically handles the GPU-related clocks.
> 
> This binding complements the existing AP sub-system clock controller
> which manages CPU, DPU, GMAC and TEE PLLs.
> 
> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>

Acked-by: Conor Dooley <conor.dooley@microchip.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
  2025-03-03 16:24     ` Conor Dooley
@ 2025-03-03 17:41     ` Krzysztof Kozlowski
  2025-03-03 17:46       ` Krzysztof Kozlowski
  1 sibling, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2025-03-03 17:41 UTC (permalink / raw)
  To: Michal Wilczynski, mturquette, sboyd, robh, krzk+dt, conor+dt,
	drew, guoren, wefu, paul.walmsley, palmer, aou, alex, jszhang,
	p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv

On 03/03/2025 15:36, Michal Wilczynski wrote:
> Add device tree bindings for the TH1520 Video Output (VO) subsystem
> clock controller. The VO sub-system manages clock gates for multimedia
> components including HDMI, MIPI, and GPU.
> 
> Document the VIDEO_PLL requirements for the VO clock controller, which
> receives its input from the AP clock controller. The VIDEO_PLL is a
> Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
> with maximum FOUTVCO of 2376 MHz.
> 
> Add a mandatory reset property for the TH1520 VO clock controller that
> handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
> which is required for proper GPU clock operation.
> 
> The reset property is only required for the "thead,th1520-clk-vo"
> compatible, as it specifically handles the GPU-related clocks.
> 
> This binding complements the existing AP sub-system clock controller
> which manages CPU, DPU, GMAC and TEE PLLs.
> 
> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> ---
>  .../bindings/clock/thead,th1520-clk-ap.yaml   | 33 ++++++++++++++++--
>  .../dt-bindings/clock/thead,th1520-clk-ap.h   | 34 +++++++++++++++++++
>  2 files changed, 64 insertions(+), 3 deletions(-)


Where is the changelog? Why is this v1? There was extensive discussion
for many versions, so does it mean all of it was ignored?

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-03 17:41     ` Krzysztof Kozlowski
@ 2025-03-03 17:46       ` Krzysztof Kozlowski
  2025-03-04  7:39         ` Michal Wilczynski
  0 siblings, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2025-03-03 17:46 UTC (permalink / raw)
  To: Michal Wilczynski, mturquette, sboyd, robh, krzk+dt, conor+dt,
	drew, guoren, wefu, paul.walmsley, palmer, aou, alex, jszhang,
	p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv

On 03/03/2025 18:41, Krzysztof Kozlowski wrote:
> On 03/03/2025 15:36, Michal Wilczynski wrote:
>> Add device tree bindings for the TH1520 Video Output (VO) subsystem
>> clock controller. The VO sub-system manages clock gates for multimedia
>> components including HDMI, MIPI, and GPU.
>>
>> Document the VIDEO_PLL requirements for the VO clock controller, which
>> receives its input from the AP clock controller. The VIDEO_PLL is a
>> Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
>> with maximum FOUTVCO of 2376 MHz.
>>
>> Add a mandatory reset property for the TH1520 VO clock controller that
>> handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
>> which is required for proper GPU clock operation.
>>
>> The reset property is only required for the "thead,th1520-clk-vo"
>> compatible, as it specifically handles the GPU-related clocks.
>>
>> This binding complements the existing AP sub-system clock controller
>> which manages CPU, DPU, GMAC and TEE PLLs.
>>
>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>> ---
>>  .../bindings/clock/thead,th1520-clk-ap.yaml   | 33 ++++++++++++++++--
>>  .../dt-bindings/clock/thead,th1520-clk-ap.h   | 34 +++++++++++++++++++
>>  2 files changed, 64 insertions(+), 3 deletions(-)
> 
> 
> Where is the changelog? Why is this v1? There was extensive discussion
> for many versions, so does it mean all of it was ignored?


Plus this was reviewed so it is even more confusing. Where is the review
tag? If tag was dropped, you must explain this - see submitting patches,
which asks for that.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-03 17:46       ` Krzysztof Kozlowski
@ 2025-03-04  7:39         ` Michal Wilczynski
  2025-03-04  7:41           ` Krzysztof Kozlowski
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-04  7:39 UTC (permalink / raw)
  To: Krzysztof Kozlowski, mturquette, sboyd, robh, krzk+dt, conor+dt,
	drew, guoren, wefu, paul.walmsley, palmer, aou, alex, jszhang,
	p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv



On 3/3/25 18:46, Krzysztof Kozlowski wrote:
> On 03/03/2025 18:41, Krzysztof Kozlowski wrote:
>> On 03/03/2025 15:36, Michal Wilczynski wrote:
>>> Add device tree bindings for the TH1520 Video Output (VO) subsystem
>>> clock controller. The VO sub-system manages clock gates for multimedia
>>> components including HDMI, MIPI, and GPU.
>>>
>>> Document the VIDEO_PLL requirements for the VO clock controller, which
>>> receives its input from the AP clock controller. The VIDEO_PLL is a
>>> Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
>>> with maximum FOUTVCO of 2376 MHz.
>>>
>>> Add a mandatory reset property for the TH1520 VO clock controller that
>>> handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
>>> which is required for proper GPU clock operation.
>>>
>>> The reset property is only required for the "thead,th1520-clk-vo"
>>> compatible, as it specifically handles the GPU-related clocks.
>>>
>>> This binding complements the existing AP sub-system clock controller
>>> which manages CPU, DPU, GMAC and TEE PLLs.
>>>
>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>> ---
>>>  .../bindings/clock/thead,th1520-clk-ap.yaml   | 33 ++++++++++++++++--
>>>  .../dt-bindings/clock/thead,th1520-clk-ap.h   | 34 +++++++++++++++++++
>>>  2 files changed, 64 insertions(+), 3 deletions(-)
>>
>>
>> Where is the changelog? Why is this v1? There was extensive discussion
>> for many versions, so does it mean all of it was ignored?
> 
> 
> Plus this was reviewed so it is even more confusing. Where is the review
> tag? If tag was dropped, you must explain this - see submitting patches,
> which asks for that.

There was a tag, but later in v5 I've added another part to this
dt-binding - reset, which I wasn't sure whether you would approve of, so
I've removed the Reviewed-by.

> 
> Best regards,
> Krzysztof
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller
  2025-03-04  7:39         ` Michal Wilczynski
@ 2025-03-04  7:41           ` Krzysztof Kozlowski
  0 siblings, 0 replies; 20+ messages in thread
From: Krzysztof Kozlowski @ 2025-03-04  7:41 UTC (permalink / raw)
  To: Michal Wilczynski, mturquette, sboyd, robh, krzk+dt, conor+dt,
	drew, guoren, wefu, paul.walmsley, palmer, aou, alex, jszhang,
	p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv

On 04/03/2025 08:39, Michal Wilczynski wrote:
> 
> 
> On 3/3/25 18:46, Krzysztof Kozlowski wrote:
>> On 03/03/2025 18:41, Krzysztof Kozlowski wrote:
>>> On 03/03/2025 15:36, Michal Wilczynski wrote:
>>>> Add device tree bindings for the TH1520 Video Output (VO) subsystem
>>>> clock controller. The VO sub-system manages clock gates for multimedia
>>>> components including HDMI, MIPI, and GPU.
>>>>
>>>> Document the VIDEO_PLL requirements for the VO clock controller, which
>>>> receives its input from the AP clock controller. The VIDEO_PLL is a
>>>> Silicon Creations Sigma-Delta (integer) PLL typically running at 792 MHz
>>>> with maximum FOUTVCO of 2376 MHz.
>>>>
>>>> Add a mandatory reset property for the TH1520 VO clock controller that
>>>> handles the GPU clocks. This reset line controls the GPU CLKGEN reset,
>>>> which is required for proper GPU clock operation.
>>>>
>>>> The reset property is only required for the "thead,th1520-clk-vo"
>>>> compatible, as it specifically handles the GPU-related clocks.
>>>>
>>>> This binding complements the existing AP sub-system clock controller
>>>> which manages CPU, DPU, GMAC and TEE PLLs.
>>>>
>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>>> ---
>>>>  .../bindings/clock/thead,th1520-clk-ap.yaml   | 33 ++++++++++++++++--
>>>>  .../dt-bindings/clock/thead,th1520-clk-ap.h   | 34 +++++++++++++++++++
>>>>  2 files changed, 64 insertions(+), 3 deletions(-)
>>>
>>>
>>> Where is the changelog? Why is this v1? There was extensive discussion
>>> for many versions, so does it mean all of it was ignored?
>>
>>
>> Plus this was reviewed so it is even more confusing. Where is the review
>> tag? If tag was dropped, you must explain this - see submitting patches,
>> which asks for that.
> 
> There was a tag, but later in v5 I've added another part to this
> dt-binding - reset, which I wasn't sure whether you would approve of, so
> I've removed the Reviewed-by.

Dropping tag needs explicit explanation and the entire versioning plus
changelog are gone from here.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement
  2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
                     ` (3 preceding siblings ...)
  2025-03-03 14:36   ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Michal Wilczynski
@ 2025-03-05  8:18   ` Michal Wilczynski
  4 siblings, 0 replies; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-05  8:18 UTC (permalink / raw)
  To: mturquette, sboyd, robh, krzk+dt, conor+dt, drew, guoren, wefu,
	paul.walmsley, palmer, aou, alex, jszhang, p.zabel, m.szyprowski
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv



On 3/3/25 15:36, Michal Wilczynski wrote:
> This is a subset of a larger patch series enabling the Imagination BXM-4-64 GPU
> on the LicheePi 4A board, which is powered by the T-HEAD TH1520 SoC. While the
> full series includes power-domain, reset, and firmware changes, this part
> focuses solely on the clock subsystem needed for the GPU and other VO (video
> output) blocks. By merging these clock patches independently, we prepare the
> groundwork for future GPU integration via the `drm/imagination` driver.
> 
> The T-HEAD TH1520 SoC features multiple clock controllers. Initially, only the
> AP clock controller was supported upstream. The patches below add support for
> the VO (video output) clock controller, which manages GPU-related gates, HDMI,
> and other multimedia clocks. Additionally, they introduce a mechanism to
> provide no-op operations for the GPU's "mem" clock gate (documented as
> “Reserved” in the hardware manual) and coordinate the GPU CLKGEN reset in the
> clock driver.
> 
> Bigger series cover letter:
> 
> https://lore.kernel.org/all/20250219140239.1378758-1-m.wilczynski@samsung.com/

This series should be versioned as v6, to maintain continuity with the
bigger patchset it is a subseries of. Please find below a changelog for
the clock sub-series:

v6:
- squashed the "dt-bindings: clock: thead: Add GPU clkgen reset property"
  with the "dt-bindings: clock: thead: Add TH1520 VO clock controller". As
  a result, also removed the Reviewed-by from Krzysztof, since the new
  resets property has been introduced, which is mandatory in the VO
  case

v5:
- introduced a new macro CCU_GATE_CLK_OPS, which allows providing custom clk_ops.
  In the case of the 'MEM' clock, it provides empty clk_nops. Later, this clock
  is provided to the GPU node, thereby avoiding any ABI breakage
- used the CCU_GATE_CLK_OPS macro to implement a workaround for de-asserting
  the clkgen reset only after both core and sys clocks are enabled. This
  sequence is required to properly initialize the GPU

v4:
 - enhanced documentation for new Video Output (VO) clock inputs in device tree
   bindings

v3:
 - reworked driver to support multiple clock controllers through .compatible
   and .data instead of using multiple address spaces in dt-binding. This change
   allows to re-use the driver code for multiple clock controllers

v2:
 - removed AP_SUBSYS clock refactoring commits (1-6):
 - instead of refactoring, I opted to extend the current driver and its
   associated device tree node to include support for a second address space.
 - resolved all checkpatch issues using --strict, except for the call to
   devm_clk_hw_register_gate_parent_data().  The current implementation remains
   preferable in this context, and clang-format aligns with this choice

> 
> Michal Wilczynski (4):
>   dt-bindings: clock: thead: Add TH1520 VO clock controller
>   clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC
>   clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro
>   clk: thead: Add GPU clock gate control with CLKGEN reset support
> 
>  .../bindings/clock/thead,th1520-clk-ap.yaml   |  33 +-
>  drivers/clk/thead/clk-th1520-ap.c             | 298 ++++++++++++++++--
>  .../dt-bindings/clock/thead,th1520-clk-ap.h   |  34 ++
>  3 files changed, 334 insertions(+), 31 deletions(-)
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-03 14:36   ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Michal Wilczynski
@ 2025-03-05 23:47     ` Stephen Boyd
  2025-03-06 16:43       ` Michal Wilczynski
  0 siblings, 1 reply; 20+ messages in thread
From: Stephen Boyd @ 2025-03-05 23:47 UTC (permalink / raw)
  To: Michal Wilczynski, alex, aou, conor+dt, drew, guoren, jszhang,
	krzk+dt, m.szyprowski, mturquette, p.zabel, palmer, paul.walmsley,
	robh, wefu
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv,
	Michal Wilczynski

Quoting Michal Wilczynski (2025-03-03 06:36:29)
> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> clock gate is marked as "Reserved" in hardware, while core and cfg are
> configurable. In order for these clock gates to work properly, the
> CLKGEN reset must be managed in a specific sequence.
> 
> Move the CLKGEN reset handling to the clock driver since it's
> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> GPU clocks stay physically enabled without external interference from
> the reset driver.  The reset is now deasserted only when both core and
> cfg clocks are enabled, and asserted when either of them is disabled.
> 
> The mem clock is configured to use nop operations since it cannot be
> controlled.
> 
> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> 
> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
[...]
> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> index ea96d007aecd..1dfcde867233 100644
> --- a/drivers/clk/thead/clk-th1520-ap.c
> +++ b/drivers/clk/thead/clk-th1520-ap.c
> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
[...]
>  
>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> +
> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> +{
> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> +
> +       ccu_disable_helper(&cg->common, cg->enable);
> +
> +       if ((cg == &gpu_core_clk &&
> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> +           (cg == &gpu_cfg_aclk &&
> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> +               reset_control_assert(gpu_reset);

Why can't the clk consumer control the reset itself? Doing this here is
not ideal because we hold the clk lock when we try to grab the reset
lock. These are all spinlocks that should be small in lines of code
where the lock is held, but we're calling into an entire other framework
under a spinlock. If an (unrelated) reset driver tries to grab the clk
lock it will deadlock.

I see the commit text talks about this being a workaround. I'm not sure
what the workaround is though. I've seen designs where the reset doesn't
work unless the clk is enabled because the flops have to be clocking for
the reset to propagate a few cycles, or the clk has to be disabled so
that the reset controller can do the clocking, or vice versa for the clk
not working unless the reset is deasserted. Long story short, it's
different between SoCs.

Likely the reset and clk control should be coordinated in a PM domain
for the device so that when the device is active, the clks are enabled
and the reset is deasserted in the correct order for the SoC. Can you do
that?

> +
> +       spin_unlock_irqrestore(&gpu_reset_lock, flags);
> +}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-05 23:47     ` Stephen Boyd
@ 2025-03-06 16:43       ` Michal Wilczynski
  2025-03-13  9:25         ` Philipp Zabel
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-06 16:43 UTC (permalink / raw)
  To: Stephen Boyd, alex, aou, conor+dt, drew, guoren, jszhang, krzk+dt,
	m.szyprowski, mturquette, p.zabel, palmer, paul.walmsley, robh,
	wefu
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv

On 3/6/25 00:47, Stephen Boyd wrote:
> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>> configurable. In order for these clock gates to work properly, the
>> CLKGEN reset must be managed in a specific sequence.
>>
>> Move the CLKGEN reset handling to the clock driver since it's
>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>> GPU clocks stay physically enabled without external interference from
>> the reset driver.  The reset is now deasserted only when both core and
>> cfg clocks are enabled, and asserted when either of them is disabled.
>>
>> The mem clock is configured to use nop operations since it cannot be
>> controlled.
>>
>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>
>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> [...]
>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>> index ea96d007aecd..1dfcde867233 100644
>> --- a/drivers/clk/thead/clk-th1520-ap.c
>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> [...]
>>  
>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>> +
>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>> +{
>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>> +       unsigned long flags;
>> +
>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>> +
>> +       ccu_disable_helper(&cg->common, cg->enable);
>> +
>> +       if ((cg == &gpu_core_clk &&
>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>> +           (cg == &gpu_cfg_aclk &&
>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>> +               reset_control_assert(gpu_reset);
> 
> Why can't the clk consumer control the reset itself? Doing this here is
> not ideal because we hold the clk lock when we try to grab the reset
> lock. These are all spinlocks that should be small in lines of code
> where the lock is held, but we're calling into an entire other framework
> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> lock it will deadlock.

So in our case the clk consumer is the drm/imagination driver. Here is
the comment from the maintainer for my previous attempt to use a reset
driver to abstract the GPU init sequence [1]:

"Do you know what this resets? From our side, the GPU only has a single
reset line (which I assume to be GPU_RESET)."

"I don't love that this procedure appears in the platform reset driver.
I appreciate it may not be clear from the SoC TRM, but this is the
standard reset procedure for all IMG Rogue GPUs. The currently
supported TI SoC handles this in silicon, when power up/down requests
are sent so we never needed to encode it in the driver before.

Strictly speaking, the 32 cycle delay is required between power and
clocks being enabled and the reset line being deasserted. If nothing
here touches power or clocks (which I don't think it should), the delay
could potentially be lifted to the GPU driver." 

From the drm/imagination maintainers point of view their hardware has
only one reset, the extra CLKGEN reset is SoC specific. 

Also the reset driver maintainer didn't like my way of abstracting two
resets ("GPU" and and SoC specific"CLKGEN") into one reset to make it
seem to the consumer driver drm/imagination like there is only one
reset and suggested and attempt to code the re-setting in the clock
driver [2]. Even though he suggested a different way of achieving that: 

"In my mind it shouldn't be much: the GPU clocks could all share the
same refcounted implementation. The first clock to get enabled would
ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert
GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be
no-ops. Would that work?"

The above would have similar effect, but I felt like enabling both
clocks in single .enable callback could be confusing so I ended up with
the current approach. This could be easily re-done if you feel this
would be better.

I agree that using spinlocks here is dangerous, but looking at the code
of the reset_control_deassert and reset_control_assert, it doesn't seem
like any spinlocks are acquired/relased in that code flow, unless the
driver ops would introduce that. So in this specific case deadlock
shouldn't happen ?

[1] - https://lore.kernel.org/all/816db99d-7088-4c1a-af03-b9a825ac09dc@imgtec.com/
[2] - https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de/

> 
> I see the commit text talks about this being a workaround. I'm not sure
> what the workaround is though. I've seen designs where the reset doesn't
> work unless the clk is enabled because the flops have to be clocking for
> the reset to propagate a few cycles, or the clk has to be disabled so
> that the reset controller can do the clocking, or vice versa for the clk
> not working unless the reset is deasserted. Long story short, it's
> different between SoCs.

OK, so this is how GPU initialization needs to happen for this specific
SoC:

drm/imagination consumer driver:

pvr_power_device_resume():

clk_prepare_enable(pvr_dev->core_clk);
clk_prepare_enable(pvr_dev->sys_clk);
clk_prepare_enable(pvr_dev->mem_clk);

// Then deassert reset introduced in [3]
// [3] - https://lore.kernel.org/all/20250128194816.2185326-11-m.wilczynski@samsung.com/

// Eventually this would get called in the SoC specific reset driver
static void th1520_rst_gpu_enable(struct regmap *reg,
				  struct mutex *gpu_seq_lock)
{
	int val;

	mutex_lock(gpu_seq_lock);

	/* if the GPU is not in a reset state it, put it into one */
	regmap_read(reg, TH1520_GPU_RST_CFG, &val);
	if (val)
		regmap_update_bits(reg, TH1520_GPU_RST_CFG,
				   TH1520_GPU_RST_CFG_MASK, 0x0);

	/* rst gpu clkgen */
	regmap_set_bits(reg, TH1520_GPU_RST_CFG, TH1520_GPU_SW_CLKGEN_RST);

	/*
	 * According to the hardware manual, a delay of at least 32 clock
	 * cycles is required between de-asserting the clkgen reset and
	 * de-asserting the GPU reset. Assuming a worst-case scenario with
	 * a very high GPU clock frequency, a delay of 1 microsecond is
	 * sufficient to ensure this requirement is met across all
	 * feasible GPU clock speeds.
	 */
	udelay(1);

	/* rst gpu */
	regmap_set_bits(reg, TH1520_GPU_RST_CFG, TH1520_GPU_SW_GPU_RST);

	mutex_unlock(gpu_seq_lock);
}

Based on my testing this is exactly how the resets should happen, else
the GPU fails to initialize, and the drm/imagination driver hangs. To
reiterate: first power ON the GPU using power-domain driver. Then
drm/imagination enable all three clocks, then deassert reset of the GPU
CLKGEN (SoC specific), delay for 32 cycles, and then deassert the GPU reset.

> 
> Likely the reset and clk control should be coordinated in a PM domain
> for the device so that when the device is active, the clks are enabled
> and the reset is deasserted in the correct order for the SoC. Can you do
> that?

Obviously this would work, but I'm worried, the drm/imagination driver
maintainers would reject it, as for them this is a special case, from
their perspective there is only one reset line to their hardware, and
there are three clocks which they manage in driver already. And the PM
maintainers probably wouldn't be happy to take this as well.

At the very end maybe we could go back to abstracting the resets in the
reset driver, as the reset maintainer seemed to be open to it, if the
alternatives proves to be too problematic [4].

"I won't object to carry this in the reset driver if the clock
implementation turns out to be unreasonably complex, but I currently
don't expect that to be the case. Please give it a shot."

[4] - https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de/

> 
>> +
>> +       spin_unlock_irqrestore(&gpu_reset_lock, flags);
>> +}
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-06 16:43       ` Michal Wilczynski
@ 2025-03-13  9:25         ` Philipp Zabel
  2025-03-19  9:22           ` Michal Wilczynski
  0 siblings, 1 reply; 20+ messages in thread
From: Philipp Zabel @ 2025-03-13  9:25 UTC (permalink / raw)
  To: Michal Wilczynski, Stephen Boyd, alex, aou, conor+dt, drew,
	guoren, jszhang, krzk+dt, m.szyprowski, mturquette, palmer,
	paul.walmsley, robh, wefu
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv

On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> 
> On 3/6/25 00:47, Stephen Boyd wrote:
> > Quoting Michal Wilczynski (2025-03-03 06:36:29)
> > > The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> > > clock gate is marked as "Reserved" in hardware, while core and cfg are
> > > configurable. In order for these clock gates to work properly, the
> > > CLKGEN reset must be managed in a specific sequence.
> > > 
> > > Move the CLKGEN reset handling to the clock driver since it's
> > > fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> > > GPU clocks stay physically enabled without external interference from
> > > the reset driver.  The reset is now deasserted only when both core and
> > > cfg clocks are enabled, and asserted when either of them is disabled.
> > > 
> > > The mem clock is configured to use nop operations since it cannot be
> > > controlled.
> > > 
> > > Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> > > 
> > > Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> > [...]
> > > diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> > > index ea96d007aecd..1dfcde867233 100644
> > > --- a/drivers/clk/thead/clk-th1520-ap.c
> > > +++ b/drivers/clk/thead/clk-th1520-ap.c
> > > @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> > [...]
> > >  
> > >  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> > >                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> > > +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> > > +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> > > +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> > > +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> > > +
> > > +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> > > +{
> > > +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> > > +       unsigned long flags;
> > > +
> > > +       spin_lock_irqsave(&gpu_reset_lock, flags);
> > > +
> > > +       ccu_disable_helper(&cg->common, cg->enable);
> > > +
> > > +       if ((cg == &gpu_core_clk &&
> > > +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> > > +           (cg == &gpu_cfg_aclk &&
> > > +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> > > +               reset_control_assert(gpu_reset);
> > 
> > Why can't the clk consumer control the reset itself? Doing this here is
> > not ideal because we hold the clk lock when we try to grab the reset
> > lock. These are all spinlocks that should be small in lines of code
> > where the lock is held, but we're calling into an entire other framework
> > under a spinlock. If an (unrelated) reset driver tries to grab the clk
> > lock it will deadlock.
> 
> So in our case the clk consumer is the drm/imagination driver. Here is
> the comment from the maintainer for my previous attempt to use a reset
> driver to abstract the GPU init sequence [1]:
> 
> "Do you know what this resets? From our side, the GPU only has a single
> reset line (which I assume to be GPU_RESET)."
> 
> "I don't love that this procedure appears in the platform reset driver.
> I appreciate it may not be clear from the SoC TRM, but this is the
> standard reset procedure for all IMG Rogue GPUs. The currently
> supported TI SoC handles this in silicon, when power up/down requests
> are sent so we never needed to encode it in the driver before.
> 
> Strictly speaking, the 32 cycle delay is required between power and
> clocks being enabled and the reset line being deasserted. If nothing
> here touches power or clocks (which I don't think it should), the delay
> could potentially be lifted to the GPU driver." 
> 
> From the drm/imagination maintainers point of view their hardware has
> only one reset, the extra CLKGEN reset is SoC specific.

If I am understanding correctly, the CLKGEN reset doesn't reset
anything in the GPU itself, but holds the GPU clock generator block in
reset, effectively disabling the three GPU clocks as a workaround for
the always-ungated GPU_MEM clock.

> Also the reset driver maintainer didn't like my way of abstracting two
> resets ("GPU" and and SoC specific"CLKGEN") into one reset

That is one part of it. The other is that (according to my
understanding as laid out above), the combined GPU+CLKGEN reset would
effectively disable all three GPU clocks for a while, after the GPU
driver has already requested them to be enabled.

> to make it
> seem to the consumer driver drm/imagination like there is only one
> reset and suggested and attempt to code the re-setting in the clock
> driver [2]. Even though he suggested a different way of achieving that: 
> 
> "In my mind it shouldn't be much: the GPU clocks could all share the
> same refcounted implementation. The first clock to get enabled would
> ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert
> GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be
> no-ops. Would that work?"
> 
> The above would have similar effect, but I felt like enabling both
> clocks in single .enable callback could be confusing so I ended up with
> the current approach. This could be easily re-done if you feel this
> would be better.
> 
> I agree that using spinlocks here is dangerous, but looking at the code
> of the reset_control_deassert and reset_control_assert, it doesn't seem
> like any spinlocks are acquired/relased in that code flow, unless the
> driver ops would introduce that. So in this specific case deadlock
> shouldn't happen ?

There are no spinlocks in the reset_control_(de)assert paths in the
reset framework, but in general there could be in the driver. The thead
driver [1], uses regmap_update_bits() on a regmap with .fast_io = true,
which uses the internal struct regmap::spinlock.

[1] https://lore.kernel.org/all/20250303152511.494405-3-m.wilczynski@samsung.com/


regards
Philipp

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-13  9:25         ` Philipp Zabel
@ 2025-03-19  9:22           ` Michal Wilczynski
  2025-03-25 22:40             ` Stephen Boyd
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-03-19  9:22 UTC (permalink / raw)
  To: Philipp Zabel, Stephen Boyd, alex, aou, conor+dt, drew, guoren,
	jszhang, krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley,
	robh, wefu
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv



On 3/13/25 10:25, Philipp Zabel wrote:
> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>
>> On 3/6/25 00:47, Stephen Boyd wrote:
>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>> configurable. In order for these clock gates to work properly, the
>>>> CLKGEN reset must be managed in a specific sequence.
>>>>
>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>> GPU clocks stay physically enabled without external interference from
>>>> the reset driver.  The reset is now deasserted only when both core and
>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>
>>>> The mem clock is configured to use nop operations since it cannot be
>>>> controlled.
>>>>
>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>
>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>> [...]
>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>> index ea96d007aecd..1dfcde867233 100644
>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>> [...]
>>>>  
>>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>> +
>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>> +{
>>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>> +       unsigned long flags;
>>>> +
>>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>>>> +
>>>> +       ccu_disable_helper(&cg->common, cg->enable);
>>>> +
>>>> +       if ((cg == &gpu_core_clk &&
>>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>> +           (cg == &gpu_cfg_aclk &&
>>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>> +               reset_control_assert(gpu_reset);
>>>
>>> Why can't the clk consumer control the reset itself? Doing this here is
>>> not ideal because we hold the clk lock when we try to grab the reset
>>> lock. These are all spinlocks that should be small in lines of code
>>> where the lock is held, but we're calling into an entire other framework
>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>> lock it will deadlock.
>>
>> So in our case the clk consumer is the drm/imagination driver. Here is
>> the comment from the maintainer for my previous attempt to use a reset
>> driver to abstract the GPU init sequence [1]:
>>
>> "Do you know what this resets? From our side, the GPU only has a single
>> reset line (which I assume to be GPU_RESET)."
>>
>> "I don't love that this procedure appears in the platform reset driver.
>> I appreciate it may not be clear from the SoC TRM, but this is the
>> standard reset procedure for all IMG Rogue GPUs. The currently
>> supported TI SoC handles this in silicon, when power up/down requests
>> are sent so we never needed to encode it in the driver before.
>>
>> Strictly speaking, the 32 cycle delay is required between power and
>> clocks being enabled and the reset line being deasserted. If nothing
>> here touches power or clocks (which I don't think it should), the delay
>> could potentially be lifted to the GPU driver." 
>>
>> From the drm/imagination maintainers point of view their hardware has
>> only one reset, the extra CLKGEN reset is SoC specific.
> 
> If I am understanding correctly, the CLKGEN reset doesn't reset
> anything in the GPU itself, but holds the GPU clock generator block in
> reset, effectively disabling the three GPU clocks as a workaround for
> the always-ungated GPU_MEM clock.
> 
>> Also the reset driver maintainer didn't like my way of abstracting two
>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> 
> That is one part of it. The other is that (according to my
> understanding as laid out above), the combined GPU+CLKGEN reset would
> effectively disable all three GPU clocks for a while, after the GPU
> driver has already requested them to be enabled.

Thank you for your comments Philipp, it seems like we're on the same
page here. I was wondering whether there is anything I can do to move the
patches forward.

Stephen, if the current patch is a no go from your perspective could you
please advise whether there is a way to solve this in a clock that would
be acceptable to you.

Thanks,
Michał

> 
>> to make it
>> seem to the consumer driver drm/imagination like there is only one
>> reset and suggested and attempt to code the re-setting in the clock
>> driver [2]. Even though he suggested a different way of achieving that: 
>>
>> "In my mind it shouldn't be much: the GPU clocks could all share the
>> same refcounted implementation. The first clock to get enabled would
>> ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert
>> GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be
>> no-ops. Would that work?"
>>
>> The above would have similar effect, but I felt like enabling both
>> clocks in single .enable callback could be confusing so I ended up with
>> the current approach. This could be easily re-done if you feel this
>> would be better.
>>
>> I agree that using spinlocks here is dangerous, but looking at the code
>> of the reset_control_deassert and reset_control_assert, it doesn't seem
>> like any spinlocks are acquired/relased in that code flow, unless the
>> driver ops would introduce that. So in this specific case deadlock
>> shouldn't happen ?
> 
> There are no spinlocks in the reset_control_(de)assert paths in the
> reset framework, but in general there could be in the driver. The thead
> driver [1], uses regmap_update_bits() on a regmap with .fast_io = true,
> which uses the internal struct regmap::spinlock.
> 
> [1] https://lore.kernel.org/all/20250303152511.494405-3-m.wilczynski@samsung.com/
> 
> 
> regards
> Philipp
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-19  9:22           ` Michal Wilczynski
@ 2025-03-25 22:40             ` Stephen Boyd
  2025-03-26 11:24               ` Ulf Hansson
  0 siblings, 1 reply; 20+ messages in thread
From: Stephen Boyd @ 2025-03-25 22:40 UTC (permalink / raw)
  To: Michal Wilczynski, Philipp Zabel, alex, aou, conor+dt, drew,
	guoren, jszhang, krzk+dt, m.szyprowski, mturquette, palmer,
	paul.walmsley, robh, wefu, Ulf Hansson
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm

Quoting Michal Wilczynski (2025-03-19 02:22:11)
> 
> 
> On 3/13/25 10:25, Philipp Zabel wrote:
> > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> >>
> >> On 3/6/25 00:47, Stephen Boyd wrote:
> >>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
> >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
> >>>> configurable. In order for these clock gates to work properly, the
> >>>> CLKGEN reset must be managed in a specific sequence.
> >>>>
> >>>> Move the CLKGEN reset handling to the clock driver since it's
> >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> >>>> GPU clocks stay physically enabled without external interference from
> >>>> the reset driver.  The reset is now deasserted only when both core and
> >>>> cfg clocks are enabled, and asserted when either of them is disabled.
> >>>>
> >>>> The mem clock is configured to use nop operations since it cannot be
> >>>> controlled.
> >>>>
> >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> >>>>
> >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> >>> [...]
> >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> >>>> index ea96d007aecd..1dfcde867233 100644
> >>>> --- a/drivers/clk/thead/clk-th1520-ap.c
> >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
> >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> >>> [...]
> >>>>  
> >>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> >>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> >>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> >>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> >>>> +
> >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> >>>> +{
> >>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> >>>> +       unsigned long flags;
> >>>> +
> >>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> >>>> +
> >>>> +       ccu_disable_helper(&cg->common, cg->enable);
> >>>> +
> >>>> +       if ((cg == &gpu_core_clk &&
> >>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> >>>> +           (cg == &gpu_cfg_aclk &&
> >>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> >>>> +               reset_control_assert(gpu_reset);
> >>>
> >>> Why can't the clk consumer control the reset itself? Doing this here is
> >>> not ideal because we hold the clk lock when we try to grab the reset
> >>> lock. These are all spinlocks that should be small in lines of code
> >>> where the lock is held, but we're calling into an entire other framework
> >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> >>> lock it will deadlock.
> >>
> >> So in our case the clk consumer is the drm/imagination driver. Here is
> >> the comment from the maintainer for my previous attempt to use a reset
> >> driver to abstract the GPU init sequence [1]:
> >>
> >> "Do you know what this resets? From our side, the GPU only has a single
> >> reset line (which I assume to be GPU_RESET)."
> >>
> >> "I don't love that this procedure appears in the platform reset driver.
> >> I appreciate it may not be clear from the SoC TRM, but this is the
> >> standard reset procedure for all IMG Rogue GPUs. The currently
> >> supported TI SoC handles this in silicon, when power up/down requests
> >> are sent so we never needed to encode it in the driver before.
> >>
> >> Strictly speaking, the 32 cycle delay is required between power and
> >> clocks being enabled and the reset line being deasserted. If nothing
> >> here touches power or clocks (which I don't think it should), the delay
> >> could potentially be lifted to the GPU driver." 
> >>
> >> From the drm/imagination maintainers point of view their hardware has
> >> only one reset, the extra CLKGEN reset is SoC specific.
> > 
> > If I am understanding correctly, the CLKGEN reset doesn't reset
> > anything in the GPU itself, but holds the GPU clock generator block in
> > reset, effectively disabling the three GPU clocks as a workaround for
> > the always-ungated GPU_MEM clock.
> > 
> >> Also the reset driver maintainer didn't like my way of abstracting two
> >> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> > 
> > That is one part of it. The other is that (according to my
> > understanding as laid out above), the combined GPU+CLKGEN reset would
> > effectively disable all three GPU clocks for a while, after the GPU
> > driver has already requested them to be enabled.
> 
> Thank you for your comments Philipp, it seems like we're on the same
> page here. I was wondering whether there is anything I can do to move the
> patches forward.
> 
> Stephen, if the current patch is a no go from your perspective could you
> please advise whether there is a way to solve this in a clock that would
> be acceptable to you.

It looks like the SoC glue makes the interactions between the clk and
reset frameworks complicated because GPU clks don't work if a reset is
asserted. You're trying to find a place to coordinate the clk and reset.
Am I right?

I'd advise managing the clks and resets in a generic power domain that
is attached to the GPU device. In that power domain, coordinate the clk
and reset sequencing so that the reset is deasserted before the clks are
enabled (or whatever the actual requirement is). If the GPU driver
_must_ have a clk and reset pointer to use, implement one that either
does nothing or flag to the GPU driver that the power domain is managing
all this for it so it should just use runtime PM and system PM hooks to
turn on the clks and take the GPU out of reset.

From what I can tell, the GPU driver maintainer doesn't want to think
about the wrapper that likely got placed around the hardware block
shipped by IMG. This wrapper is the SoC glue that needs to go into a
generic power domain so that the different PM resources, reset, clk,
etc. can be coordinated based on the GPU device's power state. It's
either that, or go the dwc3 route and have SoC glue platform drivers
that manage this stuff and create a child device to represent the hard
macro shipped by the vendor like Synopsys/Imagination. Doing the parent
device design isn't as flexible as PM domains because you can only have
one parent device and the child device state can be ignored vs. many PM
domains attached in a graph to a device that are more directly
influenced by the device using runtime PM.

Maybe you'll be heartened to know this problem isn't unique and has
existed for decades :) I don't know what state the graphics driver is in
but they'll likely be interested in solving this problem in a way that
doesn't "pollute" their driver with SoC specific details. It's all a
question of where you put the code. The reset framework wants to focus
on resets, the clk framework wants to focus on clks, and the graphics
driver wants to focus on graphics. BTW, we went through a similar
discussion with regulators and clks years ago and ended up handling that
with OPPs and power domains.

I believe a PM domain is the right place for this kind of stuff, and I
actually presented on this topic at OSSEU[1], but I don't maintain that
code. Ulf does.

[1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-25 22:40             ` Stephen Boyd
@ 2025-03-26 11:24               ` Ulf Hansson
  2025-04-01 14:38                 ` Michal Wilczynski
  0 siblings, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2025-03-26 11:24 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Michal Wilczynski, Philipp Zabel, alex, aou, conor+dt, drew,
	guoren, jszhang, krzk+dt, m.szyprowski, mturquette, palmer,
	paul.walmsley, robh, wefu, linux-clk, devicetree, linux-kernel,
	linux-riscv, linux-pm

On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>
> Quoting Michal Wilczynski (2025-03-19 02:22:11)
> >
> >
> > On 3/13/25 10:25, Philipp Zabel wrote:
> > > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> > >>
> > >> On 3/6/25 00:47, Stephen Boyd wrote:
> > >>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
> > >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> > >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
> > >>>> configurable. In order for these clock gates to work properly, the
> > >>>> CLKGEN reset must be managed in a specific sequence.
> > >>>>
> > >>>> Move the CLKGEN reset handling to the clock driver since it's
> > >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> > >>>> GPU clocks stay physically enabled without external interference from
> > >>>> the reset driver.  The reset is now deasserted only when both core and
> > >>>> cfg clocks are enabled, and asserted when either of them is disabled.
> > >>>>
> > >>>> The mem clock is configured to use nop operations since it cannot be
> > >>>> controlled.
> > >>>>
> > >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> > >>>>
> > >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> > >>> [...]
> > >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> index ea96d007aecd..1dfcde867233 100644
> > >>>> --- a/drivers/clk/thead/clk-th1520-ap.c
> > >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> > >>> [...]
> > >>>>
> > >>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> > >>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> > >>>> +
> > >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> > >>>> +{
> > >>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> > >>>> +       unsigned long flags;
> > >>>> +
> > >>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> > >>>> +
> > >>>> +       ccu_disable_helper(&cg->common, cg->enable);
> > >>>> +
> > >>>> +       if ((cg == &gpu_core_clk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> > >>>> +           (cg == &gpu_cfg_aclk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> > >>>> +               reset_control_assert(gpu_reset);
> > >>>
> > >>> Why can't the clk consumer control the reset itself? Doing this here is
> > >>> not ideal because we hold the clk lock when we try to grab the reset
> > >>> lock. These are all spinlocks that should be small in lines of code
> > >>> where the lock is held, but we're calling into an entire other framework
> > >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> > >>> lock it will deadlock.
> > >>
> > >> So in our case the clk consumer is the drm/imagination driver. Here is
> > >> the comment from the maintainer for my previous attempt to use a reset
> > >> driver to abstract the GPU init sequence [1]:
> > >>
> > >> "Do you know what this resets? From our side, the GPU only has a single
> > >> reset line (which I assume to be GPU_RESET)."
> > >>
> > >> "I don't love that this procedure appears in the platform reset driver.
> > >> I appreciate it may not be clear from the SoC TRM, but this is the
> > >> standard reset procedure for all IMG Rogue GPUs. The currently
> > >> supported TI SoC handles this in silicon, when power up/down requests
> > >> are sent so we never needed to encode it in the driver before.
> > >>
> > >> Strictly speaking, the 32 cycle delay is required between power and
> > >> clocks being enabled and the reset line being deasserted. If nothing
> > >> here touches power or clocks (which I don't think it should), the delay
> > >> could potentially be lifted to the GPU driver."
> > >>
> > >> From the drm/imagination maintainers point of view their hardware has
> > >> only one reset, the extra CLKGEN reset is SoC specific.
> > >
> > > If I am understanding correctly, the CLKGEN reset doesn't reset
> > > anything in the GPU itself, but holds the GPU clock generator block in
> > > reset, effectively disabling the three GPU clocks as a workaround for
> > > the always-ungated GPU_MEM clock.
> > >
> > >> Also the reset driver maintainer didn't like my way of abstracting two
> > >> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> > >
> > > That is one part of it. The other is that (according to my
> > > understanding as laid out above), the combined GPU+CLKGEN reset would
> > > effectively disable all three GPU clocks for a while, after the GPU
> > > driver has already requested them to be enabled.
> >
> > Thank you for your comments Philipp, it seems like we're on the same
> > page here. I was wondering whether there is anything I can do to move the
> > patches forward.
> >
> > Stephen, if the current patch is a no go from your perspective could you
> > please advise whether there is a way to solve this in a clock that would
> > be acceptable to you.
>
> It looks like the SoC glue makes the interactions between the clk and
> reset frameworks complicated because GPU clks don't work if a reset is
> asserted. You're trying to find a place to coordinate the clk and reset.
> Am I right?
>
> I'd advise managing the clks and resets in a generic power domain that
> is attached to the GPU device. In that power domain, coordinate the clk
> and reset sequencing so that the reset is deasserted before the clks are
> enabled (or whatever the actual requirement is). If the GPU driver
> _must_ have a clk and reset pointer to use, implement one that either
> does nothing or flag to the GPU driver that the power domain is managing
> all this for it so it should just use runtime PM and system PM hooks to
> turn on the clks and take the GPU out of reset.
>
> From what I can tell, the GPU driver maintainer doesn't want to think
> about the wrapper that likely got placed around the hardware block
> shipped by IMG. This wrapper is the SoC glue that needs to go into a
> generic power domain so that the different PM resources, reset, clk,
> etc. can be coordinated based on the GPU device's power state. It's
> either that, or go the dwc3 route and have SoC glue platform drivers
> that manage this stuff and create a child device to represent the hard
> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
> device design isn't as flexible as PM domains because you can only have
> one parent device and the child device state can be ignored vs. many PM
> domains attached in a graph to a device that are more directly
> influenced by the device using runtime PM.
>
> Maybe you'll be heartened to know this problem isn't unique and has
> existed for decades :) I don't know what state the graphics driver is in
> but they'll likely be interested in solving this problem in a way that
> doesn't "pollute" their driver with SoC specific details. It's all a
> question of where you put the code. The reset framework wants to focus
> on resets, the clk framework wants to focus on clks, and the graphics
> driver wants to focus on graphics. BTW, we went through a similar
> discussion with regulators and clks years ago and ended up handling that
> with OPPs and power domains.

Right, power-domain providers are mostly implementing SoC specific code.

In some cases, power-domain providers also handle per device SoC
specific constraints/sequences, which seems what you are discussing
here. For that, genpd has a couple of callbacks that could be
interesting to have a look at, such as:

genpd->attach|detach_dev() - for probe/remove
genpd.dev_ops->start|stop() - for runtime/system PM

That said, maybe just using the regular genpd->power_on|off() callback
is sufficient here, depending on how you decide to model things.

>
> I believe a PM domain is the right place for this kind of stuff, and I
> actually presented on this topic at OSSEU[1], but I don't maintain that
> code. Ulf does.
>
> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-26 11:24               ` Ulf Hansson
@ 2025-04-01 14:38                 ` Michal Wilczynski
  2025-04-07 17:02                   ` Michal Wilczynski
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-04-01 14:38 UTC (permalink / raw)
  To: Ulf Hansson, Stephen Boyd
  Cc: Philipp Zabel, alex, aou, conor+dt, drew, guoren, jszhang,
	krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley, robh,
	wefu, linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm



On 3/26/25 12:24, Ulf Hansson wrote:
> On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>>
>> Quoting Michal Wilczynski (2025-03-19 02:22:11)
>>>
>>>
>>> On 3/13/25 10:25, Philipp Zabel wrote:
>>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>>>>
>>>>> On 3/6/25 00:47, Stephen Boyd wrote:
>>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>>>>> configurable. In order for these clock gates to work properly, the
>>>>>>> CLKGEN reset must be managed in a specific sequence.
>>>>>>>
>>>>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>>>>> GPU clocks stay physically enabled without external interference from
>>>>>>> the reset driver.  The reset is now deasserted only when both core and
>>>>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>>>>
>>>>>>> The mem clock is configured to use nop operations since it cannot be
>>>>>>> controlled.
>>>>>>>
>>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>>>>
>>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>>>>> [...]
>>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> index ea96d007aecd..1dfcde867233 100644
>>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>>>>> [...]
>>>>>>>
>>>>>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>>>>> +
>>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>>>>> +{
>>>>>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>>>>> +       unsigned long flags;
>>>>>>> +
>>>>>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>>>>>>> +
>>>>>>> +       ccu_disable_helper(&cg->common, cg->enable);
>>>>>>> +
>>>>>>> +       if ((cg == &gpu_core_clk &&
>>>>>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>>>>> +           (cg == &gpu_cfg_aclk &&
>>>>>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>>>>> +               reset_control_assert(gpu_reset);
>>>>>>
>>>>>> Why can't the clk consumer control the reset itself? Doing this here is
>>>>>> not ideal because we hold the clk lock when we try to grab the reset
>>>>>> lock. These are all spinlocks that should be small in lines of code
>>>>>> where the lock is held, but we're calling into an entire other framework
>>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>>>>> lock it will deadlock.
>>>>>
>>>>> So in our case the clk consumer is the drm/imagination driver. Here is
>>>>> the comment from the maintainer for my previous attempt to use a reset
>>>>> driver to abstract the GPU init sequence [1]:
>>>>>
>>>>> "Do you know what this resets? From our side, the GPU only has a single
>>>>> reset line (which I assume to be GPU_RESET)."
>>>>>
>>>>> "I don't love that this procedure appears in the platform reset driver.
>>>>> I appreciate it may not be clear from the SoC TRM, but this is the
>>>>> standard reset procedure for all IMG Rogue GPUs. The currently
>>>>> supported TI SoC handles this in silicon, when power up/down requests
>>>>> are sent so we never needed to encode it in the driver before.
>>>>>
>>>>> Strictly speaking, the 32 cycle delay is required between power and
>>>>> clocks being enabled and the reset line being deasserted. If nothing
>>>>> here touches power or clocks (which I don't think it should), the delay
>>>>> could potentially be lifted to the GPU driver."
>>>>>
>>>>> From the drm/imagination maintainers point of view their hardware has
>>>>> only one reset, the extra CLKGEN reset is SoC specific.
>>>>
>>>> If I am understanding correctly, the CLKGEN reset doesn't reset
>>>> anything in the GPU itself, but holds the GPU clock generator block in
>>>> reset, effectively disabling the three GPU clocks as a workaround for
>>>> the always-ungated GPU_MEM clock.
>>>>
>>>>> Also the reset driver maintainer didn't like my way of abstracting two
>>>>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
>>>>
>>>> That is one part of it. The other is that (according to my
>>>> understanding as laid out above), the combined GPU+CLKGEN reset would
>>>> effectively disable all three GPU clocks for a while, after the GPU
>>>> driver has already requested them to be enabled.
>>>
>>> Thank you for your comments Philipp, it seems like we're on the same
>>> page here. I was wondering whether there is anything I can do to move the
>>> patches forward.
>>>
>>> Stephen, if the current patch is a no go from your perspective could you
>>> please advise whether there is a way to solve this in a clock that would
>>> be acceptable to you.
>>
>> It looks like the SoC glue makes the interactions between the clk and
>> reset frameworks complicated because GPU clks don't work if a reset is
>> asserted. You're trying to find a place to coordinate the clk and reset.
>> Am I right?
>>
>> I'd advise managing the clks and resets in a generic power domain that
>> is attached to the GPU device. In that power domain, coordinate the clk
>> and reset sequencing so that the reset is deasserted before the clks are
>> enabled (or whatever the actual requirement is). If the GPU driver
>> _must_ have a clk and reset pointer to use, implement one that either
>> does nothing or flag to the GPU driver that the power domain is managing
>> all this for it so it should just use runtime PM and system PM hooks to
>> turn on the clks and take the GPU out of reset.
>>
>> From what I can tell, the GPU driver maintainer doesn't want to think
>> about the wrapper that likely got placed around the hardware block
>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
>> generic power domain so that the different PM resources, reset, clk,
>> etc. can be coordinated based on the GPU device's power state. It's
>> either that, or go the dwc3 route and have SoC glue platform drivers
>> that manage this stuff and create a child device to represent the hard
>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
>> device design isn't as flexible as PM domains because you can only have
>> one parent device and the child device state can be ignored vs. many PM
>> domains attached in a graph to a device that are more directly
>> influenced by the device using runtime PM.
>>
>> Maybe you'll be heartened to know this problem isn't unique and has
>> existed for decades :) I don't know what state the graphics driver is in
>> but they'll likely be interested in solving this problem in a way that
>> doesn't "pollute" their driver with SoC specific details. It's all a
>> question of where you put the code. The reset framework wants to focus
>> on resets, the clk framework wants to focus on clks, and the graphics
>> driver wants to focus on graphics. BTW, we went through a similar
>> discussion with regulators and clks years ago and ended up handling that
>> with OPPs and power domains.
> 
> Right, power-domain providers are mostly implementing SoC specific code.
> 
> In some cases, power-domain providers also handle per device SoC
> specific constraints/sequences, which seems what you are discussing
> here. For that, genpd has a couple of callbacks that could be
> interesting to have a look at, such as:
> 
> genpd->attach|detach_dev() - for probe/remove
> genpd.dev_ops->start|stop() - for runtime/system PM
> 
> That said, maybe just using the regular genpd->power_on|off() callback
> is sufficient here, depending on how you decide to model things.


Thanks Stephen, Ulf !

So the way forward I see:

1) The reset driver can be merged as-is, if Philipp is fine with this
code [2].
2) I will cook up the update to the thead power-domain driver which will
handle reset and clock management.
3) I think it would be best to convince the drm/imagination maintainers to
make the clock management in their consumer driver optional. This way if
there is a SoC specific sequence the clocks/resets will be managed from
generic PM driver which is SoC specific. Will talk to them.
4) Will remove the reset management from this series, and re-send.

[2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/
> 
>>
>> I believe a PM domain is the right place for this kind of stuff, and I
>> actually presented on this topic at OSSEU[1], but I don't maintain that
>> code. Ulf does.
>>
>> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

Thanks ! Watched the presentation, very interesting. Hopefully I'll be
able to attend in person in Amsterdam this year if you're presenting :-)

> 
> Kind regards
> Uffe
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-04-01 14:38                 ` Michal Wilczynski
@ 2025-04-07 17:02                   ` Michal Wilczynski
  2025-04-08 12:27                     ` Ulf Hansson
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Wilczynski @ 2025-04-07 17:02 UTC (permalink / raw)
  To: Ulf Hansson, Stephen Boyd
  Cc: Philipp Zabel, alex, aou, conor+dt, drew, guoren, jszhang,
	krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley, robh,
	wefu, linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm



On 4/1/25 16:38, Michal Wilczynski wrote:
> 
> 
> On 3/26/25 12:24, Ulf Hansson wrote:
>> On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>>>
>>> Quoting Michal Wilczynski (2025-03-19 02:22:11)
>>>>
>>>>
>>>> On 3/13/25 10:25, Philipp Zabel wrote:
>>>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>>>>>
>>>>>> On 3/6/25 00:47, Stephen Boyd wrote:
>>>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>>>>>> configurable. In order for these clock gates to work properly, the
>>>>>>>> CLKGEN reset must be managed in a specific sequence.
>>>>>>>>
>>>>>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>>>>>> GPU clocks stay physically enabled without external interference from
>>>>>>>> the reset driver.  The reset is now deasserted only when both core and
>>>>>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>>>>>
>>>>>>>> The mem clock is configured to use nop operations since it cannot be
>>>>>>>> controlled.
>>>>>>>>
>>>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>>>>>
>>>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>>>>>> [...]
>>>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> index ea96d007aecd..1dfcde867233 100644
>>>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>>>>>> [...]
>>>>>>>>
>>>>>>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>>>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>>>>>> +
>>>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>>>>>> +{
>>>>>>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>>>>>> +       unsigned long flags;
>>>>>>>> +
>>>>>>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>>>>>>>> +
>>>>>>>> +       ccu_disable_helper(&cg->common, cg->enable);
>>>>>>>> +
>>>>>>>> +       if ((cg == &gpu_core_clk &&
>>>>>>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>>>>>> +           (cg == &gpu_cfg_aclk &&
>>>>>>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>>>>>> +               reset_control_assert(gpu_reset);
>>>>>>>
>>>>>>> Why can't the clk consumer control the reset itself? Doing this here is
>>>>>>> not ideal because we hold the clk lock when we try to grab the reset
>>>>>>> lock. These are all spinlocks that should be small in lines of code
>>>>>>> where the lock is held, but we're calling into an entire other framework
>>>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>>>>>> lock it will deadlock.
>>>>>>
>>>>>> So in our case the clk consumer is the drm/imagination driver. Here is
>>>>>> the comment from the maintainer for my previous attempt to use a reset
>>>>>> driver to abstract the GPU init sequence [1]:
>>>>>>
>>>>>> "Do you know what this resets? From our side, the GPU only has a single
>>>>>> reset line (which I assume to be GPU_RESET)."
>>>>>>
>>>>>> "I don't love that this procedure appears in the platform reset driver.
>>>>>> I appreciate it may not be clear from the SoC TRM, but this is the
>>>>>> standard reset procedure for all IMG Rogue GPUs. The currently
>>>>>> supported TI SoC handles this in silicon, when power up/down requests
>>>>>> are sent so we never needed to encode it in the driver before.
>>>>>>
>>>>>> Strictly speaking, the 32 cycle delay is required between power and
>>>>>> clocks being enabled and the reset line being deasserted. If nothing
>>>>>> here touches power or clocks (which I don't think it should), the delay
>>>>>> could potentially be lifted to the GPU driver."
>>>>>>
>>>>>> From the drm/imagination maintainers point of view their hardware has
>>>>>> only one reset, the extra CLKGEN reset is SoC specific.
>>>>>
>>>>> If I am understanding correctly, the CLKGEN reset doesn't reset
>>>>> anything in the GPU itself, but holds the GPU clock generator block in
>>>>> reset, effectively disabling the three GPU clocks as a workaround for
>>>>> the always-ungated GPU_MEM clock.
>>>>>
>>>>>> Also the reset driver maintainer didn't like my way of abstracting two
>>>>>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
>>>>>
>>>>> That is one part of it. The other is that (according to my
>>>>> understanding as laid out above), the combined GPU+CLKGEN reset would
>>>>> effectively disable all three GPU clocks for a while, after the GPU
>>>>> driver has already requested them to be enabled.
>>>>
>>>> Thank you for your comments Philipp, it seems like we're on the same
>>>> page here. I was wondering whether there is anything I can do to move the
>>>> patches forward.
>>>>
>>>> Stephen, if the current patch is a no go from your perspective could you
>>>> please advise whether there is a way to solve this in a clock that would
>>>> be acceptable to you.
>>>
>>> It looks like the SoC glue makes the interactions between the clk and
>>> reset frameworks complicated because GPU clks don't work if a reset is
>>> asserted. You're trying to find a place to coordinate the clk and reset.
>>> Am I right?
>>>
>>> I'd advise managing the clks and resets in a generic power domain that
>>> is attached to the GPU device. In that power domain, coordinate the clk
>>> and reset sequencing so that the reset is deasserted before the clks are
>>> enabled (or whatever the actual requirement is). If the GPU driver
>>> _must_ have a clk and reset pointer to use, implement one that either
>>> does nothing or flag to the GPU driver that the power domain is managing
>>> all this for it so it should just use runtime PM and system PM hooks to
>>> turn on the clks and take the GPU out of reset.
>>>
>>> From what I can tell, the GPU driver maintainer doesn't want to think
>>> about the wrapper that likely got placed around the hardware block
>>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
>>> generic power domain so that the different PM resources, reset, clk,
>>> etc. can be coordinated based on the GPU device's power state. It's
>>> either that, or go the dwc3 route and have SoC glue platform drivers
>>> that manage this stuff and create a child device to represent the hard
>>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
>>> device design isn't as flexible as PM domains because you can only have
>>> one parent device and the child device state can be ignored vs. many PM
>>> domains attached in a graph to a device that are more directly
>>> influenced by the device using runtime PM.
>>>
>>> Maybe you'll be heartened to know this problem isn't unique and has
>>> existed for decades :) I don't know what state the graphics driver is in
>>> but they'll likely be interested in solving this problem in a way that
>>> doesn't "pollute" their driver with SoC specific details. It's all a
>>> question of where you put the code. The reset framework wants to focus
>>> on resets, the clk framework wants to focus on clks, and the graphics
>>> driver wants to focus on graphics. BTW, we went through a similar
>>> discussion with regulators and clks years ago and ended up handling that
>>> with OPPs and power domains.
>>
>> Right, power-domain providers are mostly implementing SoC specific code.
>>
>> In some cases, power-domain providers also handle per device SoC
>> specific constraints/sequences, which seems what you are discussing
>> here. For that, genpd has a couple of callbacks that could be
>> interesting to have a look at, such as:
>>
>> genpd->attach|detach_dev() - for probe/remove
>> genpd.dev_ops->start|stop() - for runtime/system PM
>>
>> That said, maybe just using the regular genpd->power_on|off() callback
>> is sufficient here, depending on how you decide to model things.
> 
> 
> Thanks Stephen, Ulf !
> 
> So the way forward I see:
> 
> 1) The reset driver can be merged as-is, if Philipp is fine with this
> code [2].
> 2) I will cook up the update to the thead power-domain driver which will
> handle reset and clock management.

Hi Ulf,
I'm working on the series right now and I wanted to ask you how you
prefer versioning to be handled. Would you like me to send a series as a
v1, or treat is as a continuation of this series [1] and send as a v9.
Would like to avoid any confusion. 

Thanks,
Michał

[1] - https://lore.kernel.org/all/20250311171900.1549916-1-m.wilczynski@samsung.com/

> 3) I think it would be best to convince the drm/imagination maintainers to
> make the clock management in their consumer driver optional. This way if
> there is a SoC specific sequence the clocks/resets will be managed from
> generic PM driver which is SoC specific. Will talk to them.
> 4) Will remove the reset management from this series, and re-send.
> 
> [2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/
>>
>>>
>>> I believe a PM domain is the right place for this kind of stuff, and I
>>> actually presented on this topic at OSSEU[1], but I don't maintain that
>>> code. Ulf does.
>>>
>>> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google
> 
> Thanks ! Watched the presentation, very interesting. Hopefully I'll be
> able to attend in person in Amsterdam this year if you're presenting :-)
> 
>>
>> Kind regards
>> Uffe
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-04-07 17:02                   ` Michal Wilczynski
@ 2025-04-08 12:27                     ` Ulf Hansson
  0 siblings, 0 replies; 20+ messages in thread
From: Ulf Hansson @ 2025-04-08 12:27 UTC (permalink / raw)
  To: Michal Wilczynski
  Cc: Stephen Boyd, Philipp Zabel, alex, aou, conor+dt, drew, guoren,
	jszhang, krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley,
	robh, wefu, linux-clk, devicetree, linux-kernel, linux-riscv,
	linux-pm

[...]

> >>>
> >>> It looks like the SoC glue makes the interactions between the clk and
> >>> reset frameworks complicated because GPU clks don't work if a reset is
> >>> asserted. You're trying to find a place to coordinate the clk and reset.
> >>> Am I right?
> >>>
> >>> I'd advise managing the clks and resets in a generic power domain that
> >>> is attached to the GPU device. In that power domain, coordinate the clk
> >>> and reset sequencing so that the reset is deasserted before the clks are
> >>> enabled (or whatever the actual requirement is). If the GPU driver
> >>> _must_ have a clk and reset pointer to use, implement one that either
> >>> does nothing or flag to the GPU driver that the power domain is managing
> >>> all this for it so it should just use runtime PM and system PM hooks to
> >>> turn on the clks and take the GPU out of reset.
> >>>
> >>> From what I can tell, the GPU driver maintainer doesn't want to think
> >>> about the wrapper that likely got placed around the hardware block
> >>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
> >>> generic power domain so that the different PM resources, reset, clk,
> >>> etc. can be coordinated based on the GPU device's power state. It's
> >>> either that, or go the dwc3 route and have SoC glue platform drivers
> >>> that manage this stuff and create a child device to represent the hard
> >>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
> >>> device design isn't as flexible as PM domains because you can only have
> >>> one parent device and the child device state can be ignored vs. many PM
> >>> domains attached in a graph to a device that are more directly
> >>> influenced by the device using runtime PM.
> >>>
> >>> Maybe you'll be heartened to know this problem isn't unique and has
> >>> existed for decades :) I don't know what state the graphics driver is in
> >>> but they'll likely be interested in solving this problem in a way that
> >>> doesn't "pollute" their driver with SoC specific details. It's all a
> >>> question of where you put the code. The reset framework wants to focus
> >>> on resets, the clk framework wants to focus on clks, and the graphics
> >>> driver wants to focus on graphics. BTW, we went through a similar
> >>> discussion with regulators and clks years ago and ended up handling that
> >>> with OPPs and power domains.
> >>
> >> Right, power-domain providers are mostly implementing SoC specific code.
> >>
> >> In some cases, power-domain providers also handle per device SoC
> >> specific constraints/sequences, which seems what you are discussing
> >> here. For that, genpd has a couple of callbacks that could be
> >> interesting to have a look at, such as:
> >>
> >> genpd->attach|detach_dev() - for probe/remove
> >> genpd.dev_ops->start|stop() - for runtime/system PM
> >>
> >> That said, maybe just using the regular genpd->power_on|off() callback
> >> is sufficient here, depending on how you decide to model things.
> >
> >
> > Thanks Stephen, Ulf !
> >
> > So the way forward I see:
> >
> > 1) The reset driver can be merged as-is, if Philipp is fine with this
> > code [2].
> > 2) I will cook up the update to the thead power-domain driver which will
> > handle reset and clock management.
>
> Hi Ulf,
> I'm working on the series right now and I wanted to ask you how you
> prefer versioning to be handled. Would you like me to send a series as a
> v1, or treat is as a continuation of this series [1] and send as a v9.
> Would like to avoid any confusion.

I would suggest starting over with v1, but don't forget to refer to
some of the previous attempts/discussion in the cover-letter.  At
least I would be fine by this.

>
> Thanks,
> Michał
>
> [1] - https://lore.kernel.org/all/20250311171900.1549916-1-m.wilczynski@samsung.com/
>

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-04-08 12:27 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20250303143634eucas1p269281f72bdc4d764edd54b9427f68787@eucas1p2.samsung.com>
2025-03-03 14:36 ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski
2025-03-03 14:36   ` [PATCH v1 1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller Michal Wilczynski
2025-03-03 16:24     ` Conor Dooley
2025-03-03 17:41     ` Krzysztof Kozlowski
2025-03-03 17:46       ` Krzysztof Kozlowski
2025-03-04  7:39         ` Michal Wilczynski
2025-03-04  7:41           ` Krzysztof Kozlowski
2025-03-03 14:36   ` [PATCH v1 2/4] clk: thead: Add clock support for VO subsystem in T-Head TH1520 SoC Michal Wilczynski
2025-03-03 14:36   ` [PATCH v1 3/4] clk: thead: Add support for custom ops in CCU_GATE_CLK_OPS macro Michal Wilczynski
2025-03-03 14:36   ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Michal Wilczynski
2025-03-05 23:47     ` Stephen Boyd
2025-03-06 16:43       ` Michal Wilczynski
2025-03-13  9:25         ` Philipp Zabel
2025-03-19  9:22           ` Michal Wilczynski
2025-03-25 22:40             ` Stephen Boyd
2025-03-26 11:24               ` Ulf Hansson
2025-04-01 14:38                 ` Michal Wilczynski
2025-04-07 17:02                   ` Michal Wilczynski
2025-04-08 12:27                     ` Ulf Hansson
2025-03-05  8:18   ` [PATCH v1 0/4] Add T-Head TH1520 VO clock support for LicheePi 4A GPU enablement Michal Wilczynski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).