[PATCH 0/5] Properly Limit Tegra210 Clock Rates

* [PATCH 0/5] Properly Limit Tegra210 Clock Rates
@ 2025-08-16  5:53 Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency Aaron Kling via B4 Relay
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

The Tegra210 CVB tables were added in commit 2b2dbc2f94e5. Since then,
all Tegra210 SoCs have tried to scale the CPU to 1.9 GHz, when the
supported devkits are only supposed to scale to 1.5 or 1.7 GHz.
Overclocking should not be the default state.

Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
Aaron Kling (5):
      dt-bindings: clock: tegra124-dfll: Add property to limit frequency
      soc: tegra: fuse: speedo-tegra210: Update speedo ids
      soc: tegra: fuse: speedo-tegra210: Add sku 0x8F
      clk: tegra: dfll: Support limiting max clock per device
      arm64: tegra: Limit max cpu frequency on P3450

 .../bindings/clock/nvidia,tegra124-dfll.txt        |  3 ++
 arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts |  1 +
 drivers/clk/tegra/clk-tegra124-dfll-fcpu.c         |  8 ++++-
 drivers/soc/tegra/fuse/speedo-tegra210.c           | 39 ++++++++++++++++++----
 4 files changed, 43 insertions(+), 8 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250812-tegra210-speedo-470691e8b8cc

Best regards,
-- 
Aaron Kling <webgeek1234@gmail.com>




* [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency
  2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
@ 2025-08-16  5:53 ` Aaron Kling via B4 Relay
  2025-08-16  8:21   ` Krzysztof Kozlowski
  2025-08-16  5:53 ` [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids Aaron Kling via B4 Relay
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

From: Aaron Kling <webgeek1234@gmail.com>

Some devices report a cpu speedo value that corresponds to a table that
scales beyond the chip's capability. This allows devices to set a lower
limit.

Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
 Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
index f7d347385b5775ddd702ecbb9821acfc9d4b9ff2..6cdbabc1f036a767bdc8e5df64eeff34171a3b85 100644
--- a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
+++ b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
@@ -70,6 +70,9 @@ Required properties for PWM mode:
   - dvfs_pwm_enable: I/O pad configuration when PWM control is enabled.
   - dvfs_pwm_disable: I/O pad configuration when PWM control is disabled.
 
+Optional properties for limiting frequency:
+- nvidia,dfll-max-freq: Maximum scaling frequency.
+
 Example for I2C:
 
 clock@70110000 {

-- 
2.50.1




* [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids
  2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency Aaron Kling via B4 Relay
@ 2025-08-16  5:53 ` Aaron Kling via B4 Relay
  2025-09-03  6:39   ` Mikko Perttunen
  2025-08-16  5:53 ` [PATCH 3/5] soc: tegra: fuse: speedo-tegra210: Add sku 0x8F Aaron Kling via B4 Relay
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

From: Aaron Kling <webgeek1234@gmail.com>

Existing code only sets cpu and gpu speedo ids 0 and 1. The cpu dvfs
code supports 11 ids and nouveau supports 5, which matches what the
downstream vendor kernel supports. Align the existing supported skus
with the downstream speedo list.

The Tegra210 CVB tables were added in the referenced fixes commit. Since
then, all Tegra210 SoCs have tried to scale to 1.9 GHz, when the
supported devkits are only supposed to scale to 1.5 or 1.7 GHz.
Overclocking should not be the default state.

Fixes: 2b2dbc2f94e5 ("clk: tegra: dfll: add CVB tables for Tegra210")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
 drivers/soc/tegra/fuse/speedo-tegra210.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/soc/tegra/fuse/speedo-tegra210.c b/drivers/soc/tegra/fuse/speedo-tegra210.c
index 695d0b7f9a8abe53c497155603147420cda40b63..909fdf8fcc9d9f5ac936ae322e7a73dc720e5ab6 100644
--- a/drivers/soc/tegra/fuse/speedo-tegra210.c
+++ b/drivers/soc/tegra/fuse/speedo-tegra210.c
@@ -68,18 +68,35 @@ static void __init rev_sku_to_speedo_ids(struct tegra_sku_info *sku_info,
 	switch (sku) {
 	case 0x00: /* Engineering SKU */
 	case 0x01: /* Engineering SKU */
+	case 0x13:
+		if (speedo_rev >= 2) {
+			sku_info->cpu_speedo_id = 5;
+			sku_info->gpu_speedo_id = 2;
+			break;
+		}
+
+		sku_info->gpu_speedo_id = 1;
+		break;
+
 	case 0x07:
 	case 0x17:
-	case 0x27:
-		if (speedo_rev >= 2)
-			sku_info->gpu_speedo_id = 1;
+		if (speedo_rev >= 2) {
+			sku_info->cpu_speedo_id = 7;
+			sku_info->gpu_speedo_id = 2;
+			break;
+		}
+
+		sku_info->gpu_speedo_id = 1;
 		break;
 
-	case 0x13:
-		if (speedo_rev >= 2)
-			sku_info->gpu_speedo_id = 1;
+	case 0x27:
+		if (speedo_rev >= 2) {
+			sku_info->cpu_speedo_id = 1;
+			sku_info->gpu_speedo_id = 2;
+			break;
+		}
 
-		sku_info->cpu_speedo_id = 1;
+		sku_info->gpu_speedo_id = 1;
 		break;
 
 	default:

-- 
2.50.1




* [PATCH 3/5] soc: tegra: fuse: speedo-tegra210: Add sku 0x8F
  2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids Aaron Kling via B4 Relay
@ 2025-08-16  5:53 ` Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 4/5] clk: tegra: dfll: Support limiting max clock per device Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450 Aaron Kling via B4 Relay
  4 siblings, 0 replies; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

From: Aaron Kling <webgeek1234@gmail.com>

SKU 0x8F is used by the Jetson Nano series of SoMs.

Fixes: 579db6e5d9b8 ("arm64: tegra: Enable DFLL support on Jetson Nano")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
 drivers/soc/tegra/fuse/speedo-tegra210.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/soc/tegra/fuse/speedo-tegra210.c b/drivers/soc/tegra/fuse/speedo-tegra210.c
index 909fdf8fcc9d9f5ac936ae322e7a73dc720e5ab6..1cdd70c59c0753e602709f9179c0ab67d1b8f5e3 100644
--- a/drivers/soc/tegra/fuse/speedo-tegra210.c
+++ b/drivers/soc/tegra/fuse/speedo-tegra210.c
@@ -99,6 +99,14 @@ static void __init rev_sku_to_speedo_ids(struct tegra_sku_info *sku_info,
 		sku_info->gpu_speedo_id = 1;
 		break;
 
+	case 0x8F:
+		if (speedo_rev >= 2) {
+			sku_info->cpu_speedo_id = 9;
+			sku_info->gpu_speedo_id = 2;
+			break;
+		}
+		fallthrough;
+
 	default:
 		pr_err("Tegra210: unknown SKU %#04x\n", sku);
 		/* Using the default for the error case */

-- 
2.50.1




* [PATCH 4/5] clk: tegra: dfll: Support limiting max clock per device
  2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
                   ` (2 preceding siblings ...)
  2025-08-16  5:53 ` [PATCH 3/5] soc: tegra: fuse: speedo-tegra210: Add sku 0x8F Aaron Kling via B4 Relay
@ 2025-08-16  5:53 ` Aaron Kling via B4 Relay
  2025-08-16  5:53 ` [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450 Aaron Kling via B4 Relay
  4 siblings, 0 replies; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

From: Aaron Kling <webgeek1234@gmail.com>

Some devices like the Jetson Nano report a cpu speedo value that scales
past the capabilities of the cpu. This allows limiting the maximum
scaling to a lower value within the table.

Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
 drivers/clk/tegra/clk-tegra124-dfll-fcpu.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/tegra/clk-tegra124-dfll-fcpu.c b/drivers/clk/tegra/clk-tegra124-dfll-fcpu.c
index 0251618b82c8321724ba0aec7a5bd90b2c2ffaf2..0c84f7e85baaa96fee005a1c9a5dd6afbd1875fa 100644
--- a/drivers/clk/tegra/clk-tegra124-dfll-fcpu.c
+++ b/drivers/clk/tegra/clk-tegra124-dfll-fcpu.c
@@ -556,6 +556,7 @@ static int tegra124_dfll_fcpu_probe(struct platform_device *pdev)
 	struct tegra_dfll_soc_data *soc;
 	const struct dfll_fcpu_data *fcpu_data;
 	struct rail_alignment align;
+	u32 max_freq;
 
 	fcpu_data = of_device_get_match_data(&pdev->dev);
 	if (!fcpu_data)
@@ -589,7 +590,12 @@ static int tegra124_dfll_fcpu_probe(struct platform_device *pdev)
 			return err;
 	}
 
-	soc->max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
+	if (!of_property_read_u32(pdev->dev.of_node,
+				 "nvidia,dfll-max-freq",
+				 &max_freq))
+		soc->max_freq = max_freq;
+	else
+		soc->max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
 
 	soc->cvb = tegra_cvb_add_opp_table(soc->dev, fcpu_data->cpu_cvb_tables,
 					   fcpu_data->cpu_cvb_tables_size,

-- 
2.50.1




* [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
                   ` (3 preceding siblings ...)
  2025-08-16  5:53 ` [PATCH 4/5] clk: tegra: dfll: Support limiting max clock per device Aaron Kling via B4 Relay
@ 2025-08-16  5:53 ` Aaron Kling via B4 Relay
  2025-09-03  5:50   ` Mikko Perttunen
  4 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling via B4 Relay @ 2025-08-16  5:53 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

From: Aaron Kling <webgeek1234@gmail.com>

P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
frequency.

Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
 arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts b/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
index ec0e84cb83ef9bf8f0e52e2958db33666813917c..10f878d3f50815d1f0297d15669048ab9cad73ee 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
@@ -594,6 +594,7 @@ clock@70110000 {
 		nvidia,droop-ctrl = <0x00000f00>;
 		nvidia,force-mode = <1>;
 		nvidia,sample-rate = <25000>;
+		nvidia,dfll-max-freq = <1479000000>;
 
 		nvidia,pwm-min-microvolts = <708000>;
 		nvidia,pwm-period-nanoseconds = <2500>; /* 2.5us */

-- 
2.50.1




* Re: [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency
  2025-08-16  5:53 ` [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency Aaron Kling via B4 Relay
@ 2025-08-16  8:21   ` Krzysztof Kozlowski
  2025-08-18  3:23     ` Aaron Kling
  0 siblings, 1 reply; 16+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-16  8:21 UTC
  To: webgeek1234, Michael Turquette, Stephen Boyd, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Thierry Reding,
	Jonathan Hunter, Joseph Lo, Peter De Schrijver, Prashant Gaikwad
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding

On 16/08/2025 07:53, Aaron Kling via B4 Relay wrote:
> From: Aaron Kling <webgeek1234@gmail.com>
> 
> Some devices report a cpu speedo value that corresponds to a table that
> scales beyond the chips capability. This allows devices to set a lower
> limit.
> 
> Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> ---
>  Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> index f7d347385b5775ddd702ecbb9821acfc9d4b9ff2..6cdbabc1f036a767bdc8e5df64eeff34171a3b85 100644
> --- a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> @@ -70,6 +70,9 @@ Required properties for PWM mode:
>    - dvfs_pwm_enable: I/O pad configuration when PWM control is enabled.
>    - dvfs_pwm_disable: I/O pad configuration when PWM control is disabled.
>  
> +Optional properties for limiting frequency:
> +- nvidia,dfll-max-freq: Maximum scaling frequency.


1. Frequency is in units.
2. OPP defines it already, doesn't it?
3. You need to convert file to DT schema first. No new properties are
allowed in text.



Best regards,
Krzysztof


* Re: [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency
  2025-08-16  8:21   ` Krzysztof Kozlowski
@ 2025-08-18  3:23     ` Aaron Kling
  2025-08-18  6:31       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling @ 2025-08-18  3:23 UTC
  To: Krzysztof Kozlowski
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Sat, Aug 16, 2025 at 3:21 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On 16/08/2025 07:53, Aaron Kling via B4 Relay wrote:
> > From: Aaron Kling <webgeek1234@gmail.com>
> >
> > Some devices report a cpu speedo value that corresponds to a table that
> > scales beyond the chips capability. This allows devices to set a lower
> > limit.
> >
> > Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> > ---
> >  Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> > index f7d347385b5775ddd702ecbb9821acfc9d4b9ff2..6cdbabc1f036a767bdc8e5df64eeff34171a3b85 100644
> > --- a/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> > +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra124-dfll.txt
> > @@ -70,6 +70,9 @@ Required properties for PWM mode:
> >    - dvfs_pwm_enable: I/O pad configuration when PWM control is enabled.
> >    - dvfs_pwm_disable: I/O pad configuration when PWM control is disabled.
> >
> > +Optional properties for limiting frequency:
> > +- nvidia,dfll-max-freq: Maximum scaling frequency.
>
>
> 1. Frequency is in units.
Ack, will fix in whatever form a new revision takes.

> 2. OPP defines it already, doesn't it?
The dfll driver generates the cpu opp table based on SoC SKUs; it
doesn't use dt opp tables. This property is intended to modify the
generation of said table. That said, if there's a generic dt opp
paradigm for this that I missed which works without dt opp tables, I'd
be happy to use that instead.

> 3. You need to convert file to DT schema first. No new properties are
> allowed in text.
Per an attempt to auto-convert this binding [0], there's a pending
copy already. As I don't want to duplicate existing work, I'll have to
wait on that then.

Aaron

[0] https://lore.kernel.org/all/20250630232632.3700405-1-robh@kernel.org/


* Re: [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency
  2025-08-18  3:23     ` Aaron Kling
@ 2025-08-18  6:31       ` Krzysztof Kozlowski
  0 siblings, 0 replies; 16+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-18  6:31 UTC
  To: Aaron Kling
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On 18/08/2025 05:23, Aaron Kling wrote:
>>>
>>> +Optional properties for limiting frequency:
>>> +- nvidia,dfll-max-freq: Maximum scaling frequency.
>>
>>
>> 1. Frequency is in units.
> Ack, will fix in whatever form a new revision takes.
> 
>> 2. OPP defines it already, doesn't it?
> The dfll driver generates the cpu opp table based on soc sku's, it
> doesn't use dt opp tables. This property is intended to modify the
> generation of said table. That said, if there's a generic dt opp
> paradigm for this that I missed which works without dt opp tables, I'd
> be happy to use that instead.

Usually the list of frequencies is provided via OPP. If that is not
applicable here, it should be explained briefly.

Likewise, why the same devices have different values should be explained
(the commit msg is not precise here).

Best regards,
Krzysztof


* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-08-16  5:53 ` [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450 Aaron Kling via B4 Relay
@ 2025-09-03  5:50   ` Mikko Perttunen
  2025-09-03  6:28     ` Aaron Kling
  0 siblings, 1 reply; 16+ messages in thread
From: Mikko Perttunen @ 2025-09-03  5:50 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, webgeek1234
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> From: Aaron Kling <webgeek1234@gmail.com>
> 
> P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> frequency.

Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:

static struct dvfs_therm_limits
tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
        {86, 1090},
        {0, 0},
};
(rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)

Here the throttling is set at 86C, I suppose to give some margin.

1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.

If you have other information, please do tell.

Incidentally, some of the CVB tables in the upstream kernel seem to ignore speedo (I assume they are conservative) while rel-32 has different tables. So the upstream kernel is probably running at slightly unnecessarily high voltages.

Cheers,
Mikko

> 
> Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> ---
>  arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts b/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
> index ec0e84cb83ef9bf8f0e52e2958db33666813917c..10f878d3f50815d1f0297d15669048ab9cad73ee 100644
> --- a/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
> +++ b/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts
> @@ -594,6 +594,7 @@ clock@70110000 {
>  		nvidia,droop-ctrl = <0x00000f00>;
>  		nvidia,force-mode = <1>;
>  		nvidia,sample-rate = <25000>;
> +		nvidia,dfll-max-freq = <1479000000>;
>  
>  		nvidia,pwm-min-microvolts = <708000>;
>  		nvidia,pwm-period-nanoseconds = <2500>; /* 2.5us */
> 
> 
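
The first-step idea in this reply, always limiting the maximum frequency so that it fits under the thermal cap voltage, can be sketched as follows. This is a minimal illustration and not the DFLL driver's code: `struct opp`, `max_freq_under_cap()` and `example_table` are hypothetical names, and every table row except the 1479 MHz / 1090 mV point named above is a made-up placeholder.

```c
#include <stddef.h>

/* Hypothetical operating point: frequency in kHz, voltage in mV. */
struct opp {
	unsigned long freq_khz;
	unsigned int mv;
};

/*
 * Return the highest frequency whose voltage does not exceed cap_mv,
 * or 0 if no entry fits. Entries do not need to be sorted.
 */
static unsigned long max_freq_under_cap(const struct opp *table, size_t n,
					unsigned int cap_mv)
{
	unsigned long best = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].mv <= cap_mv && table[i].freq_khz > best)
			best = table[i].freq_khz;
	}

	return best;
}

/*
 * Illustrative table. Only the 1479000 kHz @ 1090 mV row comes from
 * the discussion above; the other rows are placeholders.
 */
static const struct opp example_table[] = {
	{ 1020000,  950 },	/* placeholder */
	{ 1479000, 1090 },	/* operating point named in this thread */
	{ 1555500, 1120 },	/* placeholder */
};
```

With a cap of 1090 mV the helper selects the 1479000 kHz row. Deriving the limit from the cap voltage rather than hard-coding a frequency would let the same logic serve other boards, assuming their thermal caps are known.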


* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-09-03  5:50   ` Mikko Perttunen
@ 2025-09-03  6:28     ` Aaron Kling
  2025-09-03  7:29       ` Mikko Perttunen
  0 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling @ 2025-09-03  6:28 UTC
  To: Mikko Perttunen
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Wed, Sep 3, 2025 at 12:50 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
>
> On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> > From: Aaron Kling <webgeek1234@gmail.com>
> >
> > P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> > to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> > frequency.
>
> Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:
>
> static struct dvfs_therm_limits
> tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
>         {86, 1090},
>         {0, 0},
> };
> (rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)
>
> Here the throttling is set at 86C, I suppose to give some margin.
>
> 1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.
>
> If you have other information, please do tell.

I am basing this on the following line in the downstream porg dt repo:

nvidia,dfll-max-freq-khz = <1479000>;
(tegra-l4t-r32.7.6_good kernel-dts/tegra210-porg-p3448-common.dtsi)

Which in the downstream dfll driver limits the max frequency it will use:

        max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
        if (!of_property_read_u32(pdev->dev.of_node, "nvidia,dfll-max-freq-khz",
                                  &f))
                max_freq = min(max_freq, f * 1000UL);
(tegra-l4t-r32.7.6_good drivers/clk/tegra/clk-tegra124-dfll-fcpu.c)

If I read the commit history correctly, it does appear that this limit
was set because the always-on use case was failing thermal tests. I
couldn't say if it was intentional that this throttling was applied to
all use cases or not, but that is what appears to have happened. Hence
trying to replicate here in an effort to squash stability issues.

> Incidentally, some of the CVB tables in the upstream kernel seem to ignore speedo (I assume they are conservative) while rel-32 has different tables. So the upstream kernel is probably running at slightly unnecessarily high voltages.

This is worrying as well, though most of those tables cannot currently
be used as the fuse driver never assigns those cpu speedo ids. All I
checked in this series was that the correct cpu speedo id was picked
and the appropriate CVB table was applied to p2371-2180, p3450-0000,
and p3541-0000. I haven't yet researched what the speedo values mean
and do. There are many other SKUs missing as well, such as the ones
used by the Shield TVs. I have as of yet been unable to boot to
userspace on p2571-0930/1 or p2894-0050, so I haven't determined which
sku(s) are used by those to add them here. I'm in the process of
getting uart access to continue that endeavour.

Aaron
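
The two behaviours discussed in this thread differ in one detail worth spelling out: patch 4/5 as posted lets the DT value replace the table value, while the downstream snippet quoted above only ever lowers it and takes the value in kHz. A minimal side-by-side sketch, assuming hypothetical helper names (`max_freq_override()`, `max_freq_clamp()`) and plain arguments in place of the real `of_property_read_u32()` lookup:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Behaviour of patch 4/5 as posted: if the property is present, its
 * value (in Hz) replaces the per-speedo table value, even if larger.
 */
static unsigned long max_freq_override(unsigned long table_max_hz,
				       bool has_prop, uint32_t prop_hz)
{
	return has_prop ? prop_hz : table_max_hz;
}

/*
 * Behaviour of the downstream snippet quoted above: the property is in
 * kHz and can only lower the limit taken from the table.
 */
static unsigned long max_freq_clamp(unsigned long table_max_hz,
				    bool has_prop, uint32_t prop_khz)
{
	unsigned long max_freq = table_max_hz;

	if (has_prop) {
		unsigned long limit_hz = prop_khz * 1000UL;

		if (limit_hz < max_freq)
			max_freq = limit_hz;
	}

	return max_freq;
}
```

For a property value below the table maximum both helpers agree. They diverge only when the DT value is higher than the table entry, where the clamping variant keeps the table value.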


* Re: [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids
  2025-08-16  5:53 ` [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids Aaron Kling via B4 Relay
@ 2025-09-03  6:39   ` Mikko Perttunen
  0 siblings, 0 replies; 16+ messages in thread
From: Mikko Perttunen @ 2025-09-03  6:39 UTC
  To: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, webgeek1234
  Cc: linux-clk, devicetree, linux-tegra, linux-kernel, Thierry Reding,
	Aaron Kling

On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> From: Aaron Kling <webgeek1234@gmail.com>
> 
> Existing code only sets cpu and gpu speedo ids 0 and 1. The cpu dvfs
> code supports 11 ids and nouveau supports 5. This aligns with what the
> downstream vendor kernel supports. Align the existing supported skus
> with the downstream speedo list.
> 
> The Tegra210 CVB tables were added in the referenced fixes commit. Since
> then, all Tegra210 socs have tried to scale to 1.9 GHz, when the
> supported devkits are only supposed to scale to 1.5 or 1.7 GHZ.
> Overclocking should not be the default state.
> 
> Fixes: 2b2dbc2f94e5 ("clk: tegra: dfll: add CVB tables for Tegra210")
> Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> ---
>  drivers/soc/tegra/fuse/speedo-tegra210.c | 31 ++++++++++++++++++++++++-------
>  1 file changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/soc/tegra/fuse/speedo-tegra210.c b/drivers/soc/tegra/fuse/speedo-tegra210.c
> index 695d0b7f9a8abe53c497155603147420cda40b63..909fdf8fcc9d9f5ac936ae322e7a73dc720e5ab6 100644
> --- a/drivers/soc/tegra/fuse/speedo-tegra210.c
> +++ b/drivers/soc/tegra/fuse/speedo-tegra210.c
> @@ -68,18 +68,35 @@ static void __init rev_sku_to_speedo_ids(struct tegra_sku_info *sku_info,
>  	switch (sku) {
>  	case 0x00: /* Engineering SKU */
>  	case 0x01: /* Engineering SKU */
> +	case 0x13:
> +		if (speedo_rev >= 2) {
> +			sku_info->cpu_speedo_id = 5;
> +			sku_info->gpu_speedo_id = 2;
> +			break;
> +		}
> +
> +		sku_info->gpu_speedo_id = 1;
> +		break;
> +
>  	case 0x07:
>  	case 0x17:
> -	case 0x27:
> -		if (speedo_rev >= 2)
> -			sku_info->gpu_speedo_id = 1;
> +		if (speedo_rev >= 2) {
> +			sku_info->cpu_speedo_id = 7;
> +			sku_info->gpu_speedo_id = 2;
> +			break;
> +		}
> +
> +		sku_info->gpu_speedo_id = 1;
>  		break;
>  
> -	case 0x13:
> -		if (speedo_rev >= 2)
> -			sku_info->gpu_speedo_id = 1;
> +	case 0x27:
> +		if (speedo_rev >= 2) {
> +			sku_info->cpu_speedo_id = 1;
> +			sku_info->gpu_speedo_id = 2;
> +			break;
> +		}
>  
> -		sku_info->cpu_speedo_id = 1;
> +		sku_info->gpu_speedo_id = 1;
>  		break;
>  
>  	default:
> 
> 

This is getting unwieldy, so I think it'd be a good idea to restructure this to be more readable. Revision 1 chips are simple, so perhaps handle them separately, something like:

if (speedo_rev >= 2) {
	switch (sku) {
	...
	}
} else if (sku == 0x00 || sku == 0x01 || sku == 0x07 || sku == 0x13 || sku == 0x17) {
	sku_info->gpu_speedo_id = 1;
} else {
	pr_err("Tegra210: unknown SKU %#04x\n", sku);
}

We can also add the rest of the SKUs from downstream. By my count, this patch is missing:

	0x1F (same as 0x07, 0x17)
	0x83 (cpu=3, gpu=3)
	0x87 (cpu=2, gpu=1)

(all speedo_rev >= 2). Plus the 0x8F you're adding in the next patch. I think that can be merged into this patch as well.

Thanks!
Mikko
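
The restructuring suggested above can be sketched as a standalone function. This is an illustration under stated assumptions, not the proposed kernel code: `struct sku_ids` and `sku_to_speedo_ids()` are hypothetical stand-ins for `struct tegra_sku_info` and `rev_sku_to_speedo_ids()`, `fprintf()` stands in for `pr_err()`, and the id values are the ones given in patches 2/5 and 3/5 plus the three SKUs listed in this reply. SKU 0x27 is left out of the revision 1 list exactly as in the suggestion, although the posted patch does handle it there.

```c
#include <stdio.h>

/* Hypothetical stand-in for the speedo id fields of struct tegra_sku_info. */
struct sku_ids {
	int cpu_speedo_id;
	int gpu_speedo_id;
};

static void set_ids(struct sku_ids *ids, int cpu, int gpu)
{
	ids->cpu_speedo_id = cpu;
	ids->gpu_speedo_id = gpu;
}

/* Returns 0 for a known SKU, -1 (with ids left at 0/0) otherwise. */
static int sku_to_speedo_ids(struct sku_ids *ids, unsigned int sku,
			     int speedo_rev)
{
	set_ids(ids, 0, 0);	/* defaults, also used for the error case */

	if (speedo_rev >= 2) {
		switch (sku) {
		case 0x00: /* Engineering SKU */
		case 0x01: /* Engineering SKU */
		case 0x13:
			set_ids(ids, 5, 2);
			return 0;
		case 0x07:
		case 0x17:
		case 0x1F:
			set_ids(ids, 7, 2);
			return 0;
		case 0x27:
			set_ids(ids, 1, 2);
			return 0;
		case 0x83:
			set_ids(ids, 3, 3);
			return 0;
		case 0x87:
			set_ids(ids, 2, 1);
			return 0;
		case 0x8F:
			set_ids(ids, 9, 2);
			return 0;
		default:
			break;
		}
	} else {
		switch (sku) {
		case 0x00:
		case 0x01:
		case 0x07:
		case 0x13:
		case 0x17:
			set_ids(ids, 0, 1);	/* revision 1: only the gpu id changes */
			return 0;
		default:
			break;
		}
	}

	fprintf(stderr, "Tegra210: unknown SKU %#04x\n", sku);
	return -1;
}
```

Splitting on the revision first keeps each SKU case to a single assignment, which is the readability gain the suggestion is after.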





* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-09-03  6:28     ` Aaron Kling
@ 2025-09-03  7:29       ` Mikko Perttunen
  2025-09-03  8:01         ` Aaron Kling
  0 siblings, 1 reply; 16+ messages in thread
From: Mikko Perttunen @ 2025-09-03  7:29 UTC
  To: Aaron Kling
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Wednesday, September 3, 2025 3:28 PM Aaron Kling wrote:
> On Wed, Sep 3, 2025 at 12:50 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> >
> > On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> > > From: Aaron Kling <webgeek1234@gmail.com>
> > >
> > > P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> > > to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> > > frequency.
> >
> > Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:
> >
> > static struct dvfs_therm_limits
> > tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
> >         {86, 1090},
> >         {0, 0},
> > };
> > (rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)
> >
> > Here the throttling is set at 86C, I suppose to give some margin.
> >
> > 1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.
> >
> > If you have other information, please do tell.
> 
> I am basing on this line in the downstream porg dt repo:
> 
> nvidia,dfll-max-freq-khz = <1479000>;
> (tegra-l4t-r32.7.6_good kernel-dts/tegra210-porg-p3448-common.dtsi)
> 
> Which in the downstream dfll driver limits the max frequency it will use:
> 
>         max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
>         if (!of_property_read_u32(pdev->dev.of_node, "nvidia,dfll-max-freq-khz",
>                                   &f))
>                 max_freq = min(max_freq, f * 1000UL);
> (tegra-l4t-r32.7.6_good drivers/clk/tegra/clk-tegra124-dfll-fcpu.c)
> 
> If I read the commit history correctly, it does appear that this limit
> was set because the always-on use case was failing thermal tests. I
> couldn't say if it was intentional that this throttling was applied to
> all use cases or not, but that is what appears to have happened. Hence
> trying to replicate here in an effort to squash stability issues.

I can't see any reference to failing thermal tests. Can you point to the commit?

I looked into why this was added for porg -- it does not seem to be related to reliability, but more so consistency of performance. I don't think that's a huge concern for upstream -- though in any case we should be capping the frequency in the DFLL driver for now since we don't support dynamic thermal capping.
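
To illustrate the static capping idea above, here is a minimal sketch in C. It is not the upstream DFLL code: it just picks the highest operating point whose voltage fits under a fixed thermal cap. The names `struct opp`, `max_freq_under_cap()` and `example_table` are hypothetical, and every table entry except the 1.479 GHz / 1090 mV pair mentioned in this thread is an invented placeholder.

```c
#include <stddef.h>

/* Hypothetical operating point: frequency in Hz, required voltage in mV. */
struct opp {
	unsigned long freq_hz;
	int millivolts;
};

/*
 * Return the highest frequency in the table whose voltage does not exceed
 * cap_mv, or 0 if no entry fits. The table need not be sorted.
 */
static unsigned long max_freq_under_cap(const struct opp *table, size_t n,
					int cap_mv)
{
	unsigned long best = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].millivolts <= cap_mv && table[i].freq_hz > best)
			best = table[i].freq_hz;
	}
	return best;
}

/*
 * Illustrative table. Only the 1479 MHz / 1090 mV pair comes from the
 * discussion; the other entries are made-up placeholders.
 */
static const struct opp example_table[] = {
	{ 1224000000UL, 1000 },
	{ 1479000000UL, 1090 },
	{ 1555500000UL, 1120 },
};
```

With this placeholder table, a 1090 mV cap selects the 1479000000 Hz entry, which is the behaviour a driver-side cap would aim for while temperature-dependent DVFS is unavailable.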

> 
> > Incidentally, some of the CVB tables in the upstream kernel seem to ignore speedo (I assume they are conservative) while rel-32 has different tables. So the upstream kernel is probably running at slightly unnecessarily high voltages.
> 
> This is worrying as well, though most of those tables cannot currently
> be used as the fuse driver never assigns those cpu speedo ids. All I
> checked in this series was that the correct cpu speedo id was picked
> and the appropriate CVB table was applied to p2371-2180, p3450-0000,
> and p3541-0000. I haven't yet researched what the speedo values mean
> and do. There are many other SKUs missing as well, such as the ones
> used by the Shield TVs. I have as of yet been unable to boot to
> userspace on p2571-0930/1 or p2894-0050, so I haven't determined which
> sku(s) are used by those to add them here. I'm in the process of
> getting uart access to continue that endeavour.

The speedo values are coefficients used to calculate voltage requirements for frequency operating points. Usually that kind of stuff starts with conservative constant values that are then refined, so my assumption is that the fixed values we currently have are safe but suboptimal (at least when they get used).
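
As a rough illustration of how a per-chip speedo reading can feed into a voltage requirement, the sketch below evaluates a quadratic in the speedo value, similar in spirit to a CVB table entry. It is not the kernel's implementation: `struct cvb_coeffs`, `cvb_millivolts()`, the scale factors and any coefficients passed in are assumptions made up for the example.

```c
/* Hypothetical per-frequency coefficients, pre-multiplied by v_scale. */
struct cvb_coeffs {
	long c0;
	long c1;
	long c2;
};

/* Divide rounding to nearest; d must be positive. */
static long div_round_closest(long n, long d)
{
	return n >= 0 ? (n + d / 2) / d : -((-n + d / 2) / d);
}

/*
 * Evaluate c0 + c1 * s + c2 * s^2 with s = speedo / s_scale, then scale
 * the result down by v_scale to get millivolts.
 */
static long cvb_millivolts(long speedo, long s_scale, long v_scale,
			   const struct cvb_coeffs *c)
{
	long mv;

	mv = div_round_closest(c->c2 * speedo, s_scale);
	mv = div_round_closest((mv + c->c1) * speedo, s_scale) + c->c0;
	return div_round_closest(mv, v_scale);
}
```

A constant, conservative table entry corresponds to c1 = c2 = 0 in this sketch, in which case the speedo reading has no effect on the computed voltage.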

> 
> Aaron






* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-09-03  7:29       ` Mikko Perttunen
@ 2025-09-03  8:01         ` Aaron Kling
  2025-09-04  0:55           ` Mikko Perttunen
  0 siblings, 1 reply; 16+ messages in thread
From: Aaron Kling @ 2025-09-03  8:01 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Wed, Sep 3, 2025 at 2:29 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
>
> On Wednesday, September 3, 2025 3:28 PM Aaron Kling wrote:
> > On Wed, Sep 3, 2025 at 12:50 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> > >
> > > On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> > > > From: Aaron Kling <webgeek1234@gmail.com>
> > > >
> > > > P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> > > > to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> > > > frequency.
> > >
> > > Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:
> > >
> > > static struct dvfs_therm_limits
> > > tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
> > >         {86, 1090},
> > >         {0, 0},
> > > };
> > > (rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)
> > >
> > > Here the throttling is set at 86C, I suppose to give some margin.
> > >
> > > 1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.
> > >
> > > If you have other information, please do tell.
> >
> > I am basing on this line in the downstream porg dt repo:
> >
> > nvidia,dfll-max-freq-khz = <1479000>;
> > (tegra-l4t-r32.7.6_good kernel-dts/tegra210-porg-p3448-common.dtsi)
> >
> > Which in the downstream dfll driver limits the max frequency it will use:
> >
> >         max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
> >         if (!of_property_read_u32(pdev->dev.of_node, "nvidia,dfll-max-freq-khz",
> >                                   &f))
> >                 max_freq = min(max_freq, f * 1000UL);
> > (tegra-l4t-r32.7.6_good drivers/clk/tegra/clk-tegra124-dfll-fcpu.c)
> >
> > If I read the commit history correctly, it does appear that this limit
> > was set because the always-on use case was failing thermal tests. I
> > couldn't say if it was intentional that this throttling was applied to
> > all use cases or not, but that is what appears to have happened. Hence
> > trying to replicate here in an effort to squash stability issues.
>
> I can't see any reference to failing thermal tests. Can you point to the commit?

In the porg dt repo, commit hash d1326f08, which adds the
nvidia,dfll-max-freq-khz property, the message body states: "Set
CPU/GPU Fmax limit for 24x7 105C UCM." I read that to mean that the
24x7 always-on use case model was failing to stay under 105C unless
the cpu and gpu frequencies were limited. Is that an incorrect
reading? 105C is kind of a crazy number anyways, beyond the soctherm
critical shutdown temperature.

> I looked into why this was added for porg -- it does not seem to be related to reliability, but more so consistency of performance. I don't think that's a huge concern for upstream -- though in any case we should be capping the frequency in the DFLL driver for now since we don't support dynamic thermal capping.

So the whole conversation winds around to: The change is valid, but
the commit message needs better justification?

As a side note: I'm still chasing multiple stability issues on various
t210 devices. Though, the only one I've seen on p3450/p3541 is that
nouveau intermittently fails to init the gpu. Just hangs on probe and
eventually something times out, stack traces, and causes a panic
reboot. Seems to be about a 50/50 chance for me, but works fine if
probe succeeds. For another dev, it only works once in a blue moon,
but still dies shortly thereafter even if probe works. I thought it
might be related to the cpu/gpu getting 'overclocked'. But even after
this series, the problem persists. So maybe me calling this underclock
a stability fix is inaccurate. But stability issues still exist.

Aaron


* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-09-03  8:01         ` Aaron Kling
@ 2025-09-04  0:55           ` Mikko Perttunen
  2025-09-04  1:55             ` Aaron Kling
  0 siblings, 1 reply; 16+ messages in thread
From: Mikko Perttunen @ 2025-09-04  0:55 UTC (permalink / raw)
  To: Aaron Kling
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Wednesday, September 3, 2025 5:01 PM Aaron Kling wrote:
> On Wed, Sep 3, 2025 at 2:29 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> >
> > On Wednesday, September 3, 2025 3:28 PM Aaron Kling wrote:
> > > On Wed, Sep 3, 2025 at 12:50 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> > > >
> > > > On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> > > > > From: Aaron Kling <webgeek1234@gmail.com>
> > > > >
> > > > > P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> > > > > to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> > > > > frequency.
> > > >
> > > > Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:
> > > >
> > > > static struct dvfs_therm_limits
> > > > tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
> > > >         {86, 1090},
> > > >         {0, 0},
> > > > };
> > > > (rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)
> > > >
> > > > Here the throttling is set at 86C, I suppose to give some margin.
> > > >
> > > > 1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.
> > > >
> > > > If you have other information, please do tell.
> > >
> > > I am basing on this line in the downstream porg dt repo:
> > >
> > > nvidia,dfll-max-freq-khz = <1479000>;
> > > (tegra-l4t-r32.7.6_good kernel-dts/tegra210-porg-p3448-common.dtsi)
> > >
> > > Which in the downstream dfll driver limits the max frequency it will use:
> > >
> > >         max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
> > >         if (!of_property_read_u32(pdev->dev.of_node, "nvidia,dfll-max-freq-khz",
> > >                                   &f))
> > >                 max_freq = min(max_freq, f * 1000UL);
> > > (tegra-l4t-r32.7.6_good drivers/clk/tegra/clk-tegra124-dfll-fcpu.c)
> > >
> > > If I read the commit history correctly, it does appear that this limit
> > > was set because the always-on use case was failing thermal tests. I
> > > couldn't say if it was intentional that this throttling was applied to
> > > all use cases or not, but that is what appears to have happened. Hence
> > > trying to replicate here in an effort to squash stability issues.
> >
> > I can't see any reference to failing thermal tests. Can you point to the commit?
> 
> In the porg dt repo, commit hash d1326f08, which adds the
> nvidia,dfll-max-freq-khz property, the message body states: "Set
> CPU/GPU Fmax limit for 24x7 105C UCM." I read that to mean that the
> 24x7 always-on use case model was failing to stay under 105C unless
> the cpu and gpu frequencies were limited. Is that an incorrect
> reading? 105C is kind of a crazy number anyways, beyond the soctherm
> critical shutdown temperature.

What that's (trying) to say is that it sets the CPU's Fmax to the limit specified by the 24x7 105C UCM profile, which is the 1090mV i.e. 1.4GHz limit. The profile is called that because it's normally used for the 90C-105C temperature range.

> 
> > I looked into why this was added for porg -- it does not seem to be related to reliability, but more so consistency of performance. I don't think that's a huge concern for upstream -- though in any case we should be capping the frequency in the DFLL driver for now since we don't support dynamic thermal capping.
> 
> So the whole conversation winds around to: The change is valid, but
> the commit message needs better justification?

In my opinion, there is no need to add the device tree property in upstream. The CPU is designed to work at 1.5GHz under 90C, and 1.4GHz between 90C and 105C. I think this is a bit of a downstream-ism and not something we should add in upstream. If the user wants to underclock, then that should be through the cpufreq governor or such mechanism.

> 
> As a side note: I'm still chasing multiple stability issues on various
> t210 devices. Though, the only one I've seen on p3450/p3541 is that
> nouveau intermittently fails to init the gpu. Just hangs on probe and
> eventually something times out, stack traces, and causes a panic
> reboot. Seems to be about a 50/50 chance for me, but works fine if
> probe succeeds. For another dev, it only works once in a blue moon,
> but still dies shortly thereafter even if probe works. I thought it
> might be related to the cpu/gpu getting 'overclocked'. But even after
> this series, the problem persists. So maybe me calling this underclock
> a stability fix is inaccurate. But stability issues still exist.

Good to know. It doesn't strike me as a CPU issue -- the first place I'd look is nouveau's init code itself, to see what is failing. There are a lot of potential software issues that can cause intermittent failures during GPU boot. If it is power related, I'd look at the GPU or SOC rail.

Thanks,
Mikko

> 
> Aaron






* Re: [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450
  2025-09-04  0:55           ` Mikko Perttunen
@ 2025-09-04  1:55             ` Aaron Kling
  0 siblings, 0 replies; 16+ messages in thread
From: Aaron Kling @ 2025-09-04  1:55 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Michael Turquette, Stephen Boyd, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Thierry Reding, Jonathan Hunter, Joseph Lo,
	Peter De Schrijver, Prashant Gaikwad, linux-clk, devicetree,
	linux-tegra, linux-kernel, Thierry Reding

On Wed, Sep 3, 2025 at 7:56 PM Mikko Perttunen <mperttunen@nvidia.com> wrote:
>
> On Wednesday, September 3, 2025 5:01 PM Aaron Kling wrote:
> > On Wed, Sep 3, 2025 at 2:29 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> > >
> > > On Wednesday, September 3, 2025 3:28 PM Aaron Kling wrote:
> > > > On Wed, Sep 3, 2025 at 12:50 AM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> > > > >
> > > > > On Saturday, August 16, 2025 2:53 PM Aaron Kling via B4 Relay wrote:
> > > > > > From: Aaron Kling <webgeek1234@gmail.com>
> > > > > >
> > > > > > P3450's cpu is only rated for 1.4 GHz while the CVB table it uses tries
> > > > > > to scale to 1.5 GHz. Set an appropriate limit on the maximum scaling
> > > > > > frequency.
> > > > >
> > > > > Looking at downstream, from what I can tell, the CPU's maximum frequency is indeed 1.55GHz under normal conditions. However, at temperatures over 90C, its voltage is limited to 1090mV. Reference:
> > > > >
> > > > > static struct dvfs_therm_limits
> > > > > tegra210_core_therm_caps_ucm2[MAX_THERMAL_LIMITS] = {
> > > > >         {86, 1090},
> > > > >         {0, 0},
> > > > > };
> > > > > (rel-32 kernel-4.9/drivers/soc/tegra/tegra210-dvfs.c)
> > > > >
> > > > > Here the throttling is set at 86C, I suppose to give some margin.
> > > > >
> > > > > 1090mV perfectly matches the 1.479GHz operating point defined in the upstream kernel. So it seems to me that rather than setting a maximum frequency, we would need temperature dependent DVFS. Or, at least as a first step, we could have the driver just always limit the maximum frequency so it fits under the thermal cap voltage -- the temperature limit is rather high, after all.
> > > > >
> > > > > If you have other information, please do tell.
> > > >
> > > > I am basing on this line in the downstream porg dt repo:
> > > >
> > > > nvidia,dfll-max-freq-khz = <1479000>;
> > > > (tegra-l4t-r32.7.6_good kernel-dts/tegra210-porg-p3448-common.dtsi)
> > > >
> > > > Which in the downstream dfll driver limits the max frequency it will use:
> > > >
> > > >         max_freq = fcpu_data->cpu_max_freq_table[speedo_id];
> > > >         if (!of_property_read_u32(pdev->dev.of_node, "nvidia,dfll-max-freq-khz",
> > > >                                   &f))
> > > >                 max_freq = min(max_freq, f * 1000UL);
> > > > (tegra-l4t-r32.7.6_good drivers/clk/tegra/clk-tegra124-dfll-fcpu.c)
> > > >
> > > > If I read the commit history correctly, it does appear that this limit
> > > > was set because the always-on use case was failing thermal tests. I
> > > > couldn't say if it was intentional that this throttling was applied to
> > > > all use cases or not, but that is what appears to have happened. Hence
> > > > trying to replicate here in an effort to squash stability issues.
> > >
> > > I can't see any reference to failing thermal tests. Can you point to the commit?
> >
> > In the porg dt repo, commit hash d1326f08, which adds the
> > nvidia,dfll-max-freq-khz property, the message body states: "Set
> > CPU/GPU Fmax limit for 24x7 105C UCM." I read that to mean that the
> > 24x7 always-on use case model was failing to stay under 105C unless
> > the cpu and gpu frequencies were limited. Is that an incorrect
> > reading? 105C is kind of a crazy number anyways, beyond the soctherm
> > critical shutdown temperature.
>
> What that's (trying) to say is that it sets the CPU's Fmax to the limit specified by the 24x7 105C UCM profile, which is the 1090mV i.e. 1.4GHz limit. The profile is called that because it's normally used for the 90C-105C temperature range.
>
> >
> > > I looked into why this was added for porg -- it does not seem to be related to reliability, but more so consistency of performance. I don't think that's a huge concern for upstream -- though in any case we should be capping the frequency in the DFLL driver for now since we don't support dynamic thermal capping.
> >
> > So the whole conversation winds around to: The change is valid, but
> > the commit message needs better justification?
>
> In my opinion, there is no need to add the device tree property in upstream. The CPU is designed to work at 1.5GHz under 90C, and 1.4GHz between 90C and 105C. I think this is a bit of a downstream-ism and not something we should add in upstream. If the user wants to underclock, then that should be through the cpufreq governor or such mechanism.

Ah, alright. I'll drop property handling and send a new revision,
which reduces this to only the speedo-tegra210 patch.

Aaron


Thread overview: 16+ messages
2025-08-16  5:53 [PATCH 0/5] Properly Limit Tegra210 Clock Rates Aaron Kling via B4 Relay
2025-08-16  5:53 ` [PATCH 1/5] dt-bindings: clock: tegra124-dfll: Add property to limit frequency Aaron Kling via B4 Relay
2025-08-16  8:21   ` Krzysztof Kozlowski
2025-08-18  3:23     ` Aaron Kling
2025-08-18  6:31       ` Krzysztof Kozlowski
2025-08-16  5:53 ` [PATCH 2/5] soc: tegra: fuse: speedo-tegra210: Update speedo ids Aaron Kling via B4 Relay
2025-09-03  6:39   ` Mikko Perttunen
2025-08-16  5:53 ` [PATCH 3/5] soc: tegra: fuse: speedo-tegra210: Add sku 0x8F Aaron Kling via B4 Relay
2025-08-16  5:53 ` [PATCH 4/5] clk: tegra: dfll: Support limiting max clock per device Aaron Kling via B4 Relay
2025-08-16  5:53 ` [PATCH 5/5] arm64: tegra: Limit max cpu frequency on P3450 Aaron Kling via B4 Relay
2025-09-03  5:50   ` Mikko Perttunen
2025-09-03  6:28     ` Aaron Kling
2025-09-03  7:29       ` Mikko Perttunen
2025-09-03  8:01         ` Aaron Kling
2025-09-04  0:55           ` Mikko Perttunen
2025-09-04  1:55             ` Aaron Kling
