devicetree.vger.kernel.org archive mirror
* [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
@ 2025-06-25 10:01 Marek Vasut
  2025-06-26 21:41 ` Niklas Söderlund
  2025-08-06  9:35 ` Geert Uytterhoeven
  0 siblings, 2 replies; 9+ messages in thread
From: Marek Vasut @ 2025-06-25 10:01 UTC
  To: linux-arm-kernel
  Cc: Marek Vasut, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Niklas Söderlund,
	Rob Herring, devicetree, linux-renesas-soc

Since the Sparrow Hawk has a smaller PCB than the White Hawk, it tends
to generate more heat. To prevent potential damage to the board, adjust
the temperature trip points.

Add four "passive" trip points which increasingly throttle the CPU to
prevent overheating. The first trip point at 68°C disables the 1.8 GHz
and 1.7 GHz modes and limits the CPU to 1.5 GHz frequency. The second
trip point at 72°C disables the 1.5 GHz mode and limits the CPU to 1.0
GHz frequency. The third trip point at 76°C uses thermal-idle to start
inserting idle cycles into the CPU instruction stream to cool the CPU
cores down. The fourth and last trip point at 80°C disables the 1.0 GHz
mode and limits the CPU to 500 MHz frequency.

If the SoC heats up further and the reading of any of the thermal
sensors exceeds 100°C, a thermal shutdown is triggered to prevent
damage to the hardware.

Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
---
Cc: Conor Dooley <conor+dt@kernel.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
Cc: Rob Herring <robh@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: linux-renesas-soc@vger.kernel.org
---
 .../dts/renesas/r8a779g3-sparrow-hawk.dts     | 137 ++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
index 9ba23129e65e..ba81df3c779d 100644
--- a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
+++ b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
@@ -38,6 +38,7 @@
 
 /dts-v1/;
 #include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/thermal/thermal.h>
 
 #include "r8a779g3.dtsi"
 
@@ -797,3 +798,139 @@ &rwdt {
 &scif_clk {	/* X12 */
 	clock-frequency = <24000000>;
 };
+
+/* thermal-idle cooling for all SoC cores */
+&a76_0 {
+	#cooling-cells = <2>;
+
+	a76_0_thermal_idle: thermal-idle {
+		#cooling-cells = <2>;
+		duration-us = <10000>;
+		exit-latency-us = <500>;
+	};
+};
+
+&a76_1 {
+	a76_1_thermal_idle: thermal-idle {
+		#cooling-cells = <2>;
+		duration-us = <10000>;
+		exit-latency-us = <500>;
+	};
+};
+
+&a76_2 {
+	a76_2_thermal_idle: thermal-idle {
+		#cooling-cells = <2>;
+		duration-us = <10000>;
+		exit-latency-us = <500>;
+	};
+};
+
+&a76_3 {
+	a76_3_thermal_idle: thermal-idle {
+		#cooling-cells = <2>;
+		duration-us = <10000>;
+		exit-latency-us = <500>;
+	};
+};
+
+/* THS sensors in SoC, critical temperature trip point is 100C */
+&sensor1_crit {
+	temperature = <100000>;
+};
+
+&sensor2_crit {
+	temperature = <100000>;
+};
+
+&sensor3_crit {
+	temperature = <100000>;
+};
+
+&sensor4_crit {
+	temperature = <100000>;
+};
+
+&sensor_thermal_cr52 {
+	critical-action = "shutdown";
+};
+
+&sensor_thermal_cnn {
+	critical-action = "shutdown";
+};
+
+/* THS sensor in SoC near CA76 cores does more progressive cooling. */
+&sensor_thermal_ca76 {
+	critical-action = "shutdown";
+
+	cooling-maps {
+		/*
+		 * The cooling-device minimum and maximum parameters inversely
+		 * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
+		 * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
+		 * This is because they refer to cooling levels, where maximum
+		 * cooling level happens at 500 MHz OPP, when the CPU core is
+		 * running slowly and therefore generates least heat.
+		 */
+		map0 {
+			/* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
+			trip = <&sensor3_passive_low>;
+			cooling-device = <&a76_0 2 4>;
+			contribution = <128>;
+		};
+
+		map1 {
+			/* At 72C, inhibit 1.5 GHz mode */
+			trip = <&sensor3_passive_mid>;
+			cooling-device = <&a76_0 3 4>;
+			contribution = <256>;
+		};
+
+		map2 {
+			/* At 76C, start injecting idle states */
+			trip = <&sensor3_passive_hi>;
+			cooling-device = <&a76_0_thermal_idle 0 80>,
+					 <&a76_1_thermal_idle 0 80>,
+					 <&a76_2_thermal_idle 0 80>,
+					 <&a76_3_thermal_idle 0 80>;
+			contribution = <512>;
+		};
+
+		map3 {
+			/* At 80C, inhibit 1.0 GHz mode */
+			trip = <&sensor3_passive_crit>;
+			cooling-device = <&a76_0 4 4>;
+			contribution = <1024>;
+		};
+	};
+
+	trips {
+		sensor3_passive_low: sensor3-passive-low {
+			temperature = <68000>;
+			hysteresis = <2000>;
+			type = "passive";
+		};
+
+		sensor3_passive_mid: sensor3-passive-mid {
+			temperature = <72000>;
+			hysteresis = <2000>;
+			type = "passive";
+		};
+
+		sensor3_passive_hi: sensor3-passive-hi {
+			temperature = <76000>;
+			hysteresis = <2000>;
+			type = "passive";
+		};
+
+		sensor3_passive_crit: sensor3-passive-crit {
+			temperature = <80000>;
+			hysteresis = <2000>;
+			type = "passive";
+		};
+	};
+};
+
+&sensor_thermal_ddr1 {
+	critical-action = "shutdown";
+};
-- 
2.47.2



* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-06-25 10:01 [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk Marek Vasut
@ 2025-06-26 21:41 ` Niklas Söderlund
  2025-06-29 22:32   ` Marek Vasut
  2025-08-06  9:35 ` Geert Uytterhoeven
  1 sibling, 1 reply; 9+ messages in thread
From: Niklas Söderlund @ 2025-06-26 21:41 UTC
  To: Marek Vasut
  Cc: linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Rob Herring, devicetree,
	linux-renesas-soc

Hi Marek,

Thanks for your work.

On 2025-06-25 12:01:56 +0200, Marek Vasut wrote:
> Since the Sparrow Hawk has a smaller PCB than the White Hawk, it tends
> to generate more heat. To prevent potential damage to the board, adjust
> the temperature trip points.
> 
> Add four "passive" trip points which increasingly throttle the CPU to
> prevent overheating. The first trip point at 68°C disables the 1.8 GHz
> and 1.7 GHz modes and limits the CPU to 1.5 GHz frequency. The second
> trip point at 72°C disables the 1.5 GHz mode and limits the CPU to 1.0
> GHz frequency. The third trip point at 76°C uses thermal-idle to start
> inserting idle cycles into the CPU instruction stream to cool the CPU
> cores down. The fourth and last trip point at 80°C disables the 1.0 GHz
> mode and limits the CPU to 500 MHz frequency.
> 
> In case the SoC heats up further, in case either of the thermal sensors
> readings passes the 100°C, a thermal shutdown is triggered to prevent
> any damage to the hardware.
> 
> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
> ---
> Cc: Conor Dooley <conor+dt@kernel.org>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
> Cc: Magnus Damm <magnus.damm@gmail.com>
> Cc: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
> Cc: Rob Herring <robh@kernel.org>
> Cc: devicetree@vger.kernel.org
> Cc: linux-renesas-soc@vger.kernel.org
> ---
>  .../dts/renesas/r8a779g3-sparrow-hawk.dts     | 137 ++++++++++++++++++
>  1 file changed, 137 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
> index 9ba23129e65e..ba81df3c779d 100644
> --- a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
> +++ b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
> @@ -38,6 +38,7 @@
>  
>  /dts-v1/;
>  #include <dt-bindings/gpio/gpio.h>
> +#include <dt-bindings/thermal/thermal.h>
>  
>  #include "r8a779g3.dtsi"
>  
> @@ -797,3 +798,139 @@ &rwdt {
>  &scif_clk {	/* X12 */
>  	clock-frequency = <24000000>;
>  };
> +
> +/* thermal-idle cooling for all SoC cores */
> +&a76_0 {
> +	#cooling-cells = <2>;
> +
> +	a76_0_thermal_idle: thermal-idle {
> +		#cooling-cells = <2>;
> +		duration-us = <10000>;
> +		exit-latency-us = <500>;
> +	};
> +};
> +
> +&a76_1 {
> +	a76_1_thermal_idle: thermal-idle {
> +		#cooling-cells = <2>;
> +		duration-us = <10000>;
> +		exit-latency-us = <500>;
> +	};
> +};
> +
> +&a76_2 {
> +	a76_2_thermal_idle: thermal-idle {
> +		#cooling-cells = <2>;
> +		duration-us = <10000>;
> +		exit-latency-us = <500>;
> +	};
> +};
> +
> +&a76_3 {
> +	a76_3_thermal_idle: thermal-idle {
> +		#cooling-cells = <2>;
> +		duration-us = <10000>;
> +		exit-latency-us = <500>;
> +	};
> +};

I did not know you could do this and use it as a cooling device, thanks 
for teaching me something new!

> +
> +/* THS sensors in SoC, critical temperature trip point is 100C */
> +&sensor1_crit {
> +	temperature = <100000>;
> +};
> +
> +&sensor2_crit {
> +	temperature = <100000>;
> +};
> +
> +&sensor3_crit {
> +	temperature = <100000>;
> +};
> +
> +&sensor4_crit {
> +	temperature = <100000>;
> +};
> +
> +&sensor_thermal_cr52 {
> +	critical-action = "shutdown";
> +};
> +
> +&sensor_thermal_cnn {
> +	critical-action = "shutdown";
> +};

Is this not the default action for critical trip points? In my testing
in the past, R-Car systems have always shut down when the critical trip
is reached. If it's not, I think we should move these to r8a779g0.dtsi,
and likely add them for all other SoCs too?

> +
> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
> +&sensor_thermal_ca76 {
> +	critical-action = "shutdown";
> +
> +	cooling-maps {
> +		/*
> +		 * The cooling-device minimum and maximum parameters inversely
> +		 * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
> +		 * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
> +		 * This is because they refer to cooling levels, where maximum
> +		 * cooling level happens at 500 MHz OPP, when the CPU core is
> +		 * running slowly and therefore generates least heat.
> +		 */
> +		map0 {
> +			/* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
> +			trip = <&sensor3_passive_low>;
> +			cooling-device = <&a76_0 2 4>;
> +			contribution = <128>;
> +		};
> +
> +		map1 {
> +			/* At 72C, inhibit 1.5 GHz mode */
> +			trip = <&sensor3_passive_mid>;
> +			cooling-device = <&a76_0 3 4>;
> +			contribution = <256>;
> +		};
> +
> +		map2 {
> +			/* At 76C, start injecting idle states */
> +			trip = <&sensor3_passive_hi>;
> +			cooling-device = <&a76_0_thermal_idle 0 80>,
> +					 <&a76_1_thermal_idle 0 80>,
> +					 <&a76_2_thermal_idle 0 80>,
> +					 <&a76_3_thermal_idle 0 80>;
> +			contribution = <512>;
> +		};
> +
> +		map3 {
> +			/* At 80C, inhibit 1.0 GHz mode */
> +			trip = <&sensor3_passive_crit>;
> +			cooling-device = <&a76_0 4 4>;
> +			contribution = <1024>;
> +		};
> +	};
> +
> +	trips {
> +		sensor3_passive_low: sensor3-passive-low {
> +			temperature = <68000>;
> +			hysteresis = <2000>;
> +			type = "passive";
> +		};
> +
> +		sensor3_passive_mid: sensor3-passive-mid {
> +			temperature = <72000>;
> +			hysteresis = <2000>;
> +			type = "passive";
> +		};
> +
> +		sensor3_passive_hi: sensor3-passive-hi {
> +			temperature = <76000>;
> +			hysteresis = <2000>;
> +			type = "passive";
> +		};
> +
> +		sensor3_passive_crit: sensor3-passive-crit {
> +			temperature = <80000>;
> +			hysteresis = <2000>;
> +			type = "passive";
> +		};
> +	};
> +};
> +
> +&sensor_thermal_ddr1 {
> +	critical-action = "shutdown";
> +};
> -- 
> 2.47.2
> 

-- 
Kind Regards,
Niklas Söderlund


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-06-26 21:41 ` Niklas Söderlund
@ 2025-06-29 22:32   ` Marek Vasut
  2025-06-30  8:13     ` Niklas Söderlund
  0 siblings, 1 reply; 9+ messages in thread
From: Marek Vasut @ 2025-06-29 22:32 UTC
  To: Niklas Söderlund, Marek Vasut
  Cc: linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Rob Herring, devicetree,
	linux-renesas-soc

On 6/26/25 11:41 PM, Niklas Söderlund wrote:

Hello Niklas,

>> +&a76_3 {
>> +	a76_3_thermal_idle: thermal-idle {
>> +		#cooling-cells = <2>;
>> +		duration-us = <10000>;
>> +		exit-latency-us = <500>;
>> +	};
>> +};
> 
> I did not know you could do this and use it as a cooling device, thanks
> for teaching me something new!

You could, although the cooling effect may vary. Some cores enter e.g. 
clock stop during idle and then they really cool down, some do not.

>> +/* THS sensors in SoC, critical temperature trip point is 100C */
>> +&sensor1_crit {
>> +	temperature = <100000>;
>> +};
>> +
>> +&sensor2_crit {
>> +	temperature = <100000>;
>> +};
>> +
>> +&sensor3_crit {
>> +	temperature = <100000>;
>> +};
>> +
>> +&sensor4_crit {
>> +	temperature = <100000>;
>> +};
>> +
>> +&sensor_thermal_cr52 {
>> +	critical-action = "shutdown";
>> +};
>> +
>> +&sensor_thermal_cnn {
>> +	critical-action = "shutdown";
>> +};
> 
> Is this not the default action for critical trip points? In my testing
> in the past R-Car systems have always shutdown when the critical trip is
> reached.

It isn't quite that clear cut.

drivers/thermal/thermal_of.c thermal_of_zone_register() contains this 
piece of code:

"
        ret = of_property_read_string(np, "critical-action", &action);
        if (!ret && !of_ops.critical) {
                if (!strcasecmp(action, "reboot"))
                        of_ops.critical = thermal_zone_device_critical_reboot;
                else if (!strcasecmp(action, "shutdown"))
                        of_ops.critical = thermal_zone_device_critical_shutdown;
        }
"

If the "critical-action" DT property is not set, then of_ops.critical
is not modified.

drivers/thermal/thermal_core.c thermal_zone_device_register_with_trips() 
contains this piece of code:

        if (!tz->ops.critical)
                tz->ops.critical = thermal_zone_device_critical;

If (in case of OF) of_ops.critical is not set, use 
thermal_zone_device_critical() handler.

There is a slight difference:
- If critical-action = "shutdown" is set in DT, then handler
   thermal_zone_device_critical_shutdown() is called, which is a wrapper
   around thermal_zone_device_halt(tz, HWPROT_ACT_SHUTDOWN);
- If critical-action = "shutdown" is NOT set in DT, then handler
   thermal_zone_device_critical() is called, which is a wrapper
   around thermal_zone_device_halt(tz, HWPROT_ACT_DEFAULT);

thermal_zone_device_halt() itself is a wrapper around
__hw_protection_trigger(msg, poweroff_delay_ms, action), where action
is either HWPROT_ACT_SHUTDOWN or HWPROT_ACT_DEFAULT. That action is
handled in the __hw_protection_trigger() implementation in
kernel/reboot.c:

void __hw_protection_trigger(const char *reason, int ms_until_forced,
                             enum hw_protection_action action)
{
        static atomic_t allow_proceed = ATOMIC_INIT(1);

        if (action == HWPROT_ACT_DEFAULT)
                action = hw_protection_action;

In case of HWPROT_ACT_DEFAULT, the 'hw_protection_action' which is
assigned into 'action' can be overridden, either via a sysfs write or
via the hw_protection_ kernel command line parameter. In case of
HWPROT_ACT_SHUTDOWN, the action cannot be overridden.

In case this hardware starts to melt, we surely want HWPROT_ACT_SHUTDOWN 
with no override options ...

> If it's not I think we should move these to r8a779g0.dtsi. And
> likely add them for all other SoCs too?

... the other hardware has a non-optional heatsink, where the
overridable HWPROT_ACT_DEFAULT is the right option, I think.


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-06-29 22:32   ` Marek Vasut
@ 2025-06-30  8:13     ` Niklas Söderlund
  0 siblings, 0 replies; 9+ messages in thread
From: Niklas Söderlund @ 2025-06-30  8:13 UTC
  To: Marek Vasut
  Cc: Marek Vasut, linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Rob Herring, devicetree,
	linux-renesas-soc

Hello Marek,

On 2025-06-30 00:32:54 +0200, Marek Vasut wrote:
> On 6/26/25 11:41 PM, Niklas Söderlund wrote:
> 
> Hello Niklas,
> 
> > > +&a76_3 {
> > > +	a76_3_thermal_idle: thermal-idle {
> > > +		#cooling-cells = <2>;
> > > +		duration-us = <10000>;
> > > +		exit-latency-us = <500>;
> > > +	};
> > > +};
> > 
> > I did not know you could do this and use it as a cooling device, thanks
> > for teaching me something new!
> 
> You could, although the cooling effect may vary. Some cores enter e.g. clock
> stop during idle and then they really cool down, some do not.
> 
> > > +/* THS sensors in SoC, critical temperature trip point is 100C */
> > > +&sensor1_crit {
> > > +	temperature = <100000>;
> > > +};
> > > +
> > > +&sensor2_crit {
> > > +	temperature = <100000>;
> > > +};
> > > +
> > > +&sensor3_crit {
> > > +	temperature = <100000>;
> > > +};
> > > +
> > > +&sensor4_crit {
> > > +	temperature = <100000>;
> > > +};
> > > +
> > > +&sensor_thermal_cr52 {
> > > +	critical-action = "shutdown";
> > > +};
> > > +
> > > +&sensor_thermal_cnn {
> > > +	critical-action = "shutdown";
> > > +};
> > 
> > Is this not the default action for critical trip points? In my testing
> > in the past R-Car systems have always shutdown when the critical trip is
> > reached.
> 
> It isn't quite that clear cut.
> 
> drivers/thermal/thermal_of.c thermal_of_zone_register() contains this piece
> of code:
> 
> "
> 407         ret = of_property_read_string(np, "critical-action", &action);
> 408         if (!ret && !of_ops.critical) {
> 409                 if (!strcasecmp(action, "reboot"))
> 410                         of_ops.critical =
> thermal_zone_device_critical_reboot;
> 411                 else if (!strcasecmp(action, "shutdown"))
> 412                         of_ops.critical =
> thermal_zone_device_critical_shutdown;
> 413         }
> "
> 
> If "critical-action" DT property is not set, then of_ops.critical are not
> modified.
> 
> drivers/thermal/thermal_core.c thermal_zone_device_register_with_trips()
> contains this piece of code:
> 
> 1571         if (!tz->ops.critical)
> 1572                 tz->ops.critical = thermal_zone_device_critical;
> 
> If (in case of OF) of_ops.critical is not set, use
> thermal_zone_device_critical() handler.
> 
> There is a slight difference:
> - If critical-action = "shutdown" is set in DT, then handler
>   thermal_zone_device_critical_shutdown() is called, which is a wrapper
>   around thermal_zone_device_halt(tz, HWPROT_ACT_SHUTDOWN);
> - If critical-action = "shutdown" is NOT set in DT, then handler
>   thermal_zone_device_critical() is called, which is a wrapper
>   around thermal_zone_device_halt(tz, HWPROT_ACT_DEFAULT);
> 
> thermal_zone_device_halt() itself is a wrapper around
> __hw_protection_trigger(msg, poweroff_delay_ms, action); , where action is
> either HWPROT_ACT_SHUTDOWN or HWPROT_ACT_DEFAULT , which is handled in
> kernel/reboot.c __hw_protection_trigger() implementation :
> 
> 1028 void __hw_protection_trigger(const char *reason, int ms_until_forced,
> 1029                              enum hw_protection_action action)
> 1030 {
> 1031         static atomic_t allow_proceed = ATOMIC_INIT(1);
> 1032
> 1033         if (action == HWPROT_ACT_DEFAULT)
> 1034                 action = hw_protection_action;
> 
> In case of HWPROT_ACT_DEFAULT , the 'hw_protection_action' which is assigned
> into 'action' can be overridden, either via sysfs write, or hw_protection_
> kernel command line parameter . In case of HWPROT_ACT_SHUTDOWN , the action
> cannot be overridden .
> 
> In case this hardware starts to melt, we surely want HWPROT_ACT_SHUTDOWN
> with no override options ...
> 
> > If it's not I think we should move these to r8a779g0.dtsi. And
> > likely add them for all other SoCs too?
> 
> ... the other hardware has non-optional heatsink, where override-able
> HWPROT_ACT_DEFAULT is the right option I think .

Wow, thanks for the detailed rundown. With that I agree with you, we 
should only force the shutdown on this particular platform. Nice work.

Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

-- 
Kind Regards,
Niklas Söderlund


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-06-25 10:01 [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk Marek Vasut
  2025-06-26 21:41 ` Niklas Söderlund
@ 2025-08-06  9:35 ` Geert Uytterhoeven
  2025-08-06 15:23   ` Marek Vasut
  1 sibling, 1 reply; 9+ messages in thread
From: Geert Uytterhoeven @ 2025-08-06  9:35 UTC
  To: Marek Vasut
  Cc: linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Niklas Söderlund,
	Rob Herring, devicetree, linux-renesas-soc

Hi Marek,

On Wed, 25 Jun 2025 at 12:03, Marek Vasut
<marek.vasut+renesas@mailbox.org> wrote:
> Since the Sparrow Hawk has a smaller PCB than the White Hawk, it tends
> to generate more heat. To prevent potential damage to the board, adjust
> the temperature trip points.
>
> Add four "passive" trip points which increasingly throttle the CPU to
> prevent overheating. The first trip point at 68°C disables the 1.8 GHz
> and 1.7 GHz modes and limits the CPU to 1.5 GHz frequency. The second
> trip point at 72°C disables the 1.5 GHz mode and limits the CPU to 1.0
> GHz frequency. The third trip point at 76°C uses thermal-idle to start
> inserting idle cycles into the CPU instruction stream to cool the CPU
> cores down. The fourth and last trip point at 80°C disables the 1.0 GHz
> mode and limits the CPU to 500 MHz frequency.
>
> In case the SoC heats up further, in case either of the thermal sensors
> readings passes the 100°C, a thermal shutdown is triggered to prevent
> any damage to the hardware.
>
> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>

Thanks for your patch!

> --- a/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
> +++ b/arch/arm64/boot/dts/renesas/r8a779g3-sparrow-hawk.dts
> @@ -38,6 +38,7 @@
>
>  /dts-v1/;
>  #include <dt-bindings/gpio/gpio.h>
> +#include <dt-bindings/thermal/thermal.h>
>
>  #include "r8a779g3.dtsi"
>
> @@ -797,3 +798,139 @@ &rwdt {
>  &scif_clk {    /* X12 */
>         clock-frequency = <24000000>;
>  };
> +
> +/* thermal-idle cooling for all SoC cores */
> +&a76_0 {

Please keep nodes sorted (alphabetically by label).

> +       #cooling-cells = <2>;

This is only present for the first CPU core, and map{0,1,3} refer
only to a76_0, because all four CPU cores are driven by a single clock
(Z0), right?

> +
> +       a76_0_thermal_idle: thermal-idle {
> +               #cooling-cells = <2>;
> +               duration-us = <10000>;
> +               exit-latency-us = <500>;
> +       };
> +};

> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
> +&sensor_thermal_ca76 {
> +       critical-action = "shutdown";
> +
> +       cooling-maps {
> +               /*
> +                * The cooling-device minimum and maximum parameters inversely
> +                * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
> +                * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
> +                * This is because they refer to cooling levels, where maximum
> +                * cooling level happens at 500 MHz OPP, when the CPU core is
> +                * running slowly and therefore generates least heat.

That applies to cooling-device = <&a76_[0-3] ...>...

> +                */
> +               map0 {
> +                       /* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
> +                       trip = <&sensor3_passive_low>;
> +                       cooling-device = <&a76_0 2 4>;
> +                       contribution = <128>;
> +               };
> +
> +               map1 {
> +                       /* At 72C, inhibit 1.5 GHz mode */
> +                       trip = <&sensor3_passive_mid>;
> +                       cooling-device = <&a76_0 3 4>;
> +                       contribution = <256>;
> +               };
> +
> +               map2 {
> +                       /* At 76C, start injecting idle states */
> +                       trip = <&sensor3_passive_hi>;
> +                       cooling-device = <&a76_0_thermal_idle 0 80>,
> +                                        <&a76_1_thermal_idle 0 80>,
> +                                        <&a76_2_thermal_idle 0 80>,
> +                                        <&a76_3_thermal_idle 0 80>;

... but what do "0 80" refer to? I couldn't find in the thermal-idle
bindings what exactly are the minimum and maximum cooling states here.

> +                       contribution = <512>;
> +               };

The rest LGTM, so with the sort order fixed, and the thermal-idle
states clarified:
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-08-06  9:35 ` Geert Uytterhoeven
@ 2025-08-06 15:23   ` Marek Vasut
  2025-08-14 15:50     ` Geert Uytterhoeven
  0 siblings, 1 reply; 9+ messages in thread
From: Marek Vasut @ 2025-08-06 15:23 UTC
  To: Geert Uytterhoeven, Daniel Lezcano
  Cc: linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Niklas Söderlund,
	Rob Herring, devicetree, linux-renesas-soc

On 8/6/25 11:35 AM, Geert Uytterhoeven wrote:
> Hi Marek,

Hi,

>> +       #cooling-cells = <2>;
> 
> This is only present for the first CPU core, and map{0,1,3} refer
> only to a76_0, because all four CPU cores are driven by a single clock
> (Z0), right?

That seems correct.

>> +
>> +       a76_0_thermal_idle: thermal-idle {
>> +               #cooling-cells = <2>;
>> +               duration-us = <10000>;
>> +               exit-latency-us = <500>;
>> +       };
>> +};
> 
>> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
>> +&sensor_thermal_ca76 {
>> +       critical-action = "shutdown";
>> +
>> +       cooling-maps {
>> +               /*
>> +                * The cooling-device minimum and maximum parameters inversely
>> +                * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
>> +                * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
>> +                * This is because they refer to cooling levels, where maximum
>> +                * cooling level happens at 500 MHz OPP, when the CPU core is
>> +                * running slowly and therefore generates least heat.
> 
> That applies to cooling-device = <&a76_[0-3] ...>...

Do you want me to add this line into the comment ?

>> +                */
>> +               map0 {
>> +                       /* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
>> +                       trip = <&sensor3_passive_low>;
>> +                       cooling-device = <&a76_0 2 4>;
>> +                       contribution = <128>;
>> +               };
>> +
>> +               map1 {
>> +                       /* At 72C, inhibit 1.5 GHz mode */
>> +                       trip = <&sensor3_passive_mid>;
>> +                       cooling-device = <&a76_0 3 4>;
>> +                       contribution = <256>;
>> +               };
>> +
>> +               map2 {
>> +                       /* At 76C, start injecting idle states */
>> +                       trip = <&sensor3_passive_hi>;
>> +                       cooling-device = <&a76_0_thermal_idle 0 80>,
>> +                                        <&a76_1_thermal_idle 0 80>,
>> +                                        <&a76_2_thermal_idle 0 80>,
>> +                                        <&a76_3_thermal_idle 0 80>;
> 
> ... but what do "0 80" refer to? I couldn't find in the thermal-idle
> bindings what exactly are the minimum and maximum cooling states here.

The comments in drivers/thermal/cpuidle_cooling.c clarify that: it is
the idle injection rate in percent. In this case, the cooling device
can inject idle states up to 80% of the time.

+CC Daniel in case they want to chime in on that.


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-08-06 15:23   ` Marek Vasut
@ 2025-08-14 15:50     ` Geert Uytterhoeven
  2025-08-14 23:36       ` Marek Vasut
  0 siblings, 1 reply; 9+ messages in thread
From: Geert Uytterhoeven @ 2025-08-14 15:50 UTC
  To: Marek Vasut
  Cc: Daniel Lezcano, linux-arm-kernel, Conor Dooley,
	Geert Uytterhoeven, Krzysztof Kozlowski, Magnus Damm,
	Niklas Söderlund, Rob Herring, devicetree, linux-renesas-soc

Hi Marek,

On Wed, 6 Aug 2025 at 17:23, Marek Vasut <marek.vasut@mailbox.org> wrote:
> On 8/6/25 11:35 AM, Geert Uytterhoeven wrote:
> >> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
> >> +&sensor_thermal_ca76 {
> >> +       critical-action = "shutdown";
> >> +
> >> +       cooling-maps {
> >> +               /*
> >> +                * The cooling-device minimum and maximum parameters inversely
> >> +                * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
> >> +                * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
> >> +                * This is because they refer to cooling levels, where maximum
> >> +                * cooling level happens at 500 MHz OPP, when the CPU core is
> >> +                * running slowly and therefore generates least heat.
> >
> > That applies to cooling-device = <&a76_[0-3] ...>...
>
> Do you want me to add this line into the comment ?

I don't think that is really needed (see below)

> >> +                */
> >> +               map0 {
> >> +                       /* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
> >> +                       trip = <&sensor3_passive_low>;
> >> +                       cooling-device = <&a76_0 2 4>;
> >> +                       contribution = <128>;
> >> +               };
> >> +
> >> +               map1 {
> >> +                       /* At 72C, inhibit 1.5 GHz mode */
> >> +                       trip = <&sensor3_passive_mid>;
> >> +                       cooling-device = <&a76_0 3 4>;
> >> +                       contribution = <256>;
> >> +               };
> >> +
> >> +               map2 {
> >> +                       /* At 76C, start injecting idle states */
> >> +                       trip = <&sensor3_passive_hi>;
> >> +                       cooling-device = <&a76_0_thermal_idle 0 80>,
> >> +                                        <&a76_1_thermal_idle 0 80>,
> >> +                                        <&a76_2_thermal_idle 0 80>,
> >> +                                        <&a76_3_thermal_idle 0 80>;
> >
> > ... but what do "0 80" refer to? I couldn't find in the thermal-idle
> > bindings what exactly are the minimum and maximum cooling states here.
>
> The comments in drivers/thermal/cpuidle_cooling.c clarify that, it is
> the idle injection rate in percent, in this case the cooling can inject
> idle states up to 80% of time.

OK, so I will add "(0-80%)" to the idle states comment, and sort
the nodes while queuing in renesas-devel for v6.18.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-08-14 15:50     ` Geert Uytterhoeven
@ 2025-08-14 23:36       ` Marek Vasut
  2025-08-18  8:59         ` Geert Uytterhoeven
  0 siblings, 1 reply; 9+ messages in thread
From: Marek Vasut @ 2025-08-14 23:36 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Daniel Lezcano, linux-arm-kernel, Conor Dooley,
	Geert Uytterhoeven, Krzysztof Kozlowski, Magnus Damm,
	Niklas Söderlund, Rob Herring, devicetree, linux-renesas-soc

On 8/14/25 5:50 PM, Geert Uytterhoeven wrote:

Hello Geert,

> On Wed, 6 Aug 2025 at 17:23, Marek Vasut <marek.vasut@mailbox.org> wrote:
>> On 8/6/25 11:35 AM, Geert Uytterhoeven wrote:
>>>> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
>>>> +&sensor_thermal_ca76 {
>>>> +       critical-action = "shutdown";
>>>> +
>>>> +       cooling-maps {
>>>> +               /*
>>>> +                * The cooling-device minimum and maximum parameters inversely
>>>> +                * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
>>>> +                * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
>>>> +                * This is because they refer to cooling levels, where maximum
>>>> +                * cooling level happens at 500 MHz OPP, when the CPU core is
>>>> +                * running slowly and therefore generates least heat.
>>>
>>> That applies to cooling-device = <&a76_[0-3] ...>...
>>
>> Do you want me to add this line into the comment ?
> 
> I don't think that is really needed (see below)
> 
>>>> +                */
>>>> +               map0 {
>>>> +                       /* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
>>>> +                       trip = <&sensor3_passive_low>;
>>>> +                       cooling-device = <&a76_0 2 4>;
>>>> +                       contribution = <128>;
>>>> +               };
>>>> +
>>>> +               map1 {
>>>> +                       /* At 72C, inhibit 1.5 GHz mode */
>>>> +                       trip = <&sensor3_passive_mid>;
>>>> +                       cooling-device = <&a76_0 3 4>;
>>>> +                       contribution = <256>;
>>>> +               };
>>>> +
>>>> +               map2 {
>>>> +                       /* At 76C, start injecting idle states */
>>>> +                       trip = <&sensor3_passive_hi>;
>>>> +                       cooling-device = <&a76_0_thermal_idle 0 80>,
>>>> +                                        <&a76_1_thermal_idle 0 80>,
>>>> +                                        <&a76_2_thermal_idle 0 80>,
>>>> +                                        <&a76_3_thermal_idle 0 80>;
>>>
>>> ... but what do "0 80" refer to? I couldn't find in the thermal-idle
>>> bindings what exactly are the minimum and maximum cooling states here.
>>
>> The comments in drivers/thermal/cpuidle_cooling.c clarify that, it is
>> the idle injection rate in percent, in this case the cooling can inject
>> idle states up to 80% of time.
> 
> OK, so I will add "(0-80%)" to the idle states comment, and sort
> the nodes while queuing in renesas-devel for v6.18.

I sent a V3 to make this more official. I hope that helps.
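
For reference, the inverse numbering of cooling levels quoted above can be
sketched as follows. This is only an illustration: the frequency list is
assumed from the patch description (1.8, 1.7, 1.5, 1.0 GHz and 500 MHz), it is
not read from opp-table-0 in r8a779g0.dtsi, and the names are hypothetical.

```python
# OPP frequencies in MHz, highest first; assumed from the patch description.
# Cooling level N indexes this list, so level 0 is the fastest OPP and
# level 4 is the slowest one, which generates the least heat.
OPP_MHZ = [1800, 1700, 1500, 1000, 500]


def freq_cap_mhz(cooling_level: int) -> int:
    """Highest CPU frequency still permitted at the given cooling level."""
    return OPP_MHZ[cooling_level]


def cap_range_mhz(min_level: int, max_level: int) -> tuple[int, int]:
    """Frequency caps at the lightest and at the strongest cooling level
    of a cooling-device = <&cpu min_level max_level> entry."""
    return freq_cap_mhz(min_level), freq_cap_mhz(max_level)


if __name__ == "__main__":
    # map0: cooling-device = <&a76_0 2 4>
    print("map0:", cap_range_mhz(2, 4))
    # map1: cooling-device = <&a76_0 3 4>
    print("map1:", cap_range_mhz(3, 4))
```

Under these assumptions map0 caps the CPU at 1.5 GHz or lower and map1 at
1.0 GHz or lower, which matches the "inhibit" comments in the two maps.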


* Re: [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk
  2025-08-14 23:36       ` Marek Vasut
@ 2025-08-18  8:59         ` Geert Uytterhoeven
  0 siblings, 0 replies; 9+ messages in thread
From: Geert Uytterhoeven @ 2025-08-18  8:59 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Daniel Lezcano, linux-arm-kernel, Conor Dooley,
	Krzysztof Kozlowski, Magnus Damm, Niklas Söderlund,
	Rob Herring, devicetree, linux-renesas-soc

Hi Marek,

On Fri, 15 Aug 2025 at 01:36, Marek Vasut <marek.vasut@mailbox.org> wrote:
> On 8/14/25 5:50 PM, Geert Uytterhoeven wrote:
> > On Wed, 6 Aug 2025 at 17:23, Marek Vasut <marek.vasut@mailbox.org> wrote:
> >> On 8/6/25 11:35 AM, Geert Uytterhoeven wrote:
> >>>> +/* THS sensor in SoC near CA76 cores does more progressive cooling. */
> >>>> +&sensor_thermal_ca76 {
> >>>> +       critical-action = "shutdown";
> >>>> +
> >>>> +       cooling-maps {
> >>>> +               /*
> >>>> +                * The cooling-device minimum and maximum parameters inversely
> >>>> +                * match opp-table-0 {} node entries in r8a779g0.dtsi, in other
> >>>> +                * words, 0 refers to 1.8 GHz OPP and 4 refers to 500 MHz OPP.
> >>>> +                * This is because they refer to cooling levels, where maximum
> >>>> +                * cooling level happens at 500 MHz OPP, when the CPU core is
> >>>> +                * running slowly and therefore generates least heat.
> >>>
> >>> That applies to cooling-device = <&a76_[0-3] ...>...
> >>
> >> Do you want me to add this line into the comment ?
> >
> > I don't think that is really needed (see below)
> >
> >>>> +                */
> >>>> +               map0 {
> >>>> +                       /* At 68C, inhibit 1.7 GHz and 1.8 GHz modes */
> >>>> +                       trip = <&sensor3_passive_low>;
> >>>> +                       cooling-device = <&a76_0 2 4>;
> >>>> +                       contribution = <128>;
> >>>> +               };
> >>>> +
> >>>> +               map1 {
> >>>> +                       /* At 72C, inhibit 1.5 GHz mode */
> >>>> +                       trip = <&sensor3_passive_mid>;
> >>>> +                       cooling-device = <&a76_0 3 4>;
> >>>> +                       contribution = <256>;
> >>>> +               };
> >>>> +
> >>>> +               map2 {
> >>>> +                       /* At 76C, start injecting idle states */
> >>>> +                       trip = <&sensor3_passive_hi>;
> >>>> +                       cooling-device = <&a76_0_thermal_idle 0 80>,
> >>>> +                                        <&a76_1_thermal_idle 0 80>,
> >>>> +                                        <&a76_2_thermal_idle 0 80>,
> >>>> +                                        <&a76_3_thermal_idle 0 80>;
> >>>
> >>> ... but what do "0 80" refer to? I couldn't find in the thermal-idle
> >>> bindings what exactly are the minimum and maximum cooling states here.
> >>
> >> The comments in drivers/thermal/cpuidle_cooling.c clarify that, it is
> >> the idle injection rate in percent, in this case the cooling can inject
> >> idle states up to 80% of time.
> >
> > OK, so I will add "(0-80%)" to the idle states comment, and sort
> > the nodes while queuing in renesas-devel for v6.18.
> I sent a V3 to make this more official. I hope that helps.

Thanks, I took v3 instead.

Gr{oetje,eeting}s,

                        Geert



Thread overview: 9+ messages
2025-06-25 10:01 [PATCH] arm64: dts: renesas: r8a779g3: Update thermal trip points on V4H Sparrow Hawk Marek Vasut
2025-06-26 21:41 ` Niklas Söderlund
2025-06-29 22:32   ` Marek Vasut
2025-06-30  8:13     ` Niklas Söderlund
2025-08-06  9:35 ` Geert Uytterhoeven
2025-08-06 15:23   ` Marek Vasut
2025-08-14 15:50     ` Geert Uytterhoeven
2025-08-14 23:36       ` Marek Vasut
2025-08-18  8:59         ` Geert Uytterhoeven
