* [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 @ 2024-01-06 22:23 Alexey Charkov 2024-01-06 22:54 ` Dragan Simic 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov 0 siblings, 2 replies; 21+ messages in thread From: Alexey Charkov @ 2024-01-06 22:23 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, Alexey Charkov, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Include thermal zones information in device tree for rk3588 variants and enable the built-in thermal sensing ADC on RADXA Rock 5B Signed-off-by: Alexey Charkov <alchark@gmail.com> --- .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 143 ++++++++++++++++++ 2 files changed, 147 insertions(+) diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index a5a104131403..f9d540000de3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts @@ -772,3 +772,7 @@ &usb_host1_ehci { &usb_host1_ohci { status = "okay"; }; + +&tsadc { + status = "okay"; +}; diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi index 8aa0499f9b03..8235991e3112 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi @@ -10,6 +10,7 @@ #include <dt-bindings/reset/rockchip,rk3588-cru.h> #include <dt-bindings/phy/phy.h> #include <dt-bindings/ata/ahci.h> +#include <dt-bindings/thermal/thermal.h> / { compatible = "rockchip,rk3588"; @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { status = "disabled"; }; + thermal_zones: thermal-zones { + soc_thermal: soc-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + sustainable-power = <2100>; /* milliwatts */ + + thermal-sensors = <&tsadc 0>; + trips { + threshold: trip-point-0 { + temperature = <75000>; + hysteresis = <2000>; + type = "passive"; + }; + target: trip-point-1 { + temperature = <85000>; + hysteresis = <2000>; + type = "passive"; + }; + soc_crit: soc-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + cooling-maps { + map0 { + trip = <&target>; + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + map1 { + trip = <&target>; + cooling-device = <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + map2 { + trip = <&target>; + cooling-device = <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + }; + }; + + bigcore0_thermal: bigcore0-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 1>; + + trips { + big0_crit: big0-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + bigcore1_thermal: bigcore1-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 2>; + + trips { + big1_crit: big1-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + little_core_thermal: littlecore-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 3>; + + trips { + little_crit: little-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + center_thermal: center-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 4>; + + trips { + center_crit: center-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + gpu_thermal: gpu-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 5>; + + trips { + gpu_crit: gpu-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + npu_thermal: npu-thermal { + polling-delay-passive = <20>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + thermal-sensors = <&tsadc 6>; + + trips { + npu_crit: npu-crit { + /* millicelsius */ + temperature = <115000>; + /* millicelsius */ + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + }; + saradc: adc@fec10000 { compatible = "rockchip,rk3588-saradc"; reg = <0x0 0xfec10000 0x0 0x10000>; -- 2.43.0 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-06 22:23 [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 Alexey Charkov @ 2024-01-06 22:54 ` Dragan Simic 2024-01-08 13:41 ` Alexey Charkov 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov 1 sibling, 1 reply; 21+ messages in thread From: Dragan Simic @ 2024-01-06 22:54 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Hello Alexey, Please see my comments below. On 2024-01-06 23:23, Alexey Charkov wrote: > Include thermal zones information in device tree for rk3588 variants > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > --- > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 143 ++++++++++++++++++ > 2 files changed, 147 insertions(+) > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > index a5a104131403..f9d540000de3 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > @@ -772,3 +772,7 @@ &usb_host1_ehci { > &usb_host1_ohci { > status = "okay"; > }; > + > +&tsadc { > + status = "okay"; > +}; > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > index 8aa0499f9b03..8235991e3112 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > @@ -10,6 +10,7 @@ > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > #include <dt-bindings/phy/phy.h> > #include <dt-bindings/ata/ahci.h> > +#include <dt-bindings/thermal/thermal.h> > > / { > compatible = "rockchip,rk3588"; > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { > status = "disabled"; > }; > > + thermal_zones: thermal-zones { > + soc_thermal: soc-thermal { It should be better to name it cpu_thermal instead. In the end, that's what it is. > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + sustainable-power = <2100>; /* milliwatts */ These three comments above are pretty much redundant. > + > + thermal-sensors = <&tsadc 0>; An empty line should be added here. > + trips { > + threshold: trip-point-0 { It should be better to name it cpu_alert0 instead, because that's what other newer dtsi files already use. > + temperature = <75000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + target: trip-point-1 { It should be better to name it cpu_alert1 instead, because that's what other newer dtsi files already use. > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + soc_crit: soc-crit { It should be better to name it cpu_crit instead, because that's what other newer dtsi files already use. > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ These two comments above are pretty much redundant. It also applies to all other similar comments below. > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map0 { > + trip = <&target>; Shouldn't &threshold (i.e. &cpu_alert0) be referenced here instead? > + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; Shouldn't all big CPU cores be listed here instead? > + contribution = <1024>; > + }; > + map1 { > + trip = <&target>; Shouldn't &target (i.e. &cpu_alert1) be referenced here instead? > + cooling-device = <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; Shouldn't all little and big CPU cores be listed here instead? > + contribution = <1024>; > + }; > + map2 { > + trip = <&target>; > + cooling-device = <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; Isn't this cooling map redundant? > + }; > + }; > + > + bigcore0_thermal: bigcore0-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 1>; > + > + trips { > + big0_crit: big0-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + bigcore1_thermal: bigcore1-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 2>; > + > + trips { > + big1_crit: big1-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + little_core_thermal: littlecore-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 3>; > + > + trips { > + little_crit: little-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + center_thermal: center-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 4>; > + > + trips { > + center_crit: center-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + gpu_thermal: gpu-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 5>; > + > + trips { > + gpu_crit: gpu-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + npu_thermal: npu-thermal { > + polling-delay-passive = <20>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + thermal-sensors = <&tsadc 6>; > + > + trips { > + npu_crit: npu-crit { > + /* millicelsius */ > + temperature = <115000>; > + /* millicelsius */ > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + }; > + > saradc: adc@fec10000 { > compatible = "rockchip,rk3588-saradc"; > reg = <0x0 0xfec10000 0x0 0x10000>; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-06 22:54 ` Dragan Simic @ 2024-01-08 13:41 ` Alexey Charkov 2024-01-18 18:48 ` Dragan Simic 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-08 13:41 UTC (permalink / raw) To: Dragan Simic Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Hello Dragan, Thanks a lot for your review and comments! Some reflections below. On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: > > Hello Alexey, > > Please see my comments below. > > On 2024-01-06 23:23, Alexey Charkov wrote: > > Include thermal zones information in device tree for rk3588 variants > > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > > --- > > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 143 ++++++++++++++++++ > > 2 files changed, 147 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > index a5a104131403..f9d540000de3 100644 > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > @@ -772,3 +772,7 @@ &usb_host1_ehci { > > &usb_host1_ohci { > > status = "okay"; > > }; > > + > > +&tsadc { > > + status = "okay"; > > +}; > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > index 8aa0499f9b03..8235991e3112 100644 > > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > @@ -10,6 +10,7 @@ > > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > > #include <dt-bindings/phy/phy.h> > > #include <dt-bindings/ata/ahci.h> > > +#include <dt-bindings/thermal/thermal.h> > > > > / { > > compatible = "rockchip,rk3588"; > > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { > > status = "disabled"; > > }; > > > > + thermal_zones: thermal-zones { > > + soc_thermal: soc-thermal { > > It should be better to name it cpu_thermal instead. In the end, that's > what it is. The TRM document says the first TSADC channel (to which this section applies) measures the temperature near the center of the SoC die, which implies not only the CPU but also the GPU at least. RADXA's kernel for Rock 5B also has GPU passive cooling as one of the cooling maps under this node (not included here, as we don't have the GPU node in .dtsi just yet). So perhaps naming this one cpu_thermal could be misleading? > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + sustainable-power = <2100>; /* milliwatts */ > > These three comments above are pretty much redundant. Noted, will drop in v2 > > + > > + thermal-sensors = <&tsadc 0>; > > An empty line should be added here. Noted, will add in v2 > > + trips { > > + threshold: trip-point-0 { > > It should be better to name it cpu_alert0 instead, because that's what > other newer dtsi files already use. Reflecting on your comments here and below, I'm thinking that maybe it would be better to define only the critical trip point for the SoC overall, and then have alerts along with the respective cooling maps separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we have more granular temperature measurement here than in previous RK chipsets it might be better to only throttle the "offending" cores, not the full package. What do you think? Downstream DT doesn't follow this approach though, so maybe there's something I'm missing here. > > + temperature = <75000>; > > + hysteresis = <2000>; > > + type = "passive"; > > + }; > > + target: trip-point-1 { > > It should be better to name it cpu_alert1 instead, because that's what > other newer dtsi files already use. > > > + temperature = <85000>; > > + hysteresis = <2000>; > > + type = "passive"; > > + }; > > + soc_crit: soc-crit { > > It should be better to name it cpu_crit instead, because that's what > other newer dtsi files already use. Seems to me that if I define separate trips for the three groups of CPU cores as mentioned above this would better stay as soc_crit, as it applies to the whole die rather than the CPU cluster alone. Then 'threshold' and 'target' will go altogether, and I'll have separate *_alert0 and *_alert1 per CPU group. >> > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > These two comments above are pretty much redundant. It also applies to > all other similar comments below. Noted, thanks - will drop all the noise in v2. > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + cooling-maps { > > + map0 { > > + trip = <&target>; > > Shouldn't &threshold (i.e. &cpu_alert0) be referenced here instead? > > > + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > Shouldn't all big CPU cores be listed here instead? I guess if a separate trip point is defined for cpu_l0,1,2,3 then it would need to throttle at 75C, and then cpu_b0,1 and cpu_b2,3 at 85C each. Logic being that if a sensor stacked in the middle of a group of four cores shows 75C then one of the four might well be in abnormal temperature range already (85+), while sensors next to only two big cores each will likely show temperatures similar to the actual core temperature. > > + contribution = <1024>; > > + }; > > + map1 { > > + trip = <&target>; > > Shouldn't &target (i.e. &cpu_alert1) be referenced here instead? > > > + cooling-device = <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > Shouldn't all little and big CPU cores be listed here instead? > > > + contribution = <1024>; > > + }; > > + map2 { > > + trip = <&target>; > > + cooling-device = <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > + contribution = <1024>; > > + }; > > Isn't this cooling map redundant? Will refactor the above, thanks. > > + }; > > + }; > > + > > + bigcore0_thermal: bigcore0-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 1>; > > + > > + trips { > > + big0_crit: big0-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + bigcore1_thermal: bigcore1-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 2>; > > + > > + trips { > > + big1_crit: big1-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + little_core_thermal: littlecore-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 3>; > > + > > + trips { > > + little_crit: little-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + center_thermal: center-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 4>; > > + > > + trips { > > + center_crit: center-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + gpu_thermal: gpu-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 5>; > > + > > + trips { > > + gpu_crit: gpu-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + npu_thermal: npu-thermal { > > + polling-delay-passive = <20>; /* milliseconds */ > > + polling-delay = <1000>; /* milliseconds */ > > + thermal-sensors = <&tsadc 6>; > > + > > + trips { > > + npu_crit: npu-crit { > > + /* millicelsius */ > > + temperature = <115000>; > > + /* millicelsius */ > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + }; > > + }; > > + > > saradc: adc@fec10000 { > > compatible = "rockchip,rk3588-saradc"; > > reg = <0x0 0xfec10000 0x0 0x10000>; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-08 13:41 ` Alexey Charkov @ 2024-01-18 18:48 ` Dragan Simic 2024-01-21 18:56 ` Alexey Charkov 0 siblings, 1 reply; 21+ messages in thread From: Dragan Simic @ 2024-01-18 18:48 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 2024-01-08 14:41, Alexey Charkov wrote: > Hello Dragan, Hello Alexey! :) I apologize for my delayed response. It took me almost a month to nearly fully recover from some really nasty flu that eventually went into my lungs. It was awful and I'm still not back to my 100%. :( > Thanks a lot for your review and comments! Some reflections below. Thank you for your work and for your detailed response. Please see my comments below, which apply to your v2 submission as a well, to which I'll respond separately a bit later. > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: >> On 2024-01-06 23:23, Alexey Charkov wrote: >> > Include thermal zones information in device tree for rk3588 variants >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B >> > >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> >> > --- >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> > index 8aa0499f9b03..8235991e3112 100644 >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> > @@ -10,6 +10,7 @@ >> > #include <dt-bindings/reset/rockchip,rk3588-cru.h> >> > #include <dt-bindings/phy/phy.h> >> > #include <dt-bindings/ata/ahci.h> >> > +#include <dt-bindings/thermal/thermal.h> >> > >> > / { >> > compatible = "rockchip,rk3588"; >> > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { >> > status = "disabled"; >> > }; >> > >> > + thermal_zones: thermal-zones { >> > + soc_thermal: soc-thermal { >> >> It should be better to name it cpu_thermal instead. In the end, >> that's >> what it is. > > The TRM document says the first TSADC channel (to which this section > applies) measures the temperature near the center of the SoC die, > which implies not only the CPU but also the GPU at least. RADXA's > kernel for Rock 5B also has GPU passive cooling as one of the cooling > maps under this node (not included here, as we don't have the GPU node > in .dtsi just yet). So perhaps naming this one cpu_thermal could be > misleading? Ah, I see now, thanks for reminding; it's all described on page 1,372 of the first part of the RK3588 TRM v1.0. Having that in mind, I'd suggest that we end up naming it package_thermal. The temperature near the center of the chip is usually considered to be the overall package temperature; for example, that's how the user-facing CPU temperatures are measured in the x86_64 world. >> > + trips { >> > + threshold: trip-point-0 { >> >> It should be better to name it cpu_alert0 instead, because that's what >> other newer dtsi files already use. > > Reflecting on your comments here and below, I'm thinking that maybe it > would be better to define only the critical trip point for the SoC > overall, and then have alerts along with the respective cooling maps > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we > have more granular temperature measurement here than in previous RK > chipsets it might be better to only throttle the "offending" cores, > not the full package. > > What do you think? > > Downstream DT doesn't follow this approach though, so maybe there's > something I'm missing here. I agree, it's better to fully utilize the higher measurement granularity made possible by having multiple temperature sensors available. I also agree that we should have only the critical trip defined for the package-level temperature sensor. Let's have the separate temperature measurements for the CPU (sub)clusters do the thermal throttling, and let's keep the package-level measurement for the critical shutdowns only. IIRC, some MediaTek SoC dtsi already does exactly that. Of course, there are no reasons not to have the critical trips defined for the CPU (sub)clusters as well. >> > + temperature = <75000>; >> > + hysteresis = <2000>; >> > + type = "passive"; >> > + }; >> > + target: trip-point-1 { >> >> It should be better to name it cpu_alert1 instead, because that's what >> other newer dtsi files already use. >> >> > + temperature = <85000>; >> > + hysteresis = <2000>; >> > + type = "passive"; >> > + }; >> > + soc_crit: soc-crit { >> >> It should be better to name it cpu_crit instead, because that's what >> other newer dtsi files already use. > > Seems to me that if I define separate trips for the three groups of > CPU cores as mentioned above this would better stay as soc_crit, as it > applies to the whole die rather than the CPU cluster alone. Then > 'threshold' and 'target' will go altogether, and I'll have separate > *_alert0 and *_alert1 per CPU group. It should perhaps be the best to have "passive", "hot" and "critical" trips defined for all three CPU groups/(sub)clusters, separately of course, to have even higher granularity when it comes to the resulting thermal throttling. >> > + hysteresis = <2000>; >> > + type = "critical"; >> > + }; >> > + }; >> > + cooling-maps { >> > + map0 { >> > + trip = <&target>; >> >> Shouldn't &threshold (i.e. &cpu_alert0) be referenced here instead? >> >> > + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; >> >> Shouldn't all big CPU cores be listed here instead? > > I guess if a separate trip point is defined for cpu_l0,1,2,3 then it > would need to throttle at 75C, and then cpu_b0,1 and cpu_b2,3 at 85C > each. Logic being that if a sensor stacked in the middle of a group of > four cores shows 75C then one of the four might well be in abnormal > temperature range already (85+), while sensors next to only two big > cores each will likely show temperatures similar to the actual core > temperature. I think we shouldn't make any assumptions of how the CPU cores heat up and affect each other, because we don't really know the required details. Instead, we should simply define the reasonable values for the "passive", "hot" and "critical" trips, and leave the rest to the standard thermal throttling logic. I hope you agree. In the end, that's why we have separate thermal sensors available. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-18 18:48 ` Dragan Simic @ 2024-01-21 18:56 ` Alexey Charkov 2024-01-22 4:55 ` Dragan Simic 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-21 18:56 UTC (permalink / raw) To: Dragan Simic Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@manjaro.org> wrote: > > On 2024-01-08 14:41, Alexey Charkov wrote: > > Hello Dragan, > > Hello Alexey! :) > > I apologize for my delayed response. It took me almost a month to > nearly fully recover from some really nasty flu that eventually went > into my lungs. It was awful and I'm still not back to my 100%. :( Ouch, I hope you get well soon! > > Thanks a lot for your review and comments! Some reflections below. > > Thank you for your work and for your detailed response. Please see my > comments below, which apply to your v2 submission as a well, to which > I'll respond separately a bit later. > > > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: > >> On 2024-01-06 23:23, Alexey Charkov wrote: > >> > Include thermal zones information in device tree for rk3588 variants > >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B > >> > > >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> > >> > --- > >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> > index 8aa0499f9b03..8235991e3112 100644 > >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> > @@ -10,6 +10,7 @@ > >> > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > >> > #include <dt-bindings/phy/phy.h> > >> > #include <dt-bindings/ata/ahci.h> > >> > +#include <dt-bindings/thermal/thermal.h> > >> > > >> > / { > >> > compatible = "rockchip,rk3588"; > >> > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { > >> > status = "disabled"; > >> > }; > >> > > >> > + thermal_zones: thermal-zones { > >> > + soc_thermal: soc-thermal { > >> > >> It should be better to name it cpu_thermal instead. In the end, > >> that's > >> what it is. > > > > The TRM document says the first TSADC channel (to which this section > > applies) measures the temperature near the center of the SoC die, > > which implies not only the CPU but also the GPU at least. RADXA's > > kernel for Rock 5B also has GPU passive cooling as one of the cooling > > maps under this node (not included here, as we don't have the GPU node > > in .dtsi just yet). So perhaps naming this one cpu_thermal could be > > misleading? > > Ah, I see now, thanks for reminding; it's all described on page 1,372 > of the first part of the RK3588 TRM v1.0. > > Having that in mind, I'd suggest that we end up naming it > package_thermal. > The temperature near the center of the chip is usually considered to be > the overall package temperature; for example, that's how the > user-facing > CPU temperatures are measured in the x86_64 world. Sounds good, will rename in v3! > >> > + trips { > >> > + threshold: trip-point-0 { > >> > >> It should be better to name it cpu_alert0 instead, because that's what > >> other newer dtsi files already use. > > > > Reflecting on your comments here and below, I'm thinking that maybe it > > would be better to define only the critical trip point for the SoC > > overall, and then have alerts along with the respective cooling maps > > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we > > have more granular temperature measurement here than in previous RK > > chipsets it might be better to only throttle the "offending" cores, > > not the full package. > > > > What do you think? > > > > Downstream DT doesn't follow this approach though, so maybe there's > > something I'm missing here. > > I agree, it's better to fully utilize the higher measurement granularity > made possible by having multiple temperature sensors available. > > I also agree that we should have only the critical trip defined for the > package-level temperature sensor. Let's have the separate temperature > measurements for the CPU (sub)clusters do the thermal throttling, and > let's keep the package-level measurement for the critical shutdowns > only. > IIRC, some MediaTek SoC dtsi already does exactly that. > > Of course, there are no reasons not to have the critical trips defined > for the CPU (sub)clusters as well. I think I'll also add a board-specific active cooling mechanism on the package level in the next iteration, given that Rock 5B has a PWM fan defined as a cooling device. That will go in the separate patch that updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also duly noted, thank you!) > >> > + temperature = <75000>; > >> > + hysteresis = <2000>; > >> > + type = "passive"; > >> > + }; > >> > + target: trip-point-1 { > >> > >> It should be better to name it cpu_alert1 instead, because that's what > >> other newer dtsi files already use. > >> > >> > + temperature = <85000>; > >> > + hysteresis = <2000>; > >> > + type = "passive"; > >> > + }; > >> > + soc_crit: soc-crit { > >> > >> It should be better to name it cpu_crit instead, because that's what > >> other newer dtsi files already use. > > > > Seems to me that if I define separate trips for the three groups of > > CPU cores as mentioned above this would better stay as soc_crit, as it > > applies to the whole die rather than the CPU cluster alone. Then > > 'threshold' and 'target' will go altogether, and I'll have separate > > *_alert0 and *_alert1 per CPU group. > > It should perhaps be the best to have "passive", "hot" and "critical" > trips defined for all three CPU groups/(sub)clusters, separately of > course, to have even higher granularity when it comes to the resulting > thermal throttling. I looked through drivers/thermal/rockchip_thermal.c, and it doesn't seem to provide any callback for the "hot" trip as part of its struct thermal_zone_device_ops, so I guess it would be redundant in our case here? I couldn't find any generic mechanism to react to "hot" trips, and they seem to be purely driver-specific, thus no-op in case of Rockchips - or am I missing something? > >> > + hysteresis = <2000>; > >> > + type = "critical"; > >> > + }; > >> > + }; > >> > + cooling-maps { > >> > + map0 { > >> > + trip = <&target>; > >> > >> Shouldn't &threshold (i.e. &cpu_alert0) be referenced here instead? > >> > >> > + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > >> > >> Shouldn't all big CPU cores be listed here instead? > > > > I guess if a separate trip point is defined for cpu_l0,1,2,3 then it > > would need to throttle at 75C, and then cpu_b0,1 and cpu_b2,3 at 85C > > each. Logic being that if a sensor stacked in the middle of a group of > > four cores shows 75C then one of the four might well be in abnormal > > temperature range already (85+), while sensors next to only two big > > cores each will likely show temperatures similar to the actual core > > temperature. > > I think we shouldn't make any assumptions of how the CPU cores heat up > and affect each other, because we don't really know the required > details. > Instead, we should simply define the reasonable values for the > "passive", > "hot" and "critical" trips, and leave the rest to the standard thermal > throttling logic. I hope you agree. > > In the end, that's why we have separate thermal sensors available. Indeed! I'll add extra "passive" alerts though (at 75C) to enable the power allocation governor to initialize its PID parameters calculation before the control temperature setpoint gets hit (per Daniel's feedback under separate cover). Thanks again for your review and comments on this one! Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-21 18:56 ` Alexey Charkov @ 2024-01-22 4:55 ` Dragan Simic 2024-01-22 6:03 ` Alexey Charkov 0 siblings, 1 reply; 21+ messages in thread From: Dragan Simic @ 2024-01-22 4:55 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Hello Alexey, On 2024-01-21 19:56, Alexey Charkov wrote: > On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@manjaro.org> > wrote: >> On 2024-01-08 14:41, Alexey Charkov wrote: >> I apologize for my delayed response. It took me almost a month to >> nearly fully recover from some really nasty flu that eventually went >> into my lungs. It was awful and I'm still not back to my 100%. :( > > Ouch, I hope you get well soon! Thank you, let's hope so. It's been really exhausting. :( >> > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: >> >> On 2024-01-06 23:23, Alexey Charkov wrote: >> >> > Include thermal zones information in device tree for rk3588 variants >> >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B >> >> > >> >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> >> >> > --- >> >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> >> > index 8aa0499f9b03..8235991e3112 100644 >> >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >> >> > @@ -10,6 +10,7 @@ >> >> > #include <dt-bindings/reset/rockchip,rk3588-cru.h> >> >> > #include <dt-bindings/phy/phy.h> >> >> > #include <dt-bindings/ata/ahci.h> >> >> > +#include <dt-bindings/thermal/thermal.h> >> >> > >> >> > / { >> >> > compatible = "rockchip,rk3588"; >> >> > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { >> >> > status = "disabled"; >> >> > }; >> >> > >> >> > + thermal_zones: thermal-zones { >> >> > + soc_thermal: soc-thermal { >> >> >> >> It should be better to name it cpu_thermal instead. In the end, >> >> that's what it is. >> > >> > The TRM document says the first TSADC channel (to which this section >> > applies) measures the temperature near the center of the SoC die, >> > which implies not only the CPU but also the GPU at least. RADXA's >> > kernel for Rock 5B also has GPU passive cooling as one of the cooling >> > maps under this node (not included here, as we don't have the GPU node >> > in .dtsi just yet). So perhaps naming this one cpu_thermal could be >> > misleading? >> >> Ah, I see now, thanks for reminding; it's all described on page 1,372 >> of the first part of the RK3588 TRM v1.0. >> >> Having that in mind, I'd suggest that we end up naming it >> package_thermal. >> The temperature near the center of the chip is usually considered to >> be >> the overall package temperature; for example, that's how the >> user-facing >> CPU temperatures are measured in the x86_64 world. > > Sounds good, will rename in v3! Thanks, I'm glad you agree. >> >> > + trips { >> >> > + threshold: trip-point-0 { >> >> >> >> It should be better to name it cpu_alert0 instead, because that's what >> >> other newer dtsi files already use. >> > >> > Reflecting on your comments here and below, I'm thinking that maybe it >> > would be better to define only the critical trip point for the SoC >> > overall, and then have alerts along with the respective cooling maps >> > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we >> > have more granular temperature measurement here than in previous RK >> > chipsets it might be better to only throttle the "offending" cores, >> > not the full package. >> > >> > What do you think? >> > >> > Downstream DT doesn't follow this approach though, so maybe there's >> > something I'm missing here. >> >> I agree, it's better to fully utilize the higher measurement >> granularity >> made possible by having multiple temperature sensors available. >> >> I also agree that we should have only the critical trip defined for >> the >> package-level temperature sensor. Let's have the separate temperature >> measurements for the CPU (sub)clusters do the thermal throttling, and >> let's keep the package-level measurement for the critical shutdowns >> only. IIRC, some MediaTek SoC dtsi already does exactly that. >> >> Of course, there are no reasons not to have the critical trips defined >> for the CPU (sub)clusters as well. > > I think I'll also add a board-specific active cooling mechanism on the > package level in the next iteration, given that Rock 5B has a PWM fan > defined as a cooling device. That will go in the separate patch that > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also > duly noted, thank you!) Great, thanks. Sure, making use of the Rock 5B's support for attaching a PWM-controlled cooling fan is the way to go. Just to reiterate a bit, any "active" trip points belong to the board dts file(s), because having a cooling fan is a board-specific feature. As a note, you may also want to have a look at the RockPro64 dts(i) files, for example; the RockPro64 also comes with a cooling fan connector and the associated PWM fan control logic. >> >> > + temperature = <75000>; >> >> > + hysteresis = <2000>; >> >> > + type = "passive"; >> >> > + }; >> >> > + target: trip-point-1 { >> >> >> >> It should be better to name it cpu_alert1 instead, because that's what >> >> other newer dtsi files already use. >> >> >> >> > + temperature = <85000>; >> >> > + hysteresis = <2000>; >> >> > + type = "passive"; >> >> > + }; >> >> > + soc_crit: soc-crit { >> >> >> >> It should be better to name it cpu_crit instead, because that's what >> >> other newer dtsi files already use. >> > >> > Seems to me that if I define separate trips for the three groups of >> > CPU cores as mentioned above this would better stay as soc_crit, as it >> > applies to the whole die rather than the CPU cluster alone. Then >> > 'threshold' and 'target' will go altogether, and I'll have separate >> > *_alert0 and *_alert1 per CPU group. >> >> It should perhaps be the best to have "passive", "hot" and "critical" >> trips defined for all three CPU groups/(sub)clusters, separately of >> course, to have even higher granularity when it comes to the resulting >> thermal throttling. > > I looked through drivers/thermal/rockchip_thermal.c, and it doesn't > seem to provide any callback for the "hot" trip as part of its struct > thermal_zone_device_ops, so I guess it would be redundant in our case > here? I couldn't find any generic mechanism to react to "hot" trips, > and they seem to be purely driver-specific, thus no-op in case of > Rockchips - or am I missing something? That's a good question. Please, let me go through the code in detail, and I'll get back with an update soon. Also, please wait a bit with sending the v3, until all open questions are addressed. >> >> > + hysteresis = <2000>; >> >> > + type = "critical"; >> >> > + }; >> >> > + }; >> >> > + cooling-maps { >> >> > + map0 { >> >> > + trip = <&target>; >> >> >> >> Shouldn't &threshold (i.e. &cpu_alert0) be referenced here instead? >> >> >> >> > + cooling-device = <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; >> >> >> >> Shouldn't all big CPU cores be listed here instead? >> > >> > I guess if a separate trip point is defined for cpu_l0,1,2,3 then it >> > would need to throttle at 75C, and then cpu_b0,1 and cpu_b2,3 at 85C >> > each. Logic being that if a sensor stacked in the middle of a group of >> > four cores shows 75C then one of the four might well be in abnormal >> > temperature range already (85+), while sensors next to only two big >> > cores each will likely show temperatures similar to the actual core >> > temperature. >> >> I think we shouldn't make any assumptions of how the CPU cores heat up >> and affect each other, because we don't really know the required >> details. >> Instead, we should simply define the reasonable values for the >> "passive", >> "hot" and "critical" trips, and leave the rest to the standard thermal >> throttling logic. I hope you agree. >> >> In the end, that's why we have separate thermal sensors available. > > Indeed! I'll add extra "passive" alerts though (at 75C) to enable the > power allocation governor to initialize its PID parameters calculation > before the control temperature setpoint gets hit (per Daniel's > feedback under separate cover). I'm glad you agree. Adding one more "passive" trip point makes sense, but please let me go through the code in detail first. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 4:55 ` Dragan Simic @ 2024-01-22 6:03 ` Alexey Charkov 2024-01-22 6:22 ` Dragan Simic 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-22 6:03 UTC (permalink / raw) To: Dragan Simic Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@manjaro.org> wrote: > > Hello Alexey, > > On 2024-01-21 19:56, Alexey Charkov wrote: > > On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@manjaro.org> > > wrote: > >> On 2024-01-08 14:41, Alexey Charkov wrote: > >> I apologize for my delayed response. It took me almost a month to > >> nearly fully recover from some really nasty flu that eventually went > >> into my lungs. It was awful and I'm still not back to my 100%. :( > > > > Ouch, I hope you get well soon! > > Thank you, let's hope so. It's been really exhausting. :( > > >> > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: > >> >> On 2024-01-06 23:23, Alexey Charkov wrote: > >> >> > Include thermal zones information in device tree for rk3588 variants > >> >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B > >> >> > > >> >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> > >> >> > --- > >> >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> >> > index 8aa0499f9b03..8235991e3112 100644 > >> >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >> >> > @@ -10,6 +10,7 @@ > >> >> > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > >> >> > #include <dt-bindings/phy/phy.h> > >> >> > #include <dt-bindings/ata/ahci.h> > >> >> > +#include <dt-bindings/thermal/thermal.h> > >> >> > > >> >> > / { > >> >> > compatible = "rockchip,rk3588"; > >> >> > @@ -2112,6 +2113,148 @@ tsadc: tsadc@fec00000 { > >> >> > status = "disabled"; > >> >> > }; > >> >> > > >> >> > + thermal_zones: thermal-zones { > >> >> > + soc_thermal: soc-thermal { > >> >> > >> >> It should be better to name it cpu_thermal instead. In the end, > >> >> that's what it is. > >> > > >> > The TRM document says the first TSADC channel (to which this section > >> > applies) measures the temperature near the center of the SoC die, > >> > which implies not only the CPU but also the GPU at least. RADXA's > >> > kernel for Rock 5B also has GPU passive cooling as one of the cooling > >> > maps under this node (not included here, as we don't have the GPU node > >> > in .dtsi just yet). So perhaps naming this one cpu_thermal could be > >> > misleading? > >> > >> Ah, I see now, thanks for reminding; it's all described on page 1,372 > >> of the first part of the RK3588 TRM v1.0. > >> > >> Having that in mind, I'd suggest that we end up naming it > >> package_thermal. > >> The temperature near the center of the chip is usually considered to > >> be > >> the overall package temperature; for example, that's how the > >> user-facing > >> CPU temperatures are measured in the x86_64 world. > > > > Sounds good, will rename in v3! > > Thanks, I'm glad you agree. > > >> >> > + trips { > >> >> > + threshold: trip-point-0 { > >> >> > >> >> It should be better to name it cpu_alert0 instead, because that's what > >> >> other newer dtsi files already use. > >> > > >> > Reflecting on your comments here and below, I'm thinking that maybe it > >> > would be better to define only the critical trip point for the SoC > >> > overall, and then have alerts along with the respective cooling maps > >> > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we > >> > have more granular temperature measurement here than in previous RK > >> > chipsets it might be better to only throttle the "offending" cores, > >> > not the full package. > >> > > >> > What do you think? > >> > > >> > Downstream DT doesn't follow this approach though, so maybe there's > >> > something I'm missing here. > >> > >> I agree, it's better to fully utilize the higher measurement > >> granularity > >> made possible by having multiple temperature sensors available. > >> > >> I also agree that we should have only the critical trip defined for > >> the > >> package-level temperature sensor. Let's have the separate temperature > >> measurements for the CPU (sub)clusters do the thermal throttling, and > >> let's keep the package-level measurement for the critical shutdowns > >> only. IIRC, some MediaTek SoC dtsi already does exactly that. > >> > >> Of course, there are no reasons not to have the critical trips defined > >> for the CPU (sub)clusters as well. > > > > I think I'll also add a board-specific active cooling mechanism on the > > package level in the next iteration, given that Rock 5B has a PWM fan > > defined as a cooling device. That will go in the separate patch that > > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also > > duly noted, thank you!) > > Great, thanks. Sure, making use of the Rock 5B's support for attaching > a PWM-controlled cooling fan is the way to go. > > Just to reiterate a bit, any "active" trip points belong to the board > dts > file(s), because having a cooling fan is a board-specific feature. As a > note, you may also want to have a look at the RockPro64 dts(i) files, > for > example; the RockPro64 also comes with a cooling fan connector and the > associated PWM fan control logic. Thanks for the pointer! There is also a helpful doc within devicetree bindings descriptions, although it sits under hwmon which was a bit confusing to me. I've already tested it locally (by adding to the board dts), and it spins up and down quite nicely, and even modulates the fan speed swiftly when the load changes - yay! > >> >> > + temperature = <75000>; > >> >> > + hysteresis = <2000>; > >> >> > + type = "passive"; > >> >> > + }; > >> >> > + target: trip-point-1 { > >> >> > >> >> It should be better to name it cpu_alert1 instead, because that's what > >> >> other newer dtsi files already use. > >> >> > >> >> > + temperature = <85000>; > >> >> > + hysteresis = <2000>; > >> >> > + type = "passive"; > >> >> > + }; > >> >> > + soc_crit: soc-crit { > >> >> > >> >> It should be better to name it cpu_crit instead, because that's what > >> >> other newer dtsi files already use. > >> > > >> > Seems to me that if I define separate trips for the three groups of > >> > CPU cores as mentioned above this would better stay as soc_crit, as it > >> > applies to the whole die rather than the CPU cluster alone. Then > >> > 'threshold' and 'target' will go altogether, and I'll have separate > >> > *_alert0 and *_alert1 per CPU group. > >> > >> It should perhaps be the best to have "passive", "hot" and "critical" > >> trips defined for all three CPU groups/(sub)clusters, separately of > >> course, to have even higher granularity when it comes to the resulting > >> thermal throttling. > > > > I looked through drivers/thermal/rockchip_thermal.c, and it doesn't > > seem to provide any callback for the "hot" trip as part of its struct > > thermal_zone_device_ops, so I guess it would be redundant in our case > > here? I couldn't find any generic mechanism to react to "hot" trips, > > and they seem to be purely driver-specific, thus no-op in case of > > Rockchips - or am I missing something? > > That's a good question. Please, let me go through the code in detail, > and I'll get back with an update soon. Also, please wait a bit with > sending the v3, until all open questions are addressed. Of course. Thank you for taking the time to dig through this one with me! Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 6:03 ` Alexey Charkov @ 2024-01-22 6:22 ` Dragan Simic 2024-01-22 7:36 ` Alexey Charkov 0 siblings, 1 reply; 21+ messages in thread From: Dragan Simic @ 2024-01-22 6:22 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 2024-01-22 07:03, Alexey Charkov wrote: > On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@manjaro.org> > wrote: >> On 2024-01-21 19:56, Alexey Charkov wrote: >> > On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@manjaro.org> wrote: >> >> On 2024-01-08 14:41, Alexey Charkov wrote: >> >> > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: >> >> >> On 2024-01-06 23:23, Alexey Charkov wrote: >> >> >> > Include thermal zones information in device tree for rk3588 variants >> >> >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B >> >> >> > >> >> >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> >> >> >> > --- >> >> >> > + trips { >> >> >> > + threshold: trip-point-0 { >> >> >> >> >> >> It should be better to name it cpu_alert0 instead, because that's what >> >> >> other newer dtsi files already use. >> >> > >> >> > Reflecting on your comments here and below, I'm thinking that maybe it >> >> > would be better to define only the critical trip point for the SoC >> >> > overall, and then have alerts along with the respective cooling maps >> >> > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we >> >> > have more granular temperature measurement here than in previous RK >> >> > chipsets it might be better to only throttle the "offending" cores, >> >> > not the full package. >> >> > >> >> > What do you think? >> >> > >> >> > Downstream DT doesn't follow this approach though, so maybe there's >> >> > something I'm missing here. >> >> >> >> I agree, it's better to fully utilize the higher measurement >> >> granularity >> >> made possible by having multiple temperature sensors available. >> >> >> >> I also agree that we should have only the critical trip defined for >> >> the >> >> package-level temperature sensor. Let's have the separate temperature >> >> measurements for the CPU (sub)clusters do the thermal throttling, and >> >> let's keep the package-level measurement for the critical shutdowns >> >> only. IIRC, some MediaTek SoC dtsi already does exactly that. >> >> >> >> Of course, there are no reasons not to have the critical trips defined >> >> for the CPU (sub)clusters as well. >> > >> > I think I'll also add a board-specific active cooling mechanism on the >> > package level in the next iteration, given that Rock 5B has a PWM fan >> > defined as a cooling device. That will go in the separate patch that >> > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also >> > duly noted, thank you!) >> >> Great, thanks. Sure, making use of the Rock 5B's support for >> attaching >> a PWM-controlled cooling fan is the way to go. >> >> Just to reiterate a bit, any "active" trip points belong to the board >> dts file(s), because having a cooling fan is a board-specific feature. >> As a note, you may also want to have a look at the RockPro64 dts(i) >> files, for example; the RockPro64 also comes with a cooling fan >> connector and the associated PWM fan control logic. > > Thanks for the pointer! There is also a helpful doc within devicetree > bindings descriptions, although it sits under hwmon which was a bit > confusing to me. I've already tested it locally (by adding to the > board dts), and it spins up and down quite nicely, and even modulates > the fan speed swiftly when the load changes - yay! Nice! Also, isn't it like magic? :) To me, turning LEDs on/off and controlling fans acts as some kind of a "bridge" between the virtual and the real world. :) As a suggestion, it would be good to test with a couple of different fans, to make sure that the PWM values work well for more that one fan model. The Rock 5B requires a 5 V fan, if I'm not mistaken? >> >> >> > + temperature = <75000>; >> >> >> > + hysteresis = <2000>; >> >> >> > + type = "passive"; >> >> >> > + }; >> >> >> > + target: trip-point-1 { >> >> >> >> >> >> It should be better to name it cpu_alert1 instead, because that's what >> >> >> other newer dtsi files already use. >> >> >> >> >> >> > + temperature = <85000>; >> >> >> > + hysteresis = <2000>; >> >> >> > + type = "passive"; >> >> >> > + }; >> >> >> > + soc_crit: soc-crit { >> >> >> >> >> >> It should be better to name it cpu_crit instead, because that's what >> >> >> other newer dtsi files already use. >> >> > >> >> > Seems to me that if I define separate trips for the three groups of >> >> > CPU cores as mentioned above this would better stay as soc_crit, as it >> >> > applies to the whole die rather than the CPU cluster alone. Then >> >> > 'threshold' and 'target' will go altogether, and I'll have separate >> >> > *_alert0 and *_alert1 per CPU group. >> >> >> >> It should perhaps be the best to have "passive", "hot" and "critical" >> >> trips defined for all three CPU groups/(sub)clusters, separately of >> >> course, to have even higher granularity when it comes to the resulting >> >> thermal throttling. >> > >> > I looked through drivers/thermal/rockchip_thermal.c, and it doesn't >> > seem to provide any callback for the "hot" trip as part of its struct >> > thermal_zone_device_ops, so I guess it would be redundant in our case >> > here? I couldn't find any generic mechanism to react to "hot" trips, >> > and they seem to be purely driver-specific, thus no-op in case of >> > Rockchips - or am I missing something? >> >> That's a good question. Please, let me go through the code in detail, >> and I'll get back with an update soon. Also, please wait a bit with >> sending the v3, until all open questions are addressed. > > Of course. Thank you for taking the time to dig through this one with > me! I'm glad to help. It's important to have working thermal throttling on the supported RK3588-based boards. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 6:22 ` Dragan Simic @ 2024-01-22 7:36 ` Alexey Charkov 2024-01-22 7:57 ` Dragan Simic 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-22 7:36 UTC (permalink / raw) To: Dragan Simic Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Mon, Jan 22, 2024 at 10:22 AM Dragan Simic <dsimic@manjaro.org> wrote: > > On 2024-01-22 07:03, Alexey Charkov wrote: > > On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@manjaro.org> > > wrote: > >> On 2024-01-21 19:56, Alexey Charkov wrote: > >> > On Thu, Jan 18, 2024 at 10:48 PM Dragan Simic <dsimic@manjaro.org> wrote: > >> >> On 2024-01-08 14:41, Alexey Charkov wrote: > >> >> > On Sun, Jan 7, 2024 at 2:54 AM Dragan Simic <dsimic@manjaro.org> wrote: > >> >> >> On 2024-01-06 23:23, Alexey Charkov wrote: > >> >> >> > Include thermal zones information in device tree for rk3588 variants > >> >> >> > and enable the built-in thermal sensing ADC on RADXA Rock 5B > >> >> >> > > >> >> >> > Signed-off-by: Alexey Charkov <alchark@gmail.com> > >> >> >> > --- > >> >> >> > + trips { > >> >> >> > + threshold: trip-point-0 { > >> >> >> > >> >> >> It should be better to name it cpu_alert0 instead, because that's what > >> >> >> other newer dtsi files already use. > >> >> > > >> >> > Reflecting on your comments here and below, I'm thinking that maybe it > >> >> > would be better to define only the critical trip point for the SoC > >> >> > overall, and then have alerts along with the respective cooling maps > >> >> > separately for A76-0,1, A76-2,3, A55-0,1,2,3? After all, given that we > >> >> > have more granular temperature measurement here than in previous RK > >> >> > chipsets it might be better to only throttle the "offending" cores, > >> >> > not the full package. > >> >> > > >> >> > What do you think? > >> >> > > >> >> > Downstream DT doesn't follow this approach though, so maybe there's > >> >> > something I'm missing here. > >> >> > >> >> I agree, it's better to fully utilize the higher measurement > >> >> granularity > >> >> made possible by having multiple temperature sensors available. > >> >> > >> >> I also agree that we should have only the critical trip defined for > >> >> the > >> >> package-level temperature sensor. Let's have the separate temperature > >> >> measurements for the CPU (sub)clusters do the thermal throttling, and > >> >> let's keep the package-level measurement for the critical shutdowns > >> >> only. IIRC, some MediaTek SoC dtsi already does exactly that. > >> >> > >> >> Of course, there are no reasons not to have the critical trips defined > >> >> for the CPU (sub)clusters as well. > >> > > >> > I think I'll also add a board-specific active cooling mechanism on the > >> > package level in the next iteration, given that Rock 5B has a PWM fan > >> > defined as a cooling device. That will go in the separate patch that > >> > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also > >> > duly noted, thank you!) > >> > >> Great, thanks. Sure, making use of the Rock 5B's support for > >> attaching > >> a PWM-controlled cooling fan is the way to go. > >> > >> Just to reiterate a bit, any "active" trip points belong to the board > >> dts file(s), because having a cooling fan is a board-specific feature. > >> As a note, you may also want to have a look at the RockPro64 dts(i) > >> files, for example; the RockPro64 also comes with a cooling fan > >> connector and the associated PWM fan control logic. > > > > Thanks for the pointer! There is also a helpful doc within devicetree > > bindings descriptions, although it sits under hwmon which was a bit > > confusing to me. I've already tested it locally (by adding to the > > board dts), and it spins up and down quite nicely, and even modulates > > the fan speed swiftly when the load changes - yay! > > Nice! Also, isn't it like magic? :) To me, turning LEDs on/off and > controlling fans acts as some kind of a "bridge" between the virtual > and the real world. :) Oh yes! I also keep admiring how one can add just a couple of lines of text here and there that's not even real code, and the whole kernel machinery starts crunching numbers, analyzing temperatures, running PID loops, etc etc so that I could enjoy the satisfying whistle of a small fan when I type `make -j8` :-D > As a suggestion, it would be good to test with a couple of different > fans, to make sure that the PWM values work well for more that one fan > model. The Rock 5B requires a 5 V fan, if I'm not mistaken? It is 5V, yes. I only have one fan to try though, and I simply relied on the PWM values that are already defined in the upstream rk3588-rock-5b.dts. They don't look ideal for my particular fan, because the lowest non-zero cooling state currently uses a PWM value of 95, which doesn't always make it spin up. But in the end it doesn't seem to matter that much, because that tiny fan needs to spin at full 255 whenever all eight cores are loaded (and even then it can only balance the temperature at around 60.5С), and when the load is lighter (such as during various ./configure runs) it just switches off completely as the temperature goes down to 46C even with the fan not spinning. I don't currently use the GPU/NPU/VPU though - maybe those would produce more moderate load which could benefit from spinning the fan at medium speeds. Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 7:36 ` Alexey Charkov @ 2024-01-22 7:57 ` Dragan Simic 2024-01-22 14:20 ` Alexey Charkov 0 siblings, 1 reply; 21+ messages in thread From: Dragan Simic @ 2024-01-22 7:57 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 2024-01-22 08:36, Alexey Charkov wrote: > On Mon, Jan 22, 2024 at 10:22 AM Dragan Simic <dsimic@manjaro.org> > wrote: >> On 2024-01-22 07:03, Alexey Charkov wrote: >> > On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@manjaro.org> wrote: >> >> On 2024-01-21 19:56, Alexey Charkov wrote: >> >> > I think I'll also add a board-specific active cooling mechanism on the >> >> > package level in the next iteration, given that Rock 5B has a PWM fan >> >> > defined as a cooling device. That will go in the separate patch that >> >> > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also >> >> > duly noted, thank you!) >> >> >> >> Great, thanks. Sure, making use of the Rock 5B's support for attaching >> >> a PWM-controlled cooling fan is the way to go. >> >> >> >> Just to reiterate a bit, any "active" trip points belong to the board >> >> dts file(s), because having a cooling fan is a board-specific feature. >> >> As a note, you may also want to have a look at the RockPro64 dts(i) >> >> files, for example; the RockPro64 also comes with a cooling fan >> >> connector and the associated PWM fan control logic. >> > >> > Thanks for the pointer! There is also a helpful doc within devicetree >> > bindings descriptions, although it sits under hwmon which was a bit >> > confusing to me. I've already tested it locally (by adding to the >> > board dts), and it spins up and down quite nicely, and even modulates >> > the fan speed swiftly when the load changes - yay! >> >> Nice! Also, isn't it like magic? :) To me, turning LEDs on/off and >> controlling fans acts as some kind of a "bridge" between the virtual >> and the real world. :) > > Oh yes! I also keep admiring how one can add just a couple of lines of > text here and there that's not even real code, and the whole kernel > machinery starts crunching numbers, analyzing temperatures, running > PID loops, etc etc so that I could enjoy the satisfying whistle of a > small fan when I type `make -j8` :-D Yes, it's very satisfying, :) and it also demonstrates the true power of the device trees as hardware definitions. Just a few more lines and the cooling works! :) >> As a suggestion, it would be good to test with a couple of different >> fans, to make sure that the PWM values work well for more that one fan >> model. The Rock 5B requires a 5 V fan, if I'm not mistaken? > > It is 5V, yes. I only have one fan to try though, and I simply relied > on the PWM values that are already defined in the upstream > rk3588-rock-5b.dts. They don't look ideal for my particular fan, > because the lowest non-zero cooling state currently uses a PWM value > of 95, which doesn't always make it spin up. But in the end it doesn't > seem to matter that much, because that tiny fan needs to spin at full > 255 whenever all eight cores are loaded (and even then it can only > balance the temperature at around 60.5С), and when the load is lighter > (such as during various ./configure runs) it just switches off > completely as the temperature goes down to 46C even with the fan not > spinning. I see, 5 V fans unfortunately aren't very common. I'm not sure why Radxa opted for 5 V there; maybe the goal was to use Raspberry Pi 5 V fans, but using those tiny fans doesn't make much sense, IMHO. I think you can freely adjust the PWM values a bit to make your fan start reliably at the lowest state, regardless of how rarely that state will be used. See, if your fan doesn't spin up reliably with the current lowest state, chances for other fan models not to spin up are quite high. IOW, it's better to play safe there, if you agree. What kind of heatsink are you using with your Rock 5B? Ah yes, and what's the actual model of the fan you're using? > I don't currently use the GPU/NPU/VPU though - maybe those would > produce more moderate load which could benefit from spinning the fan > at medium speeds. Perhaps, but it will need to be tested at some point. Have you tried loading only one or two CPU cores? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 7:57 ` Dragan Simic @ 2024-01-22 14:20 ` Alexey Charkov 2024-01-22 17:33 ` Dragan Simic 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-22 14:20 UTC (permalink / raw) To: Dragan Simic Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Mon, Jan 22, 2024 at 11:57 AM Dragan Simic <dsimic@manjaro.org> wrote: > > On 2024-01-22 08:36, Alexey Charkov wrote: > > On Mon, Jan 22, 2024 at 10:22 AM Dragan Simic <dsimic@manjaro.org> > > wrote: > >> On 2024-01-22 07:03, Alexey Charkov wrote: > >> > On Mon, Jan 22, 2024 at 8:55 AM Dragan Simic <dsimic@manjaro.org> wrote: > >> >> On 2024-01-21 19:56, Alexey Charkov wrote: > >> >> > I think I'll also add a board-specific active cooling mechanism on the > >> >> > package level in the next iteration, given that Rock 5B has a PWM fan > >> >> > defined as a cooling device. That will go in the separate patch that > >> >> > updates rk3588-rock-5b.dts (your feedback to v2 of this patch is also > >> >> > duly noted, thank you!) > >> >> > >> >> Great, thanks. Sure, making use of the Rock 5B's support for attaching > >> >> a PWM-controlled cooling fan is the way to go. > >> >> > >> >> Just to reiterate a bit, any "active" trip points belong to the board > >> >> dts file(s), because having a cooling fan is a board-specific feature. > >> >> As a note, you may also want to have a look at the RockPro64 dts(i) > >> >> files, for example; the RockPro64 also comes with a cooling fan > >> >> connector and the associated PWM fan control logic. > >> > > >> > Thanks for the pointer! There is also a helpful doc within devicetree > >> > bindings descriptions, although it sits under hwmon which was a bit > >> > confusing to me. I've already tested it locally (by adding to the > >> > board dts), and it spins up and down quite nicely, and even modulates > >> > the fan speed swiftly when the load changes - yay! > >> > >> Nice! Also, isn't it like magic? :) To me, turning LEDs on/off and > >> controlling fans acts as some kind of a "bridge" between the virtual > >> and the real world. :) > > > > Oh yes! I also keep admiring how one can add just a couple of lines of > > text here and there that's not even real code, and the whole kernel > > machinery starts crunching numbers, analyzing temperatures, running > > PID loops, etc etc so that I could enjoy the satisfying whistle of a > > small fan when I type `make -j8` :-D > > Yes, it's very satisfying, :) and it also demonstrates the true power > of the device trees as hardware definitions. Just a few more lines and > the cooling works! :) > > >> As a suggestion, it would be good to test with a couple of different > >> fans, to make sure that the PWM values work well for more that one fan > >> model. The Rock 5B requires a 5 V fan, if I'm not mistaken? > > > > It is 5V, yes. I only have one fan to try though, and I simply relied > > on the PWM values that are already defined in the upstream > > rk3588-rock-5b.dts. They don't look ideal for my particular fan, > > because the lowest non-zero cooling state currently uses a PWM value > > of 95, which doesn't always make it spin up. But in the end it doesn't > > seem to matter that much, because that tiny fan needs to spin at full > > 255 whenever all eight cores are loaded (and even then it can only > > balance the temperature at around 60.5С), and when the load is lighter > > (such as during various ./configure runs) it just switches off > > completely as the temperature goes down to 46C even with the fan not > > spinning. > > I see, 5 V fans unfortunately aren't very common. I'm not sure why > Radxa opted for 5 V there; maybe the goal was to use Raspberry Pi 5 V > fans, but using those tiny fans doesn't make much sense, IMHO. > > I think you can freely adjust the PWM values a bit to make your fan > start reliably at the lowest state, regardless of how rarely that state > will be used. See, if your fan doesn't spin up reliably with the > current > lowest state, chances for other fan models not to spin up are quite > high. > IOW, it's better to play safe there, if you agree. > > What kind of heatsink are you using with your Rock 5B? Ah yes, and > what's the actual model of the fan you're using? I use Radxa's 4012 heatsink-fan assembly that comes as an add-on option when buying the board itself from Allnet. I guess I'll include slightly adjusted PWM values in the rk3588-rock-5b.dts patch to better represent my fan's "preferred" range (in my experience a PWM value of around 120 is the reliable lower end - it would continue spinning below that point but won't always start without being pushed manually) > > I don't currently use the GPU/NPU/VPU though - maybe those would > > produce more moderate load which could benefit from spinning the fan > > at medium speeds. > > Perhaps, but it will need to be tested at some point. Have you tried > loading only one or two CPU cores? I do see the full range of PWM values being used, including intermediate ones. It doesn't go zero to hero :) My point was more about the default fan not being super mighty vs. the full package thermal output, which will likely mean that intermediate values are rarely used. But I'll double check with more varied loads to make sure it behaves in a sensible way (especially given that I'll be testing purely interrupt-driven operation per Daniel's guidance in the other sub-thread). Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 14:20 ` Alexey Charkov @ 2024-01-22 17:33 ` Dragan Simic 0 siblings, 0 replies; 21+ messages in thread From: Dragan Simic @ 2024-01-22 17:33 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 2024-01-22 15:20, Alexey Charkov wrote: > On Mon, Jan 22, 2024 at 11:57 AM Dragan Simic <dsimic@manjaro.org> > wrote: >> On 2024-01-22 08:36, Alexey Charkov wrote: >> > On Mon, Jan 22, 2024 at 10:22 AM Dragan Simic <dsimic@manjaro.org> wrote: >> >> As a suggestion, it would be good to test with a couple of different >> >> fans, to make sure that the PWM values work well for more that one fan >> >> model. The Rock 5B requires a 5 V fan, if I'm not mistaken? >> > >> > It is 5V, yes. I only have one fan to try though, and I simply relied >> > on the PWM values that are already defined in the upstream >> > rk3588-rock-5b.dts. They don't look ideal for my particular fan, >> > because the lowest non-zero cooling state currently uses a PWM value >> > of 95, which doesn't always make it spin up. But in the end it doesn't >> > seem to matter that much, because that tiny fan needs to spin at full >> > 255 whenever all eight cores are loaded (and even then it can only >> > balance the temperature at around 60.5С), and when the load is lighter >> > (such as during various ./configure runs) it just switches off >> > completely as the temperature goes down to 46C even with the fan not >> > spinning. >> >> I see, 5 V fans unfortunately aren't very common. I'm not sure why >> Radxa opted for 5 V there; maybe the goal was to use Raspberry Pi 5 V >> fans, but using those tiny fans doesn't make much sense, IMHO. >> >> I think you can freely adjust the PWM values a bit to make your fan >> start reliably at the lowest state, regardless of how rarely that >> state >> will be used. See, if your fan doesn't spin up reliably with the >> current >> lowest state, chances for other fan models not to spin up are quite >> high. IOW, it's better to play safe there, if you agree. >> >> What kind of heatsink are you using with your Rock 5B? Ah yes, and >> what's the actual model of the fan you're using? > > I use Radxa's 4012 heatsink-fan assembly that comes as an add-on > option when buying the board itself from Allnet. I guess I'll include > slightly adjusted PWM values in the rk3588-rock-5b.dts patch to better > represent my fan's "preferred" range (in my experience a PWM value of > around 120 is the reliable lower end - it would continue spinning > below that point but won't always start without being pushed manually) Sure, just go ahead. You're using an official active cooling solution for the Rock 5B, so adjusting the PWM values a bit to make the fan start up reliably is even more warranted. >> > I don't currently use the GPU/NPU/VPU though - maybe those would >> > produce more moderate load which could benefit from spinning the fan >> > at medium speeds. >> >> Perhaps, but it will need to be tested at some point. Have you tried >> loading only one or two CPU cores? > > I do see the full range of PWM values being used, including > intermediate ones. It doesn't go zero to hero :) My point was more > about the default fan not being super mighty vs. the full package > thermal output, which will likely mean that intermediate values are > rarely used. But I'll double check with more varied loads to make sure > it behaves in a sensible way (especially given that I'll be testing > purely interrupt-driven operation per Daniel's guidance in the other > sub-thread). I see, thanks for the clarification. :) The 4012 fan and heatsink seem rather tiny; [1] a more beefy assembly, with more thermal mass, larger fin surface and a fan that pushes a bit more air across its RPM range, would probably result in a noticeably broader use of the intermediate PWM values. [1] https://shop.allnetchina.cn/products/active-heat-sink-for-visionfive-sbc _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-06 22:23 [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 Alexey Charkov 2024-01-06 22:54 ` Dragan Simic @ 2024-01-09 19:19 ` Alexey Charkov 2024-01-18 19:20 ` Dragan Simic ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread From: Alexey Charkov @ 2024-01-09 19:19 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Alexey Charkov, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Include thermal zones information in device tree for rk3588 variants and enable the built-in thermal sensing ADC on RADXA Rock 5B Signed-off-by: Alexey Charkov <alchark@gmail.com> --- Changes in v2: - Dropped redundant comments - Included all CPU cores in cooling maps - Split cooling maps into more granular ones utilizing TSADC channels 1-3 which measure temperature by separate CPU clusters instead of channel 0 which measures the center of the SoC die --- .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ 2 files changed, 155 insertions(+) diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index a5a104131403..f9d540000de3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts @@ -772,3 +772,7 @@ &usb_host1_ehci { &usb_host1_ohci { status = "okay"; }; + +&tsadc { + status = "okay"; +}; diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi index 8aa0499f9b03..8d54998d0ecc 100644 --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi @@ -10,6 +10,7 @@ #include <dt-bindings/reset/rockchip,rk3588-cru.h> #include <dt-bindings/phy/phy.h> #include <dt-bindings/ata/ahci.h> +#include <dt-bindings/thermal/thermal.h> / { compatible = "rockchip,rk3588"; @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { status = "disabled"; }; + thermal_zones: thermal-zones { + /* sensor near the center of the whole chip */ + soc_thermal: soc-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + sustainable-power = <2100>; + thermal-sensors = <&tsadc 0>; + + trips { + soc_crit: soc-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + /* sensor between A76 cores 0 and 1 */ + bigcore0_thermal: bigcore0-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 1>; + + trips { + bigcore0_alert: bigcore0-alert { + temperature = <85000>; + hysteresis = <2000>; + type = "passive"; + }; + bigcore0_crit: bigcore0-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + cooling-maps { + map0 { + trip = <&bigcore0_alert>; + cooling-device = + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + }; + }; + + /* sensor between A76 cores 2 and 3 */ + bigcore2_thermal: bigcore2-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 2>; + + trips { + bigcore2_alert: bigcore2-alert { + temperature = <85000>; + hysteresis = <2000>; + type = "passive"; + }; + bigcore2_crit: bigcore2-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + cooling-maps { + map1 { + trip = <&bigcore2_alert>; + cooling-device = + <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + }; + }; + + /* sensor between the four A55 cores */ + little_core_thermal: littlecore-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 3>; + + trips { + littlecore_alert: littlecore-alert { + temperature = <85000>; + hysteresis = <2000>; + type = "passive"; + }; + littlecore_crit: littlecore-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + cooling-maps { + map2 { + trip = <&littlecore_alert>; + cooling-device = + <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + contribution = <1024>; + }; + }; + }; + + /* sensor near the PD_CENTER power domain */ + center_thermal: center-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 4>; + + trips { + center_crit: center-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + gpu_thermal: gpu-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 5>; + + trips { + gpu_crit: gpu-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + + npu_thermal: npu-thermal { + polling-delay-passive = <20>; + polling-delay = <1000>; + thermal-sensors = <&tsadc 6>; + + trips { + npu_crit: npu-crit { + temperature = <115000>; + hysteresis = <2000>; + type = "critical"; + }; + }; + }; + }; + saradc: adc@fec10000 { compatible = "rockchip,rk3588-saradc"; reg = <0x0 0xfec10000 0x0 0x10000>; -- 2.43.0 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov @ 2024-01-18 19:20 ` Dragan Simic 2024-01-19 13:15 ` Heiko Stübner 2024-01-19 16:21 ` Daniel Lezcano 2 siblings, 0 replies; 21+ messages in thread From: Dragan Simic @ 2024-01-18 19:20 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 2024-01-09 20:19, Alexey Charkov wrote: > Include thermal zones information in device tree for rk3588 variants > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > --- > Changes in v2: > - Dropped redundant comments > - Included all CPU cores in cooling maps > - Split cooling maps into more granular ones utilizing TSADC > channels 1-3 which measure temperature by separate CPU clusters > instead of channel 0 which measures the center of the SoC die > --- > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > 2 files changed, 155 insertions(+) > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > index a5a104131403..f9d540000de3 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > @@ -772,3 +772,7 @@ &usb_host1_ehci { > &usb_host1_ohci { > status = "okay"; > }; > + > +&tsadc { > + status = "okay"; > +}; I keep forgetting to note that enabling it for the Rock 5B should be performed in a separate patch. > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > index 8aa0499f9b03..8d54998d0ecc 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > @@ -10,6 +10,7 @@ > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > #include <dt-bindings/phy/phy.h> > #include <dt-bindings/ata/ahci.h> > +#include <dt-bindings/thermal/thermal.h> > > / { > compatible = "rockchip,rk3588"; > @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > status = "disabled"; > }; > > + thermal_zones: thermal-zones { > + /* sensor near the center of the whole chip */ > + soc_thermal: soc-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + sustainable-power = <2100>; > + thermal-sensors = <&tsadc 0>; > + > + trips { > + soc_crit: soc-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; As already noted in my previous response, perhaps it whould be better to name it package_thermal instead. That way, it should be more self descriptive. > + /* sensor between A76 cores 0 and 1 */ > + bigcore0_thermal: bigcore0-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 1>; > + > + trips { > + bigcore0_alert: bigcore0-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore0_crit: bigcore0-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; As already noted in my previous message, perhaps another trip, of the "hot" type, should be added here. > + cooling-maps { > + map0 { > + trip = <&bigcore0_alert>; > + cooling-device = > + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor between A76 cores 2 and 3 */ > + bigcore2_thermal: bigcore2-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 2>; > + > + trips { > + bigcore2_alert: bigcore2-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore2_crit: bigcore2-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; The same suggestion about one more "hot" trip applies here as well. > + cooling-maps { > + map1 { > + trip = <&bigcore2_alert>; > + cooling-device = > + <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor between the four A55 cores */ > + little_core_thermal: littlecore-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 3>; > + > + trips { > + littlecore_alert: littlecore-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + littlecore_crit: littlecore-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; The same suggestion about one more "hot" trip applies here as well. > + cooling-maps { > + map2 { > + trip = <&littlecore_alert>; > + cooling-device = > + <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor near the PD_CENTER power domain */ > + center_thermal: center-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 4>; > + > + trips { > + center_crit: center-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + gpu_thermal: gpu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 5>; > + > + trips { > + gpu_crit: gpu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + npu_thermal: npu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 6>; > + > + trips { > + npu_crit: npu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + }; > + > saradc: adc@fec10000 { > compatible = "rockchip,rk3588-saradc"; > reg = <0x0 0xfec10000 0x0 0x10000>; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov 2024-01-18 19:20 ` Dragan Simic @ 2024-01-19 13:15 ` Heiko Stübner 2024-01-19 16:21 ` Daniel Lezcano 2 siblings, 0 replies; 21+ messages in thread From: Heiko Stübner @ 2024-01-19 13:15 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Alexey Charkov, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel, Alexey Charkov Am Dienstag, 9. Januar 2024, 20:19:47 CET schrieb Alexey Charkov: > Include thermal zones information in device tree for rk3588 variants > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > --- > Changes in v2: > - Dropped redundant comments > - Included all CPU cores in cooling maps > - Split cooling maps into more granular ones utilizing TSADC > channels 1-3 which measure temperature by separate CPU clusters > instead of channel 0 which measures the center of the SoC die all of what Dragan wrote and additionally, please don't post v2 patches as reply to earlier versions. It confuses tooling like "b4" when trying to retrieve patches. Thanks Heiko > --- > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > 2 files changed, 155 insertions(+) > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > index a5a104131403..f9d540000de3 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > @@ -772,3 +772,7 @@ &usb_host1_ehci { > &usb_host1_ohci { > status = "okay"; > }; > + > +&tsadc { > + status = "okay"; > +}; > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > index 8aa0499f9b03..8d54998d0ecc 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > @@ -10,6 +10,7 @@ > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > #include <dt-bindings/phy/phy.h> > #include <dt-bindings/ata/ahci.h> > +#include <dt-bindings/thermal/thermal.h> > > / { > compatible = "rockchip,rk3588"; > @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > status = "disabled"; > }; > > + thermal_zones: thermal-zones { > + /* sensor near the center of the whole chip */ > + soc_thermal: soc-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + sustainable-power = <2100>; > + thermal-sensors = <&tsadc 0>; > + > + trips { > + soc_crit: soc-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + /* sensor between A76 cores 0 and 1 */ > + bigcore0_thermal: bigcore0-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 1>; > + > + trips { > + bigcore0_alert: bigcore0-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore0_crit: bigcore0-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map0 { > + trip = <&bigcore0_alert>; > + cooling-device = > + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor between A76 cores 2 and 3 */ > + bigcore2_thermal: bigcore2-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 2>; > + > + trips { > + bigcore2_alert: bigcore2-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore2_crit: bigcore2-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map1 { > + trip = <&bigcore2_alert>; > + cooling-device = > + <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor between the four A55 cores */ > + little_core_thermal: littlecore-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 3>; > + > + trips { > + littlecore_alert: littlecore-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + littlecore_crit: littlecore-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map2 { > + trip = <&littlecore_alert>; > + cooling-device = > + <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor near the PD_CENTER power domain */ > + center_thermal: center-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 4>; > + > + trips { > + center_crit: center-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + gpu_thermal: gpu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 5>; > + > + trips { > + gpu_crit: gpu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + npu_thermal: npu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; > + thermal-sensors = <&tsadc 6>; > + > + trips { > + npu_crit: npu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + }; > + > saradc: adc@fec10000 { > compatible = "rockchip,rk3588-saradc"; > reg = <0x0 0xfec10000 0x0 0x10000>; > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov 2024-01-18 19:20 ` Dragan Simic 2024-01-19 13:15 ` Heiko Stübner @ 2024-01-19 16:21 ` Daniel Lezcano 2024-01-21 19:57 ` Alexey Charkov 2 siblings, 1 reply; 21+ messages in thread From: Daniel Lezcano @ 2024-01-19 16:21 UTC (permalink / raw) To: Alexey Charkov, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 09/01/2024 20:19, Alexey Charkov wrote: > Include thermal zones information in device tree for rk3588 variants > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > --- > Changes in v2: > - Dropped redundant comments > - Included all CPU cores in cooling maps > - Split cooling maps into more granular ones utilizing TSADC > channels 1-3 which measure temperature by separate CPU clusters > instead of channel 0 which measures the center of the SoC die > --- > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > 2 files changed, 155 insertions(+) > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > index a5a104131403..f9d540000de3 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > @@ -772,3 +772,7 @@ &usb_host1_ehci { > &usb_host1_ohci { > status = "okay"; > }; > + > +&tsadc { > + status = "okay"; > +}; > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > index 8aa0499f9b03..8d54998d0ecc 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > @@ -10,6 +10,7 @@ > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > #include <dt-bindings/phy/phy.h> > #include <dt-bindings/ata/ahci.h> > +#include <dt-bindings/thermal/thermal.h> > > / { > compatible = "rockchip,rk3588"; > @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > status = "disabled"; > }; > > + thermal_zones: thermal-zones { > + /* sensor near the center of the whole chip */ > + soc_thermal: soc-thermal { > + polling-delay-passive = <20>; There is no mitigation set for this thermal zone. It is pointless to specify a passive polling. > + polling-delay = <1000>; The driver is interrupt driven. No need to poll. > + sustainable-power = <2100>; There is no mitigation with this thermal zone. Specifying a sustainable power does not make sense. > + thermal-sensors = <&tsadc 0>; > + > + trips { > + soc_crit: soc-crit { > + temperature = <115000>; > + hysteresis = <2000>; This trip point leads to a system shutdown / reboot. It is not necessary to specify a hysteresis. > + type = "critical"; > + }; > + }; > + }; > + > + /* sensor between A76 cores 0 and 1 */ > + bigcore0_thermal: bigcore0-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; The driver is interrupt driven. No need to poll. > + thermal-sensors = <&tsadc 1>; > + > + trips { > + bigcore0_alert: bigcore0-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore0_crit: bigcore0-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map0 { > + trip = <&bigcore0_alert>; > + cooling-device = > + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; If you specify the contribution, that means it is expected to use the IPA governor. However, this one needs an extra trip point before 'alert' to begin collecting temperatures in order to initialize the PID loop of the IPA. > + }; > + }; > + }; > + > + /* sensor between A76 cores 2 and 3 */ > + bigcore2_thermal: bigcore2-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; The driver is interrupt driven. No need to poll. > + thermal-sensors = <&tsadc 2>; > + > + trips { > + bigcore2_alert: bigcore2-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + bigcore2_crit: bigcore2-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map1 { > + trip = <&bigcore2_alert>; > + cooling-device = > + <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor between the four A55 cores */ > + little_core_thermal: littlecore-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; The driver is interrupt driven. No need to poll. > + thermal-sensors = <&tsadc 3>; > + > + trips { > + littlecore_alert: littlecore-alert { > + temperature = <85000>; > + hysteresis = <2000>; > + type = "passive"; > + }; > + littlecore_crit: littlecore-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + cooling-maps { > + map2 { > + trip = <&littlecore_alert>; > + cooling-device = > + <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > + <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + contribution = <1024>; > + }; > + }; > + }; > + > + /* sensor near the PD_CENTER power domain */ > + center_thermal: center-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; Same comment as above for "soc-thermal" > + thermal-sensors = <&tsadc 4>; > + > + trips { > + center_crit: center-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + gpu_thermal: gpu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; Same comment as above for "soc-thermal" > + thermal-sensors = <&tsadc 5>; > + > + trips { > + gpu_crit: gpu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + > + npu_thermal: npu-thermal { > + polling-delay-passive = <20>; > + polling-delay = <1000>; Same comment as above for "soc-thermal" > + thermal-sensors = <&tsadc 6>; > + > + trips { > + npu_crit: npu-crit { > + temperature = <115000>; > + hysteresis = <2000>; > + type = "critical"; > + }; > + }; > + }; > + }; > + > saradc: adc@fec10000 { > compatible = "rockchip,rk3588-saradc"; > reg = <0x0 0xfec10000 0x0 0x10000>; -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-19 16:21 ` Daniel Lezcano @ 2024-01-21 19:57 ` Alexey Charkov 2024-01-22 0:04 ` Daniel Lezcano 0 siblings, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-21 19:57 UTC (permalink / raw) To: Daniel Lezcano Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: Hello Daniel, Thanks a lot for your review and comments! Please see some reflections below. > On 09/01/2024 20:19, Alexey Charkov wrote: > > Include thermal zones information in device tree for rk3588 variants > > and enable the built-in thermal sensing ADC on RADXA Rock 5B > > > > Signed-off-by: Alexey Charkov <alchark@gmail.com> > > --- > > Changes in v2: > > - Dropped redundant comments > > - Included all CPU cores in cooling maps > > - Split cooling maps into more granular ones utilizing TSADC > > channels 1-3 which measure temperature by separate CPU clusters > > instead of channel 0 which measures the center of the SoC die > > --- > > .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > > 2 files changed, 155 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > index a5a104131403..f9d540000de3 100644 > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > > @@ -772,3 +772,7 @@ &usb_host1_ehci { > > &usb_host1_ohci { > > status = "okay"; > > }; > > + > > +&tsadc { > > + status = "okay"; > > +}; > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > index 8aa0499f9b03..8d54998d0ecc 100644 > > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > > @@ -10,6 +10,7 @@ > > #include <dt-bindings/reset/rockchip,rk3588-cru.h> > > #include <dt-bindings/phy/phy.h> > > #include <dt-bindings/ata/ahci.h> > > +#include <dt-bindings/thermal/thermal.h> > > > > / { > > compatible = "rockchip,rk3588"; > > @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > > status = "disabled"; > > }; > > > > + thermal_zones: thermal-zones { > > + /* sensor near the center of the whole chip */ > > + soc_thermal: soc-thermal { > > + polling-delay-passive = <20>; > > There is no mitigation set for this thermal zone. It is pointless to > specify a passive polling. Indeed, it makes sense to me. There seems to be a catch though in that the driver calls the generic thermal_of_zone_register during the initial probe, which expects both of those polling delays to be present in the device tree, otherwise it simply refuses to add the respective thermal zone, see drivers/thermal/thermal_of.c:502 > > + polling-delay = <1000>; > > The driver is interrupt driven. No need to poll. Same here as above > > + sustainable-power = <2100>; > > There is no mitigation with this thermal zone. Specifying a sustainable > power does not make sense. Thanks, will drop this in v3! > > + thermal-sensors = <&tsadc 0>; > > + > > + trips { > > + soc_crit: soc-crit { > > + temperature = <115000>; > > + hysteresis = <2000>; > > This trip point leads to a system shutdown / reboot. It is not necessary > to specify a hysteresis. Similar to the above, the generic thermal_of code refuses to add the trip point if it has no hysteresis property defined (regardless of the trip type), see drivers/thermal/thermal_of.c:109 > > + type = "critical"; > > + }; > > + }; > > + }; > > + > > + /* sensor between A76 cores 0 and 1 */ > > + bigcore0_thermal: bigcore0-thermal { > > + polling-delay-passive = <20>; > > + polling-delay = <1000>; > > The driver is interrupt driven. No need to poll. > > > + thermal-sensors = <&tsadc 1>; > > + > > + trips { > > + bigcore0_alert: bigcore0-alert { > > + temperature = <85000>; > > + hysteresis = <2000>; > > + type = "passive"; > > + }; > > + bigcore0_crit: bigcore0-crit { > > + temperature = <115000>; > > + hysteresis = <2000>; > > + type = "critical"; > > + }; > > + }; > > + cooling-maps { > > + map0 { > > + trip = <&bigcore0_alert>; > > + cooling-device = > > + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > + contribution = <1024>; > > If you specify the contribution, that means it is expected to use the > IPA governor. However, this one needs an extra trip point before 'alert' > to begin collecting temperatures in order to initialize the PID loop of > the IPA. Thank you! Will add extra passive cooling trip points at 75C to all three CPU clusters. Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-21 19:57 ` Alexey Charkov @ 2024-01-22 0:04 ` Daniel Lezcano 2024-01-22 5:57 ` Alexey Charkov 2024-01-23 19:47 ` Alexey Charkov 0 siblings, 2 replies; 21+ messages in thread From: Daniel Lezcano @ 2024-01-22 0:04 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel Hi Alexey, On 21/01/2024 20:57, Alexey Charkov wrote: > On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: > Hello Daniel, > > Thanks a lot for your review and comments! Please see some reflections below. > >> On 09/01/2024 20:19, Alexey Charkov wrote: >>> Include thermal zones information in device tree for rk3588 variants >>> and enable the built-in thermal sensing ADC on RADXA Rock 5B >>> >>> Signed-off-by: Alexey Charkov <alchark@gmail.com> >>> --- >>> Changes in v2: >>> - Dropped redundant comments >>> - Included all CPU cores in cooling maps >>> - Split cooling maps into more granular ones utilizing TSADC >>> channels 1-3 which measure temperature by separate CPU clusters >>> instead of channel 0 which measures the center of the SoC die >>> --- >>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + >>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ >>> 2 files changed, 155 insertions(+) >>> >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>> index a5a104131403..f9d540000de3 100644 >>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>> @@ -772,3 +772,7 @@ &usb_host1_ehci { >>> &usb_host1_ohci { >>> status = "okay"; >>> }; >>> + >>> +&tsadc { >>> + status = "okay"; >>> +}; >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>> index 8aa0499f9b03..8d54998d0ecc 100644 >>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>> @@ -10,6 +10,7 @@ >>> #include <dt-bindings/reset/rockchip,rk3588-cru.h> >>> #include <dt-bindings/phy/phy.h> >>> #include <dt-bindings/ata/ahci.h> >>> +#include <dt-bindings/thermal/thermal.h> >>> >>> / { >>> compatible = "rockchip,rk3588"; >>> @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { >>> status = "disabled"; >>> }; >>> >>> + thermal_zones: thermal-zones { >>> + /* sensor near the center of the whole chip */ >>> + soc_thermal: soc-thermal { >>> + polling-delay-passive = <20>; >> >> There is no mitigation set for this thermal zone. It is pointless to >> specify a passive polling. > > Indeed, it makes sense to me. There seems to be a catch though in that > the driver calls the generic thermal_of_zone_register during the > initial probe, which expects both of those polling delays to be > present in the device tree, otherwise it simply refuses to add the > respective thermal zone, see drivers/thermal/thermal_of.c:502 Usually: polling-delay-passive = <0>; polling-delay = <0>; cf: git grep "polling-delay = <0>" arch/arm64/boot/dts >>> + polling-delay = <1000>; >> >> The driver is interrupt driven. No need to poll. > > Same here as above > >>> + sustainable-power = <2100>; >> >> There is no mitigation with this thermal zone. Specifying a sustainable >> power does not make sense. > > Thanks, will drop this in v3! > >>> + thermal-sensors = <&tsadc 0>; >>> + >>> + trips { >>> + soc_crit: soc-crit { >>> + temperature = <115000>; >>> + hysteresis = <2000>; >> >> This trip point leads to a system shutdown / reboot. It is not necessary >> to specify a hysteresis. > > Similar to the above, the generic thermal_of code refuses to add the > trip point if it has no hysteresis property defined (regardless of the > trip type), see drivers/thermal/thermal_of.c:109 hysteresis = <0>; -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 0:04 ` Daniel Lezcano @ 2024-01-22 5:57 ` Alexey Charkov 2024-01-23 19:47 ` Alexey Charkov 1 sibling, 0 replies; 21+ messages in thread From: Alexey Charkov @ 2024-01-22 5:57 UTC (permalink / raw) To: Daniel Lezcano Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Mon, Jan 22, 2024 at 4:04 AM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Hi Alexey, > > > On 21/01/2024 20:57, Alexey Charkov wrote: > > On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano > > <daniel.lezcano@linaro.org> wrote: > > Hello Daniel, > > > > Thanks a lot for your review and comments! Please see some reflections below. > > > >> On 09/01/2024 20:19, Alexey Charkov wrote: > >>> Include thermal zones information in device tree for rk3588 variants > >>> and enable the built-in thermal sensing ADC on RADXA Rock 5B > >>> > >>> Signed-off-by: Alexey Charkov <alchark@gmail.com> > >>> --- > >>> Changes in v2: > >>> - Dropped redundant comments > >>> - Included all CPU cores in cooling maps > >>> - Split cooling maps into more granular ones utilizing TSADC > >>> channels 1-3 which measure temperature by separate CPU clusters > >>> instead of channel 0 which measures the center of the SoC die > >>> --- > >>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > >>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > >>> 2 files changed, 155 insertions(+) > >>> > >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> index a5a104131403..f9d540000de3 100644 > >>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> @@ -772,3 +772,7 @@ &usb_host1_ehci { > >>> &usb_host1_ohci { > >>> status = "okay"; > >>> }; > >>> + > >>> +&tsadc { > >>> + status = "okay"; > >>> +}; > >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> index 8aa0499f9b03..8d54998d0ecc 100644 > >>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> @@ -10,6 +10,7 @@ > >>> #include <dt-bindings/reset/rockchip,rk3588-cru.h> > >>> #include <dt-bindings/phy/phy.h> > >>> #include <dt-bindings/ata/ahci.h> > >>> +#include <dt-bindings/thermal/thermal.h> > >>> > >>> / { > >>> compatible = "rockchip,rk3588"; > >>> @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > >>> status = "disabled"; > >>> }; > >>> > >>> + thermal_zones: thermal-zones { > >>> + /* sensor near the center of the whole chip */ > >>> + soc_thermal: soc-thermal { > >>> + polling-delay-passive = <20>; > >> > >> There is no mitigation set for this thermal zone. It is pointless to > >> specify a passive polling. > > > > Indeed, it makes sense to me. There seems to be a catch though in that > > the driver calls the generic thermal_of_zone_register during the > > initial probe, which expects both of those polling delays to be > > present in the device tree, otherwise it simply refuses to add the > > respective thermal zone, see drivers/thermal/thermal_of.c:502 > > Usually: > > polling-delay-passive = <0>; > polling-delay = <0>; > > cf: > > git grep "polling-delay = <0>" arch/arm64/boot/dts Indeed, thanks a lot for the pointer! Somehow it slipped my attention. Will test and amend accordingly. > >>> + polling-delay = <1000>; > >> > >> The driver is interrupt driven. No need to poll. > > > > Same here as above > > > >>> + sustainable-power = <2100>; > >> > >> There is no mitigation with this thermal zone. Specifying a sustainable > >> power does not make sense. > > > > Thanks, will drop this in v3! > > > >>> + thermal-sensors = <&tsadc 0>; > >>> + > >>> + trips { > >>> + soc_crit: soc-crit { > >>> + temperature = <115000>; > >>> + hysteresis = <2000>; > >> > >> This trip point leads to a system shutdown / reboot. It is not necessary > >> to specify a hysteresis. > > > > Similar to the above, the generic thermal_of code refuses to add the > > trip point if it has no hysteresis property defined (regardless of the > > trip type), see drivers/thermal/thermal_of.c:109 > > hysteresis = <0>; Makes sense, thank you! Will amend accordingly. Best regards, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-22 0:04 ` Daniel Lezcano 2024-01-22 5:57 ` Alexey Charkov @ 2024-01-23 19:47 ` Alexey Charkov 2024-01-24 0:14 ` Daniel Lezcano 1 sibling, 1 reply; 21+ messages in thread From: Alexey Charkov @ 2024-01-23 19:47 UTC (permalink / raw) To: Daniel Lezcano Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On Mon, Jan 22, 2024 at 4:04 AM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Hi Alexey, > > > On 21/01/2024 20:57, Alexey Charkov wrote: > > On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano > > <daniel.lezcano@linaro.org> wrote: > > Hello Daniel, > > > > Thanks a lot for your review and comments! Please see some reflections below. > > > >> On 09/01/2024 20:19, Alexey Charkov wrote: > >>> Include thermal zones information in device tree for rk3588 variants > >>> and enable the built-in thermal sensing ADC on RADXA Rock 5B > >>> > >>> Signed-off-by: Alexey Charkov <alchark@gmail.com> > >>> --- > >>> Changes in v2: > >>> - Dropped redundant comments > >>> - Included all CPU cores in cooling maps > >>> - Split cooling maps into more granular ones utilizing TSADC > >>> channels 1-3 which measure temperature by separate CPU clusters > >>> instead of channel 0 which measures the center of the SoC die > >>> --- > >>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + > >>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ > >>> 2 files changed, 155 insertions(+) > >>> > >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> index a5a104131403..f9d540000de3 100644 > >>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts > >>> @@ -772,3 +772,7 @@ &usb_host1_ehci { > >>> &usb_host1_ohci { > >>> status = "okay"; > >>> }; > >>> + > >>> +&tsadc { > >>> + status = "okay"; > >>> +}; > >>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> index 8aa0499f9b03..8d54998d0ecc 100644 > >>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi > >>> @@ -10,6 +10,7 @@ > >>> #include <dt-bindings/reset/rockchip,rk3588-cru.h> > >>> #include <dt-bindings/phy/phy.h> > >>> #include <dt-bindings/ata/ahci.h> > >>> +#include <dt-bindings/thermal/thermal.h> > >>> > >>> / { > >>> compatible = "rockchip,rk3588"; > >>> @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { > >>> status = "disabled"; > >>> }; > >>> > >>> + thermal_zones: thermal-zones { > >>> + /* sensor near the center of the whole chip */ > >>> + soc_thermal: soc-thermal { > >>> + polling-delay-passive = <20>; > >> > >> There is no mitigation set for this thermal zone. It is pointless to > >> specify a passive polling. > > > > Indeed, it makes sense to me. There seems to be a catch though in that > > the driver calls the generic thermal_of_zone_register during the > > initial probe, which expects both of those polling delays to be > > present in the device tree, otherwise it simply refuses to add the > > respective thermal zone, see drivers/thermal/thermal_of.c:502 > > Usually: > > polling-delay-passive = <0>; > polling-delay = <0>; > > cf: > > git grep "polling-delay = <0>" arch/arm64/boot/dts For some reason when I have both polling-delay-passive and polling-delay set to 0, the active cooling map I have in my board DT (using a PWM controlled fan) behaves weirdly. I use the following fragment in my board DTS: +&package_thermal { + trips { + package_fan: package-fan { + temperature = <55000>; + hysteresis = <2000>; + type = "active"; + }; + }; + + cooling-maps { + map-fan { + trip = <&package_fan>; + cooling-device = <&fan THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; +}; If I add polling-delay = <1000>; at the top, the fan speeds up and down dynamically as the package temperature swings around 55C. If I remove that (having set polling-delay = <0>; in rk3588s.dtsi), the fan speeds up to the midpoint cooling state once the package temperature approaches 55C, and then it just stays there forever: it doesn't speed up above the midpoint even as the temperature climbs above 70C, nor does it spin down as it falls back to around 45C. Is that the expected behavior for when the polling is disabled? I haven't yet studied in detail if passive cooling kicks in correctly with polling disabled, but this behavior with active cooling left me quite confused - any pointers would be much appreciated. Thanks a lot, Alexey _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 2024-01-23 19:47 ` Alexey Charkov @ 2024-01-24 0:14 ` Daniel Lezcano 0 siblings, 0 replies; 21+ messages in thread From: Daniel Lezcano @ 2024-01-24 0:14 UTC (permalink / raw) To: Alexey Charkov Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner, Sebastian Reichel, Cristian Ciocaltea, Christopher Obbard, Tamás Szűcs, Shreeya Patel, Kever Yang, Jagan Teki, Chris Morgan, devicetree, linux-arm-kernel, linux-rockchip, linux-kernel On 23/01/2024 20:47, Alexey Charkov wrote: > On Mon, Jan 22, 2024 at 4:04 AM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> >> >> Hi Alexey, >> >> >> On 21/01/2024 20:57, Alexey Charkov wrote: >>> On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano >>> <daniel.lezcano@linaro.org> wrote: >>> Hello Daniel, >>> >>> Thanks a lot for your review and comments! Please see some reflections below. >>> >>>> On 09/01/2024 20:19, Alexey Charkov wrote: >>>>> Include thermal zones information in device tree for rk3588 variants >>>>> and enable the built-in thermal sensing ADC on RADXA Rock 5B >>>>> >>>>> Signed-off-by: Alexey Charkov <alchark@gmail.com> >>>>> --- >>>>> Changes in v2: >>>>> - Dropped redundant comments >>>>> - Included all CPU cores in cooling maps >>>>> - Split cooling maps into more granular ones utilizing TSADC >>>>> channels 1-3 which measure temperature by separate CPU clusters >>>>> instead of channel 0 which measures the center of the SoC die >>>>> --- >>>>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 + >>>>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++ >>>>> 2 files changed, 155 insertions(+) >>>>> >>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>>>> index a5a104131403..f9d540000de3 100644 >>>>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts >>>>> @@ -772,3 +772,7 @@ &usb_host1_ehci { >>>>> &usb_host1_ohci { >>>>> status = "okay"; >>>>> }; >>>>> + >>>>> +&tsadc { >>>>> + status = "okay"; >>>>> +}; >>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>>>> index 8aa0499f9b03..8d54998d0ecc 100644 >>>>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi >>>>> @@ -10,6 +10,7 @@ >>>>> #include <dt-bindings/reset/rockchip,rk3588-cru.h> >>>>> #include <dt-bindings/phy/phy.h> >>>>> #include <dt-bindings/ata/ahci.h> >>>>> +#include <dt-bindings/thermal/thermal.h> >>>>> >>>>> / { >>>>> compatible = "rockchip,rk3588"; >>>>> @@ -2112,6 +2113,156 @@ tsadc: tsadc@fec00000 { >>>>> status = "disabled"; >>>>> }; >>>>> >>>>> + thermal_zones: thermal-zones { >>>>> + /* sensor near the center of the whole chip */ >>>>> + soc_thermal: soc-thermal { >>>>> + polling-delay-passive = <20>; >>>> >>>> There is no mitigation set for this thermal zone. It is pointless to >>>> specify a passive polling. >>> >>> Indeed, it makes sense to me. There seems to be a catch though in that >>> the driver calls the generic thermal_of_zone_register during the >>> initial probe, which expects both of those polling delays to be >>> present in the device tree, otherwise it simply refuses to add the >>> respective thermal zone, see drivers/thermal/thermal_of.c:502 >> >> Usually: >> >> polling-delay-passive = <0>; >> polling-delay = <0>; >> >> cf: >> >> git grep "polling-delay = <0>" arch/arm64/boot/dts > > For some reason when I have both polling-delay-passive and > polling-delay set to 0, the active cooling map I have in my board DT > (using a PWM controlled fan) behaves weirdly. > I use the following fragment in my board DTS: > > +&package_thermal { > + trips { > + package_fan: package-fan { > + temperature = <55000>; > + hysteresis = <2000>; > + type = "active"; > + }; > + }; > + > + cooling-maps { > + map-fan { > + trip = <&package_fan>; > + cooling-device = <&fan THERMAL_NO_LIMIT > THERMAL_NO_LIMIT>; > + }; > + }; > +}; > > If I add polling-delay = <1000>; at the top, the fan speeds up and > down dynamically as the package temperature swings around 55C. If I > remove that (having set polling-delay = <0>; in rk3588s.dtsi), the fan > speeds up to the midpoint cooling state once the package temperature > approaches 55C, and then it just stays there forever: it doesn't speed > up above the midpoint even as the temperature climbs above 70C, nor > does it spin down as it falls back to around 45C. > > Is that the expected behavior for when the polling is disabled? I don't know the rest of the DT this fragment was added to, but I'm not surprised there is misbehavior because the configuration is not correct in this case. If there is a thermal zone with an active trip and an associated cooling device like a fan, then: -> polling-delay = <a_value>; -> polling-delay-passive = <0>; If there is a thermal zone with a passive cooling device like cpufreq cooling device, then 2 cases: 1. The sensor supports interrupt when crossing the trip point -> polling-delay = <0>; -> polling-delay-passive = <a_value>; 2. The sensor does not support interrupt when crossing the trip point -> polling-delay = <a_value>; -> polling-delay-passive = <another_value>; Why? When the cooling device is a passive cooling device, then the mitigation happens with a higher temperature sampling rate in order to change the state of the cooling device hundred of times per second. On a fan, the cooling effect is too slow for that so we keep the polling for that. > I haven't yet studied in detail if passive cooling kicks in correctly > with polling disabled, but this behavior with active cooling left me > quite confused - any pointers would be much appreciated. > > Thanks a lot, > Alexey -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-01-24 0:15 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-06 22:23 [PATCH] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588 Alexey Charkov 2024-01-06 22:54 ` Dragan Simic 2024-01-08 13:41 ` Alexey Charkov 2024-01-18 18:48 ` Dragan Simic 2024-01-21 18:56 ` Alexey Charkov 2024-01-22 4:55 ` Dragan Simic 2024-01-22 6:03 ` Alexey Charkov 2024-01-22 6:22 ` Dragan Simic 2024-01-22 7:36 ` Alexey Charkov 2024-01-22 7:57 ` Dragan Simic 2024-01-22 14:20 ` Alexey Charkov 2024-01-22 17:33 ` Dragan Simic 2024-01-09 19:19 ` [PATCH v2] " Alexey Charkov 2024-01-18 19:20 ` Dragan Simic 2024-01-19 13:15 ` Heiko Stübner 2024-01-19 16:21 ` Daniel Lezcano 2024-01-21 19:57 ` Alexey Charkov 2024-01-22 0:04 ` Daniel Lezcano 2024-01-22 5:57 ` Alexey Charkov 2024-01-23 19:47 ` Alexey Charkov 2024-01-24 0:14 ` Daniel Lezcano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).