* [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
@ 2023-07-18 15:04 Mark Brown
2023-07-18 16:50 ` Vasily Khoruzhick
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Mark Brown @ 2023-07-18 15:04 UTC (permalink / raw)
To: Vasily Khoruzhick, Yangtao Li, Rafael J. Wysocki, Daniel Lezcano,
Amit Kucheria, Zhang Rui, Chen-Yu Tsai, Jernej Skrabec,
Samuel Holland
Cc: Hugh Dickins, linux-pm, linux-arm-kernel, linux-sunxi,
linux-kernel, Mark Brown
Currently the sun8i thermal driver will fail to probe if any of the
thermal zones it is registering fails to register with the thermal core.
Since we currently do not define any trip points for the GPU thermal
zones on at least A64 or H5 this means that we have no thermal support
on these platforms:
[ 1.698703] thermal_sys: Failed to find 'trips' node
[ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
even though the main CPU thermal zone on both SoCs is fully configured.
This does not seem ideal: while we may not be able to use all the zones,
it seems better to have those zones which are usable be operational.
Instead just carry on registering zones if we get any non-deferral
error, allowing use of those zones which are usable.
This means that we also need to update the interrupt handler to not
attempt to notify the core for events on zones which we have not
registered. I didn't see an ability to mask individual interrupts, and
I would expect that interrupts would still be indicated in the ISR even
if they were masked.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
I noticed this while trying to debug an issue with memory corruption on
boot which since the merge window has prevented Pine64 Plus (an A64)
from booting at all:
https://storage.kernelci.org/mainline/master/v6.5-rc2/arm64/defconfig/gcc-10/lab-baylibre/baseline-sun50i-a64-pine64-plus.txt
(which I bisected to a random memory management change that clearly
wasn't at fault) and has been causing less consistent but still very
severe boot issues on Libretech Tritium (an H3). The corruption appears
to happen when unbinding the one thermal zone that does register; I've
not figured out exactly where.
The memory corruption issue obviously needs to be dealt with properly
(I'm still digging into it), but this does allow both platforms to boot
reliably and seems like a sensible thing to do independently. Ideally we
could get this in as a fix.
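
For illustration only, here is a minimal user-space sketch of the policy
described in the commit message. It is not the driver code. The
ERR_PTR()/IS_ERR()/PTR_ERR() helpers are re-implemented stand-ins for the
ones in the kernel's include/linux/err.h, and struct zone, struct sensor,
register_zones(), notify_zones() and the two fake_register_*() callbacks
are hypothetical names that exist only in this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from user-space errno.h */
#endif

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

#define SENSOR_NUM 3

struct zone { int id; };		/* hypothetical stand-in for a thermal zone */
struct sensor { struct zone *tzd; };	/* hypothetical per-sensor state */

typedef struct zone *(*register_fn)(int id);

static struct zone zones[SENSOR_NUM];

/* Hypothetical callback: zone 1 fails as if its DT node had no trips. */
struct zone *fake_register_missing_trips(int id)
{
	if (id == 1)
		return ERR_PTR(-EINVAL);
	zones[id].id = id;
	return &zones[id];
}

/* Hypothetical callback: zone 1 asks for the probe to be retried later. */
struct zone *fake_register_deferred(int id)
{
	if (id == 1)
		return ERR_PTR(-EPROBE_DEFER);
	zones[id].id = id;
	return &zones[id];
}

/* Keep going on ordinary failures; only probe deferral aborts the loop. */
int register_zones(struct sensor *sensors, int n, register_fn reg)
{
	for (int i = 0; i < n; i++) {
		sensors[i].tzd = reg(i);
		if (IS_ERR(sensors[i].tzd)) {
			if (PTR_ERR(sensors[i].tzd) == -EPROBE_DEFER)
				return -EPROBE_DEFER;
			continue;
		}
	}
	return 0;
}

/* Count the zones that would be notified, skipping unregistered ones. */
int notify_zones(const struct sensor *sensors, int n, unsigned long irq_bitmap)
{
	int notified = 0;

	for (int i = 0; i < n; i++) {
		if (!(irq_bitmap & (1UL << i)))
			continue;
		if (IS_ERR(sensors[i].tzd))
			continue;	/* zone never registered */
		notified++;	/* the driver would update the zone here */
	}
	return notified;
}
```

With fake_register_missing_trips, register_zones() returns 0 and leaves
an error pointer in slot 1, so an interrupt bitmap with all three bits set
results in only two notifications. With fake_register_deferred, the loop
returns -EPROBE_DEFER so that the probe can be retried.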
---
drivers/thermal/sun8i_thermal.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/sun8i_thermal.c b/drivers/thermal/sun8i_thermal.c
index 195f3c5d0b38..b69134538867 100644
--- a/drivers/thermal/sun8i_thermal.c
+++ b/drivers/thermal/sun8i_thermal.c
@@ -190,6 +190,9 @@ static irqreturn_t sun8i_irq_thread(int irq, void *data)
 	int i;
 
 	for_each_set_bit(i, &irq_bitmap, tmdev->chip->sensor_num) {
+		/* We allow some zones to not register. */
+		if (IS_ERR(tmdev->sensor[i].tzd))
+			continue;
 		thermal_zone_device_update(tmdev->sensor[i].tzd,
 					   THERMAL_EVENT_UNSPECIFIED);
 	}
@@ -465,8 +468,17 @@ static int sun8i_ths_register(struct ths_device *tmdev)
 						      i,
 						      &tmdev->sensor[i],
 						      &ths_ops);
-		if (IS_ERR(tmdev->sensor[i].tzd))
-			return PTR_ERR(tmdev->sensor[i].tzd);
+
+		/*
+		 * If an individual zone fails to register for reasons
+		 * other than probe deferral (eg, a bad DT) then carry
+		 * on, other zones might register successfully.
+		 */
+		if (IS_ERR(tmdev->sensor[i].tzd)) {
+			if (PTR_ERR(tmdev->sensor[i].tzd) == -EPROBE_DEFER)
+				return PTR_ERR(tmdev->sensor[i].tzd);
+			continue;
+		}
 
 		devm_thermal_add_hwmon_sysfs(tmdev->dev, tmdev->sensor[i].tzd);
 	}
---
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
change-id: 20230718-thermal-sun8i-registration-df3a136ccafa
Best regards,
--
Mark Brown <broonie@kernel.org>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
2023-07-18 15:04 [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure Mark Brown
@ 2023-07-18 16:50 ` Vasily Khoruzhick
2023-07-19 20:24 ` Jernej Škrabec
2023-07-22 12:11 ` Icenowy Zheng
2 siblings, 0 replies; 6+ messages in thread
From: Vasily Khoruzhick @ 2023-07-18 16:50 UTC (permalink / raw)
To: Mark Brown
Cc: Yangtao Li, Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria,
Zhang Rui, Chen-Yu Tsai, Jernej Skrabec, Samuel Holland,
Hugh Dickins, linux-pm, linux-arm-kernel, linux-sunxi,
linux-kernel
On Tue, Jul 18, 2023 at 8:05 AM Mark Brown <broonie@kernel.org> wrote:
>
> Currently the sun8i thermal driver will fail to probe if any of the
> thermal zones it is registering fails to register with the thermal core.
> Since we currently do not define any trip points for the GPU thermal
> zones on at least A64 or H5 this means that we have no thermal support
> on these platforms:
>
> [ 1.698703] thermal_sys: Failed to find 'trips' node
> [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
>
> even though the main CPU thermal zone on both SoCs is fully configured.
> This does not seem ideal, while we may not be able to use all the zones
> it seems better to have those zones which are usable be operational.
> Instead just carry on registering zones if we get any non-deferral
> error, allowing use of those zones which are usable.
>
> This means that we also need to update the interrupt handler to not
> attempt to notify the core for events on zones which we have not
> registered, I didn't see an ability to mask individual interrupts and
> I would expect that interrupts would still be indicated in the ISR even
> if they were masked.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
* Re: [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
2023-07-18 15:04 [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure Mark Brown
2023-07-18 16:50 ` Vasily Khoruzhick
@ 2023-07-19 20:24 ` Jernej Škrabec
2023-07-22 12:11 ` Icenowy Zheng
2 siblings, 0 replies; 6+ messages in thread
From: Jernej Škrabec @ 2023-07-19 20:24 UTC (permalink / raw)
To: Vasily Khoruzhick, Yangtao Li, Rafael J. Wysocki, Daniel Lezcano,
Amit Kucheria, Zhang Rui, Chen-Yu Tsai, Samuel Holland,
Mark Brown
Cc: Hugh Dickins, linux-pm, linux-arm-kernel, linux-sunxi,
linux-kernel, Mark Brown
On Tuesday, 18 July 2023 at 17:04:22 CEST, Mark Brown wrote:
> Currently the sun8i thermal driver will fail to probe if any of the
> thermal zones it is registering fails to register with the thermal core.
> Since we currently do not define any trip points for the GPU thermal
> zones on at least A64 or H5 this means that we have no thermal support
> on these platforms:
>
> [ 1.698703] thermal_sys: Failed to find 'trips' node
> [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
>
> even though the main CPU thermal zone on both SoCs is fully configured.
> This does not seem ideal, while we may not be able to use all the zones
> it seems better to have those zones which are usable be operational.
> Instead just carry on registering zones if we get any non-deferral
> error, allowing use of those zones which are usable.
>
> This means that we also need to update the interrupt handler to not
> attempt to notify the core for events on zones which we have not
> registered, I didn't see an ability to mask individual interrupts and
> I would expect that interrupts would still be indicated in the ISR even
> if they were masked.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Best regards,
Jernej
* Re: [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
2023-07-18 15:04 [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure Mark Brown
2023-07-18 16:50 ` Vasily Khoruzhick
2023-07-19 20:24 ` Jernej Škrabec
@ 2023-07-22 12:11 ` Icenowy Zheng
2023-07-22 16:46 ` Mark Brown
2 siblings, 1 reply; 6+ messages in thread
From: Icenowy Zheng @ 2023-07-22 12:11 UTC (permalink / raw)
To: Mark Brown, Vasily Khoruzhick, Yangtao Li, Rafael J. Wysocki,
Daniel Lezcano, Amit Kucheria, Zhang Rui, Chen-Yu Tsai,
Jernej Skrabec, Samuel Holland
Cc: Hugh Dickins, linux-pm, linux-arm-kernel, linux-sunxi,
linux-kernel
On Tue, 2023-07-18 at 16:04 +0100, Mark Brown wrote:
> Currently the sun8i thermal driver will fail to probe if any of the
> thermal zones it is registering fails to register with the thermal
> core.
> Since we currently do not define any trip points for the GPU thermal
> zones on at least A64 or H5 this means that we have no thermal
> support
> on these platforms:
>
> [ 1.698703] thermal_sys: Failed to find 'trips' node
> [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
I think this is an issue in the core thermal subsystem, and I have sent
a patch for it. Unfortunately the patch seems to have been rejected by
linux-arm-kernel (and some other mailing lists)...
I will resend it and put Mark on the CC list.
* Re: [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
2023-07-22 12:11 ` Icenowy Zheng
@ 2023-07-22 16:46 ` Mark Brown
2023-07-23 9:40 ` Daniel Lezcano
0 siblings, 1 reply; 6+ messages in thread
From: Mark Brown @ 2023-07-22 16:46 UTC (permalink / raw)
To: Icenowy Zheng
Cc: Vasily Khoruzhick, Yangtao Li, Rafael J. Wysocki, Daniel Lezcano,
Amit Kucheria, Zhang Rui, Chen-Yu Tsai, Jernej Skrabec,
Samuel Holland, Hugh Dickins, linux-pm, linux-arm-kernel,
linux-sunxi, linux-kernel
On Sat, Jul 22, 2023 at 08:11:43PM +0800, Icenowy Zheng wrote:
> On Tue, 2023-07-18 at 16:04 +0100, Mark Brown wrote:
> > Since we currently do not define any trip points for the GPU thermal
> > zones on at least A64 or H5 this means that we have no thermal
> > support
> > on these platforms:
> > [ 1.698703] thermal_sys: Failed to find 'trips' node
> > [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
> I think this is an issue in the core thermal subsystem, and sent a
> patch; Unfortunately the patch seems to be rejected by linux-arm-kernel
> (and some other mailing lists)...
It did seem to be a bit of an excessively strict requirement; I was
going to poke at that myself. It does seem worthwhile making the change
in the sun8i driver anyway, since some other issue might cause
registration to fail and would have the same effect.
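
As a rough illustration of why the driver-side check is independent of
the trip point question, the sketch below shows that an IS_ERR() test
catches any errno encoded in the zone pointer, not just the one produced
by a missing 'trips' node. The helpers are user-space stand-ins for the
kernel's include/linux/err.h, and zone_is_usable() is a hypothetical
name used only here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from user-space errno.h */
#endif

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline int IS_ERR(const void *ptr)
{
	/* Any pointer in the top MAX_ERRNO addresses encodes an errno. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: returns 1 if the interrupt thread should notify
 * the core for this zone pointer. Note that NULL is not an error pointer,
 * so this relies on registration reporting failure via ERR_PTR().
 */
int zone_is_usable(const void *tzd)
{
	return !IS_ERR(tzd);
}
```

Whatever the reason for the failure (-EINVAL, -ENODEV, -ENOMEM, ...),
the stored pointer fails the check and the zone is skipped.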
> I will then resend it again and put Mark into CC list.
Thanks.
* Re: [PATCH] thermal/drivers/sun8i: Don't fail probe due to zone registration failure
2023-07-22 16:46 ` Mark Brown
@ 2023-07-23 9:40 ` Daniel Lezcano
0 siblings, 0 replies; 6+ messages in thread
From: Daniel Lezcano @ 2023-07-23 9:40 UTC (permalink / raw)
To: Mark Brown, Icenowy Zheng
Cc: Vasily Khoruzhick, Yangtao Li, Rafael J. Wysocki, Amit Kucheria,
Zhang Rui, Chen-Yu Tsai, Jernej Skrabec, Samuel Holland,
Hugh Dickins, linux-pm, linux-arm-kernel, linux-sunxi,
linux-kernel
Hi Mark,
On 22/07/2023 18:46, Mark Brown wrote:
> On Sat, Jul 22, 2023 at 08:11:43PM +0800, Icenowy Zheng wrote:
>> On Tue, 2023-07-18 at 16:04 +0100, Mark Brown wrote:
>
>>> Since we currently do not define any trip points for the GPU thermal
>>> zones on at least A64 or H5 this means that we have no thermal
>>> support
>>> on these platforms:
>
>>> [ 1.698703] thermal_sys: Failed to find 'trips' node
>>> [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1
>
>> I think this is an issue in the core thermal subsystem, and sent a
>> patch; Unfortunately the patch seems to be rejected by linux-arm-kernel
>> (and some other mailing lists)...
>
> It did seem to be a bit of an excessively strict requirement, I was
> going to poke at that myself. It does seem worthwhile doing the change
> in the sun8i driver anyway, there might be some other issue that causes
> registration to fail which would have the same issue.
Why do you want a thermal zone if there is no trip point?
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog