public inbox for linux-kernel@vger.kernel.org
* [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
@ 2024-07-04  8:52 Neil Armstrong
  2024-07-04  9:12 ` Greg KH
  2024-07-04 16:41 ` Daniel Lezcano
  0 siblings, 2 replies; 7+ messages in thread
From: Neil Armstrong @ 2024-07-04  8:52 UTC (permalink / raw)
  To: Sebastian Reichel, Krzysztof Kozlowski, Rhyland Klein,
	Anton Vorontsov, Jenny TC
  Cc: Daniel Lezcano, Rafael J. Wysocki, linux-arm-msm, linux-pm,
	linux-kernel, regressions, Neil Armstrong

If the thermal core tries to update the temperature from an
uninitialized power supply, it will spawn the following warning:
thermal thermal_zoneXX: failed to read out thermal zone (-19)

Reading from an uninitialized power supply should not be
considered a fatal error, and the thermal core expects
-EAGAIN to be returned in this particular case.

So convert -ENODEV to -EAGAIN to express that reading the
temperature from an uninitialized power supply is not
a fatal error, and to tell the thermal zone that it should
retry later.

This notably removes such messages on Qualcomm platforms using the
qcom_battmgr driver, which spawns warnings until the aDSP firmware
is up and the battery manager reports valid data.

Link: https://lore.kernel.org/all/2ed4c630-204a-4f80-a37f-f2ca838eb455@linaro.org/
Fixes: 5bc28b93a36e ("power_supply: power_supply_read_temp only if use_cnt > 0")
Fixes: 3be330bf8860 ("power_supply: Register battery as a thermal zone")
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/power/supply/power_supply_core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/power/supply/power_supply_core.c b/drivers/power/supply/power_supply_core.c
index 8f6025acd10a..b38bff4dbfc7 100644
--- a/drivers/power/supply/power_supply_core.c
+++ b/drivers/power/supply/power_supply_core.c
@@ -1287,8 +1287,13 @@ static int power_supply_read_temp(struct thermal_zone_device *tzd,
 	WARN_ON(tzd == NULL);
 	psy = thermal_zone_device_priv(tzd);
 	ret = power_supply_get_property(psy, POWER_SUPPLY_PROP_TEMP, &val);
+	/*
+	 * The thermal core expects -EAGAIN as non-fatal error,
+	 * convert -ENODEV as -EAGAIN since -ENODEV is returned
+	 * when a power supply device isn't initialized
+	 */
 	if (ret)
-		return ret;
+		return ret == -ENODEV ? -EAGAIN : ret;
 
 	/* Convert tenths of degree Celsius to milli degree Celsius. */
 	*temp = val.intval * 100;

---
base-commit: 82e4255305c554b0bb18b7ccf2db86041b4c8b6e
change-id: 20240704-topic-sm8x50-upstream-fix-battmgr-temp-tz-warn-077166861efb

Best regards,
-- 
Neil Armstrong <neil.armstrong@linaro.org>
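
For readers who want to see the effect of the mapping outside the kernel,
here is a minimal userspace sketch. It assumes, as the commit message
describes, that the polling caller stays silent on -EAGAIN and warns on any
other error. The names map_supply_error() and poll_temperature() are
hypothetical and exist only for this illustration; they are not kernel APIs.

```c
#include <errno.h>
#include <stdio.h>

/* Same mapping as the patch: "not initialized yet" becomes "retry later". */
static int map_supply_error(int ret)
{
	return ret == -ENODEV ? -EAGAIN : ret;
}

/*
 * Hypothetical model of a polling caller: -EAGAIN is skipped silently,
 * any other error is logged. Returns the number of warnings printed.
 */
static int poll_temperature(const int *results, int n)
{
	int warnings = 0;

	for (int i = 0; i < n; i++) {
		int ret = map_supply_error(results[i]);

		if (ret == 0 || ret == -EAGAIN)
			continue;	/* success, or retry on the next poll */
		fprintf(stderr, "failed to read out thermal zone (%d)\n", ret);
		warnings++;
	}
	return warnings;
}
```

Feeding it a sequence such as { -ENODEV, -ENODEV, 0, -EIO } logs a single
warning, for -EIO only; the two early -ENODEV results are treated as
"not ready yet".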



* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-04  8:52 [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp Neil Armstrong
@ 2024-07-04  9:12 ` Greg KH
  2024-07-04 16:41 ` Daniel Lezcano
  1 sibling, 0 replies; 7+ messages in thread
From: Greg KH @ 2024-07-04  9:12 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Sebastian Reichel, Krzysztof Kozlowski, Rhyland Klein,
	Anton Vorontsov, Jenny TC, Daniel Lezcano, Rafael J. Wysocki,
	linux-arm-msm, linux-pm, linux-kernel, regressions

On Thu, Jul 04, 2024 at 10:52:08AM +0200, Neil Armstrong wrote:
> If the thermal core tries to update the temperature from an
> uninitialized power supply, it will spawn the following warning:
> thermal thermal_zoneXX: failed to read out thermal zone (-19)
> 
> But reading from an uninitialized power supply should not be
> considered as a fatal error, but the thermal core expects
> the -EAGAIN error to be returned in this particular case.
> 
> So convert -ENODEV as -EAGAIN to express the fact that reading
> temperature from an uninitialized power supply shouldn't be
> a fatal error, but should indicate to the thermal zone it should
> retry later.
> 
> It notably removes such messages on Qualcomm platforms using the
> qcom_battmgr driver spawning warnings until the aDSP firmware
> gets up and the battery manager reports valid data.
> 
> Link: https://lore.kernel.org/all/2ed4c630-204a-4f80-a37f-f2ca838eb455@linaro.org/
> Fixes: 5bc28b93a36e ("power_supply: power_supply_read_temp only if use_cnt > 0")
> Fixes: 3be330bf8860 ("power_supply: Register battery as a thermal zone")
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/power/supply/power_supply_core.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot


* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-04  8:52 [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp Neil Armstrong
  2024-07-04  9:12 ` Greg KH
@ 2024-07-04 16:41 ` Daniel Lezcano
  2024-07-05  5:56   ` Krzysztof Kozlowski
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2024-07-04 16:41 UTC (permalink / raw)
  To: Neil Armstrong, Sebastian Reichel, Krzysztof Kozlowski,
	Rhyland Klein, Anton Vorontsov, Jenny TC
  Cc: Rafael J. Wysocki, linux-arm-msm, linux-pm, linux-kernel,
	regressions

On 04/07/2024 10:52, Neil Armstrong wrote:
> If the thermal core tries to update the temperature from an
> uninitialized power supply, it will spawn the following warning:
> thermal thermal_zoneXX: failed to read out thermal zone (-19)
> 
> But reading from an uninitialized power supply should not be
> considered as a fatal error, but the thermal core expects
> the -EAGAIN error to be returned in this particular case.
> 
> So convert -ENODEV as -EAGAIN to express the fact that reading
> temperature from an uninitialized power supply shouldn't be
> a fatal error, but should indicate to the thermal zone it should
> retry later.
> 
> It notably removes such messages on Qualcomm platforms using the
> qcom_battmgr driver spawning warnings until the aDSP firmware
> gets up and the battery manager reports valid data.

Is it possible to have the aDSP firmware ready first ?

> Link: https://lore.kernel.org/all/2ed4c630-204a-4f80-a37f-f2ca838eb455@linaro.org/
> Fixes: 5bc28b93a36e ("power_supply: power_supply_read_temp only if use_cnt > 0")
> Fixes: 3be330bf8860 ("power_supply: Register battery as a thermal zone")
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>   drivers/power/supply/power_supply_core.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/power/supply/power_supply_core.c b/drivers/power/supply/power_supply_core.c
> index 8f6025acd10a..b38bff4dbfc7 100644
> --- a/drivers/power/supply/power_supply_core.c
> +++ b/drivers/power/supply/power_supply_core.c
> @@ -1287,8 +1287,13 @@ static int power_supply_read_temp(struct thermal_zone_device *tzd,
>   	WARN_ON(tzd == NULL);
>   	psy = thermal_zone_device_priv(tzd);
>   	ret = power_supply_get_property(psy, POWER_SUPPLY_PROP_TEMP, &val);
> +	/*
> +	 * The thermal core expects -EAGAIN as non-fatal error,
> +	 * convert -ENODEV as -EAGAIN since -ENODEV is returned
> +	 * when a power supply device isn't initialized
> +	 */
>   	if (ret)
> -		return ret;
> +		return ret == -ENODEV ? -EAGAIN : ret;
>   
>   	/* Convert tenths of degree Celsius to milli degree Celsius. */
>   	*temp = val.intval * 100;
> 
> ---
> base-commit: 82e4255305c554b0bb18b7ccf2db86041b4c8b6e
> change-id: 20240704-topic-sm8x50-upstream-fix-battmgr-temp-tz-warn-077166861efb
> 
> Best regards,

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-04 16:41 ` Daniel Lezcano
@ 2024-07-05  5:56   ` Krzysztof Kozlowski
  2024-07-05  8:08     ` Daniel Lezcano
  0 siblings, 1 reply; 7+ messages in thread
From: Krzysztof Kozlowski @ 2024-07-05  5:56 UTC (permalink / raw)
  To: Daniel Lezcano, Neil Armstrong, Sebastian Reichel, Rhyland Klein,
	Anton Vorontsov, Jenny TC
  Cc: Rafael J. Wysocki, linux-arm-msm, linux-pm, linux-kernel,
	regressions

On 04/07/2024 18:41, Daniel Lezcano wrote:
> On 04/07/2024 10:52, Neil Armstrong wrote:
>> If the thermal core tries to update the temperature from an
>> uninitialized power supply, it will spawn the following warning:
>> thermal thermal_zoneXX: failed to read out thermal zone (-19)
>>
>> But reading from an uninitialized power supply should not be
>> considered as a fatal error, but the thermal core expects
>> the -EAGAIN error to be returned in this particular case.
>>
>> So convert -ENODEV as -EAGAIN to express the fact that reading
>> temperature from an uninitialized power supply shouldn't be
>> a fatal error, but should indicate to the thermal zone it should
>> retry later.
>>
>> It notably removes such messages on Qualcomm platforms using the
>> qcom_battmgr driver spawning warnings until the aDSP firmware
>> gets up and the battery manager reports valid data.
> 
> Is it possible to have the aDSP firmware ready first ?

I don't think so. The aDSP firmware is a file, so like any firmware it
may be loaded from the rootfs rather than the initramfs (unlike this
driver), or may even be missing.

Best regards,
Krzysztof



* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-05  5:56   ` Krzysztof Kozlowski
@ 2024-07-05  8:08     ` Daniel Lezcano
  2024-07-15  9:30       ` Neil Armstrong
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2024-07-05  8:08 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Neil Armstrong, Sebastian Reichel,
	Rhyland Klein, Anton Vorontsov, Jenny TC
  Cc: Rafael J. Wysocki, linux-arm-msm, linux-pm, linux-kernel,
	regressions

On 05/07/2024 07:56, Krzysztof Kozlowski wrote:
> On 04/07/2024 18:41, Daniel Lezcano wrote:
>> On 04/07/2024 10:52, Neil Armstrong wrote:
>>> If the thermal core tries to update the temperature from an
>>> uninitialized power supply, it will spawn the following warning:
>>> thermal thermal_zoneXX: failed to read out thermal zone (-19)
>>>
>>> But reading from an uninitialized power supply should not be
>>> considered as a fatal error, but the thermal core expects
>>> the -EAGAIN error to be returned in this particular case.
>>>
>>> So convert -ENODEV as -EAGAIN to express the fact that reading
>>> temperature from an uninitialized power supply shouldn't be
>>> a fatal error, but should indicate to the thermal zone it should
>>> retry later.
>>>
>>> It notably removes such messages on Qualcomm platforms using the
>>> qcom_battmgr driver spawning warnings until the aDSP firmware
>>> gets up and the battery manager reports valid data.
>>
>> Is it possible to have the aDSP firmware ready first ?
> 
> I don't think so. ADSP firmware is a file, so as every firmware it can
> be loaded from rootfs, not initramfs (unlike this driver), or even missing.

Ok, said differently, can't we initialize the thermal zone after the 
firmware is loaded ?



* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-05  8:08     ` Daniel Lezcano
@ 2024-07-15  9:30       ` Neil Armstrong
  2024-07-15  9:41         ` Daniel Lezcano
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Armstrong @ 2024-07-15  9:30 UTC (permalink / raw)
  To: Daniel Lezcano, Krzysztof Kozlowski, Sebastian Reichel,
	Rhyland Klein, Anton Vorontsov, Jenny TC
  Cc: Rafael J. Wysocki, linux-arm-msm, linux-pm, linux-kernel,
	regressions

On 05/07/2024 10:08, Daniel Lezcano wrote:
> On 05/07/2024 07:56, Krzysztof Kozlowski wrote:
>> On 04/07/2024 18:41, Daniel Lezcano wrote:
>>> On 04/07/2024 10:52, Neil Armstrong wrote:
>>>> If the thermal core tries to update the temperature from an
>>>> uninitialized power supply, it will spawn the following warning:
>>>> thermal thermal_zoneXX: failed to read out thermal zone (-19)
>>>>
>>>> But reading from an uninitialized power supply should not be
>>>> considered as a fatal error, but the thermal core expects
>>>> the -EAGAIN error to be returned in this particular case.
>>>>
>>>> So convert -ENODEV as -EAGAIN to express the fact that reading
>>>> temperature from an uninitialized power supply shouldn't be
>>>> a fatal error, but should indicate to the thermal zone it should
>>>> retry later.
>>>>
>>>> It notably removes such messages on Qualcomm platforms using the
>>>> qcom_battmgr driver spawning warnings until the aDSP firmware
>>>> gets up and the battery manager reports valid data.
>>>
>>> Is it possible to have the aDSP firmware ready first ?
>>
>> I don't think so. ADSP firmware is a file, so as every firmware it can
>> be loaded from rootfs, not initramfs (unlike this driver), or even missing.
> 
> Ok, said differently, can't we initialize the thermal zone after the firmware is loaded ?

That is the goal, but it can't be done as a fix; it needs a proper rework.

> 

I think changing power_supply_core.c is not the right solution.

qcom_battmgr_bat_get_property() should return -EAGAIN instead of
-ENODEV.
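
A minimal sketch of that driver-side alternative, written as a userspace
model. The real qcom_battmgr code is not shown in this thread, so the
struct, the service_up flag and the function name below are hypothetical
stand-ins, not the driver's actual definitions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical driver state; not the real qcom_battmgr structure. */
struct fake_battmgr {
	bool service_up;	/* assumed flag: firmware service is reachable */
	int temp_decidegc;	/* last temperature, tenths of a degree Celsius */
};

/* The driver itself reports "not ready yet", so the core needs no mapping. */
static int fake_battmgr_get_temp(const struct fake_battmgr *bm, int *val)
{
	if (!bm->service_up)
		return -EAGAIN;	/* firmware not up: ask the caller to retry */

	*val = bm->temp_decidegc;
	return 0;
}
```

With this approach power_supply_core.c stays untouched, and other drivers
that return -ENODEV for a genuinely absent device keep their current
behaviour.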

Neil



* Re: [PATCH] power: supply: core: return -EAGAIN on uninitialized read temp
  2024-07-15  9:30       ` Neil Armstrong
@ 2024-07-15  9:41         ` Daniel Lezcano
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Lezcano @ 2024-07-15  9:41 UTC (permalink / raw)
  To: neil.armstrong, Krzysztof Kozlowski, Sebastian Reichel,
	Rhyland Klein, Anton Vorontsov, Jenny TC
  Cc: Rafael J. Wysocki, linux-arm-msm, linux-pm, linux-kernel,
	regressions

On 15/07/2024 11:30, Neil Armstrong wrote:
> On 05/07/2024 10:08, Daniel Lezcano wrote:
>> On 05/07/2024 07:56, Krzysztof Kozlowski wrote:
>>> On 04/07/2024 18:41, Daniel Lezcano wrote:
>>>> On 04/07/2024 10:52, Neil Armstrong wrote:
>>>>> If the thermal core tries to update the temperature from an
>>>>> uninitialized power supply, it will spawn the following warning:
>>>>> thermal thermal_zoneXX: failed to read out thermal zone (-19)
>>>>>
>>>>> But reading from an uninitialized power supply should not be
>>>>> considered as a fatal error, but the thermal core expects
>>>>> the -EAGAIN error to be returned in this particular case.
>>>>>
>>>>> So convert -ENODEV as -EAGAIN to express the fact that reading
>>>>> temperature from an uninitialized power supply shouldn't be
>>>>> a fatal error, but should indicate to the thermal zone it should
>>>>> retry later.
>>>>>
>>>>> It notably removes such messages on Qualcomm platforms using the
>>>>> qcom_battmgr driver spawning warnings until the aDSP firmware
>>>>> gets up and the battery manager reports valid data.
>>>>
>>>> Is it possible to have the aDSP firmware ready first ?
>>>
>>> I don't think so. ADSP firmware is a file, so as every firmware it can
>>> be loaded from rootfs, not initramfs (unlike this driver), or even 
>>> missing.
>>
>> Ok, said differently, can't we initialize the thermal zone after the 
>> firmware is loaded ?
> 
> This is the goal, but this can't be a fix but a proper rework.

Right, it is a design issue, and we are finding this problem in several
drivers using the thermal zone. Unfortunately, it forces the thermal
core to implement cumbersome mechanisms, which obviously adds
friction to thermal core cleanups / rework. IOW, bad driver design =>
thermal core impacted.

> I think changing power_supply_core.c is not the right solution.

From my POV, it is the right solution but I agree it could take a cycle
or more to fix.

> qcom_battmgr_bat_get_property() should return -EAGAIN instead of
> -ENODEV.

Yes, we can do that first and come back later to solve this
firmware / async issue in a more generic way.



