linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] thermal: tegra-bpmp: Handle offline zones
@ 2023-03-30  9:49 Mikko Perttunen
  2023-03-30 10:03 ` Daniel Lezcano
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Mikko Perttunen @ 2023-03-30  9:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

From: Mikko Perttunen <mperttunen@nvidia.com>

Thermal zones located in power domains may not be accessible when
the domain is powergated. In this situation, reading the temperature
will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
use -256C as the temperature for offline zones.

For smooth operation, for offline zones, return -EAGAIN when reading
the temperature and allow registration of zones even if they are
offline during probe.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v2:
* Adjusted commit message.
* Patch 2/2 dropped for now since it is more controversial,
  and this patch is more critical.

 drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/tegra/tegra-bpmp-thermal.c b/drivers/thermal/tegra/tegra-bpmp-thermal.c
index f5fd4018f72f..4ffc3bb3bf35 100644
--- a/drivers/thermal/tegra/tegra-bpmp-thermal.c
+++ b/drivers/thermal/tegra/tegra-bpmp-thermal.c
@@ -52,6 +52,8 @@ static int __tegra_bpmp_thermal_get_temp(struct tegra_bpmp_thermal_zone *zone,
 	err = tegra_bpmp_transfer(zone->tegra->bpmp, &msg);
 	if (err)
 		return err;
+	if (msg.rx.ret == -BPMP_EFAULT)
+		return -EAGAIN;
 	if (msg.rx.ret)
 		return -EINVAL;
 
@@ -259,7 +261,12 @@ static int tegra_bpmp_thermal_probe(struct platform_device *pdev)
 		zone->tegra = tegra;
 
 		err = __tegra_bpmp_thermal_get_temp(zone, &temp);
-		if (err < 0) {
+
+		/*
+		 * Sensors in powergated domains may temporarily fail to be read
+		 * (-EAGAIN), but will become accessible when the domain is powered on.
+		 */
+		if (err < 0 && err != -EAGAIN) {
 			devm_kfree(&pdev->dev, zone);
 			continue;
 		}
-- 
2.39.2



* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30  9:49 [PATCH v2] thermal: tegra-bpmp: Handle offline zones Mikko Perttunen
@ 2023-03-30 10:03 ` Daniel Lezcano
  2023-03-30 10:06   ` Mikko Perttunen
  2023-03-30 16:29 ` Thierry Reding
  2023-03-30 21:14 ` Daniel Lezcano
  2 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2023-03-30 10:03 UTC (permalink / raw)
  To: Mikko Perttunen, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 30/03/2023 11:49, Mikko Perttunen wrote:
> From: Mikko Perttunen <mperttunen@nvidia.com>
> 
> Thermal zones located in power domains may not be accessible when
> the domain is powergated. In this situation, reading the temperature
> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
> use -256C as the temperature for offline zones.

> For smooth operation, for offline zones, return -EAGAIN when reading
> the temperature and allow registration of zones even if they are
> offline during probe.

I think it makes more sense to check if the power domain associated with 
the device is powered up and if not return -EPROBE_DEFER.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30 10:03 ` Daniel Lezcano
@ 2023-03-30 10:06   ` Mikko Perttunen
  2023-03-30 12:36     ` Daniel Lezcano
  0 siblings, 1 reply; 8+ messages in thread
From: Mikko Perttunen @ 2023-03-30 10:06 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 3/30/23 13:03, Daniel Lezcano wrote:
> On 30/03/2023 11:49, Mikko Perttunen wrote:
>> From: Mikko Perttunen <mperttunen@nvidia.com>
>>
>> Thermal zones located in power domains may not be accessible when
>> the domain is powergated. In this situation, reading the temperature
>> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
>> use -256C as the temperature for offline zones.
> 
>> For smooth operation, for offline zones, return -EAGAIN when reading
>> the temperature and allow registration of zones even if they are
>> offline during probe.
> 
> I think it makes more sense to check if the power domain associated with 
> the device is powered up and if not return -EPROBE_DEFER.

The power domains in question are related to computer vision engines 
that only get powered on when in use, possibly never if the user doesn't 
run a computer vision workload on the system. We still want other 
thermal zones to be available.

Mikko
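
To make the trade-off above concrete, here is a minimal user-space sketch
(not driver code) contrasting the two probe strategies discussed: deferring
the whole probe when any zone is offline, versus keeping offline zones and
dropping only the ones that fail for other reasons. The function names and
the sample array are hypothetical, and EPROBE_DEFER is defined locally as an
assumption because it is kernel-internal and absent from user-space
<errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Kernel-internal code, defined here only so the sketch builds standalone. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical per-zone results of the initial temperature read at probe. */
static const int sample_read_err[4] = { 0, -EAGAIN, -EINVAL, 0 };

/*
 * Strategy suggested in review: if any zone is offline (-EAGAIN), defer the
 * whole probe. Returns the number of zones kept, or -EPROBE_DEFER.
 */
static int probe_defer_if_offline(const int *read_err, size_t n)
{
	int kept = 0;
	size_t i;

	assert(read_err != NULL);
	for (i = 0; i < n; i++) {
		if (read_err[i] == -EAGAIN)
			return -EPROBE_DEFER;
		if (read_err[i] < 0)
			continue;	/* broken zone: skip it */
		kept++;
	}
	return kept;
}

/*
 * Strategy taken by the patch: an offline zone is still kept, only zones
 * failing with any other error are skipped. Returns the number kept.
 */
static int probe_tolerate_offline(const int *read_err, size_t n)
{
	int kept = 0;
	size_t i;

	assert(read_err != NULL);
	for (i = 0; i < n; i++) {
		if (read_err[i] < 0 && read_err[i] != -EAGAIN)
			continue;	/* broken zone: skip it */
		kept++;
	}
	return kept;
}
```

With the sample input, the first strategy never registers anything for as
long as the one offline domain stays powered off, while the second keeps
three of the four zones right away.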




* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30 10:06   ` Mikko Perttunen
@ 2023-03-30 12:36     ` Daniel Lezcano
  2023-03-30 13:11       ` Mikko Perttunen
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2023-03-30 12:36 UTC (permalink / raw)
  To: Mikko Perttunen, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 30/03/2023 12:06, Mikko Perttunen wrote:
> On 3/30/23 13:03, Daniel Lezcano wrote:
>> On 30/03/2023 11:49, Mikko Perttunen wrote:
>>> From: Mikko Perttunen <mperttunen@nvidia.com>
>>>
>>> Thermal zones located in power domains may not be accessible when
>>> the domain is powergated. In this situation, reading the temperature
>>> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
>>> use -256C as the temperature for offline zones.
>>
>>> For smooth operation, for offline zones, return -EAGAIN when reading
>>> the temperature and allow registration of zones even if they are
>>> offline during probe.
>>
>> I think it makes more sense to check if the power domain associated 
>> with the device is powered up and if not return -EPROBE_DEFER.
> 
> The power domains in question are related to computer vision engines 
> that only get powered on when in use, possibly never if the user doesn't 
> run a computer vision workload on the system. We still want other 
> thermal zones to be available.

Ok, I see the point.

I'm worried about the semantics of the errors returned, the translation 
from BPMP_EFAULT to EAGAIN and the assumption that it is a disabled (maybe 
forever) thermal zone.

What does the documentation say for the error msg.rx.ret == -BPMP_EFAULT?







* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30 12:36     ` Daniel Lezcano
@ 2023-03-30 13:11       ` Mikko Perttunen
  0 siblings, 0 replies; 8+ messages in thread
From: Mikko Perttunen @ 2023-03-30 13:11 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 3/30/23 15:36, Daniel Lezcano wrote:
> On 30/03/2023 12:06, Mikko Perttunen wrote:
>> On 3/30/23 13:03, Daniel Lezcano wrote:
>>> On 30/03/2023 11:49, Mikko Perttunen wrote:
>>>> From: Mikko Perttunen <mperttunen@nvidia.com>
>>>>
>>>> Thermal zones located in power domains may not be accessible when
>>>> the domain is powergated. In this situation, reading the temperature
>>>> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
>>>> use -256C as the temperature for offline zones.
>>>
>>>> For smooth operation, for offline zones, return -EAGAIN when reading
>>>> the temperature and allow registration of zones even if they are
>>>> offline during probe.
>>>
>>> I think it makes more sense to check if the power domain associated 
>>> with the device is powered up and if not return -EPROBE_DEFER.
>>
>> The power domains in question are related to computer vision engines 
>> that only get powered on when in use, possibly never if the user 
>> doesn't run a computer vision workload on the system. We still want 
>> other thermal zones to be available.
> 
> Ok, I see the point.
> 
> I'm worried about the semantics of the errors returned, the translation 
> from BPMP_EFAULT to EAGAIN and the assumption that it is a disabled (maybe 
> forever) thermal zone.
> 
> What does the documentation say for the error msg.rx.ret == -BPMP_EFAULT?
> 

The documentation says

Value          | Description
-------------- | -----------------------------------------
0              | Temperature query succeeded.
-#BPMP_EINVAL  | Invalid request parameters.
-#BPMP_ENOENT  | No driver registered for thermal zone.
-#BPMP_EFAULT  | Problem reading temperature measurement.

In practice, what BPMP_EFAULT means here is that the hardware has no 
indicated temperature for the zone, which really only happens if the 
power domain is powered off.

Mikko
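
As an illustration of how the documented reply codes end up as Linux error
codes once the patch is applied, here is a small standalone sketch. The
helper name is hypothetical (the driver does this inline), and the numeric
BPMP_* values below are assumptions for the sake of a self-contained
example; the authoritative definitions are in the BPMP ABI header.

```c
#include <assert.h>
#include <errno.h>

/* Assumed values, for illustration only; see the BPMP ABI header. */
#define BPMP_ENOENT	2
#define BPMP_EFAULT	14
#define BPMP_EINVAL	22

/*
 * Hypothetical helper mirroring the checks in the patched
 * __tegra_bpmp_thermal_get_temp(): translate the firmware reply code of a
 * temperature query into the value returned to the thermal core.
 */
static int bpmp_thermal_ret_to_errno(int rx_ret)
{
	if (rx_ret == -BPMP_EFAULT)
		return -EAGAIN;	/* no reading available, e.g. domain powered off */
	if (rx_ret)
		return -EINVAL;	/* every other firmware error */
	return 0;		/* temperature query succeeded */
}

/* Small self-check showing the mapping for each documented reply code. */
static void bpmp_thermal_mapping_example(void)
{
	assert(bpmp_thermal_ret_to_errno(0) == 0);
	assert(bpmp_thermal_ret_to_errno(-BPMP_EINVAL) == -EINVAL);
	assert(bpmp_thermal_ret_to_errno(-BPMP_ENOENT) == -EINVAL);
	assert(bpmp_thermal_ret_to_errno(-BPMP_EFAULT) == -EAGAIN);
}
```

Only -BPMP_EFAULT is singled out; the other documented codes keep their
previous translation to -EINVAL.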




* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30  9:49 [PATCH v2] thermal: tegra-bpmp: Handle offline zones Mikko Perttunen
  2023-03-30 10:03 ` Daniel Lezcano
@ 2023-03-30 16:29 ` Thierry Reding
  2023-03-30 21:14 ` Daniel Lezcano
  2 siblings, 0 replies; 8+ messages in thread
From: Thierry Reding @ 2023-03-30 16:29 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	Jonathan Hunter, Mikko Perttunen, linux-pm, linux-tegra,
	linux-kernel


On Thu, Mar 30, 2023 at 12:49:04PM +0300, Mikko Perttunen wrote:
> From: Mikko Perttunen <mperttunen@nvidia.com>
> 
> Thermal zones located in power domains may not be accessible when
> the domain is powergated. In this situation, reading the temperature
> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
> use -256C as the temperature for offline zones.
> 
> For smooth operation, for offline zones, return -EAGAIN when reading
> the temperature and allow registration of zones even if they are
> offline during probe.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> v2:
> * Adjusted commit message.
> * Patch 2/2 dropped for now since it is more controversial,
>   and this patch is more critical.
> 
>  drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Acked-by: Thierry Reding <treding@nvidia.com>



* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30  9:49 [PATCH v2] thermal: tegra-bpmp: Handle offline zones Mikko Perttunen
  2023-03-30 10:03 ` Daniel Lezcano
  2023-03-30 16:29 ` Thierry Reding
@ 2023-03-30 21:14 ` Daniel Lezcano
  2023-03-31  7:12   ` Mikko Perttunen
  2 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2023-03-30 21:14 UTC (permalink / raw)
  To: Mikko Perttunen, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 30/03/2023 11:49, Mikko Perttunen wrote:
> From: Mikko Perttunen <mperttunen@nvidia.com>
> 
> Thermal zones located in power domains may not be accessible when
> the domain is powergated. In this situation, reading the temperature
> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
> use -256C as the temperature for offline zones.
> 
> For smooth operation, for offline zones, return -EAGAIN when reading
> the temperature and allow registration of zones even if they are
> offline during probe.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>

Applied, thanks




* Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones
  2023-03-30 21:14 ` Daniel Lezcano
@ 2023-03-31  7:12   ` Mikko Perttunen
  0 siblings, 0 replies; 8+ messages in thread
From: Mikko Perttunen @ 2023-03-31  7:12 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J. Wysocki, Amit Kucheria, Zhang Rui,
	Thierry Reding, Jonathan Hunter
  Cc: Mikko Perttunen, linux-pm, linux-tegra, linux-kernel

On 3/31/23 00:14, Daniel Lezcano wrote:
> On 30/03/2023 11:49, Mikko Perttunen wrote:
>> From: Mikko Perttunen <mperttunen@nvidia.com>
>>
>> Thermal zones located in power domains may not be accessible when
>> the domain is powergated. In this situation, reading the temperature
>> will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
>> use -256C as the temperature for offline zones.
>>
>> For smooth operation, for offline zones, return -EAGAIN when reading
>> the temperature and allow registration of zones even if they are
>> offline during probe.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> 
> Applied, thanks
> 

Thank you!
Mikko
