* [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 18:09 [PATCH 1/4] thermal/of: Fix error code in of_thermal_zone_find() Daniel Lezcano
@ 2022-08-08 18:09 ` Daniel Lezcano
2022-08-08 18:35 ` Guenter Roeck
2022-08-08 19:03 ` Michael Walle
2022-08-08 18:09 ` [PATCH 3/4] dt-bindings: thermal: Fix missing required property Daniel Lezcano
2022-08-08 18:09 ` [PATCH 4/4] thermal/of: Fix free after use in thermal_of_unregister() Daniel Lezcano
2 siblings, 2 replies; 11+ messages in thread
From: Daniel Lezcano @ 2022-08-08 18:09 UTC (permalink / raw)
To: daniel.lezcano, rafael
Cc: michael, dan.carpenter, linux, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui
The previous version of the OF code was returning -ENODEV if no
thermal zones description was found or if the lookup of the sensor in
the thermal zones was not found.
The backend drivers are expecting this return value as an information
about skipping the sensor initialization and considered as normal.
Fix the return value by replacing -EINVAL by -ENODEV
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
drivers/thermal/thermal_of.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 368eb58e97cf..4210c18ef7b2 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -329,7 +329,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
np = of_find_node_by_name(NULL, "thermal-zones");
if (!np) {
pr_err("Unable to find thermal zones description\n");
- return ERR_PTR(-EINVAL);
+ return ERR_PTR(-ENODEV);
}
/*
@@ -368,7 +368,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
}
}
}
- tz = ERR_PTR(-EINVAL);
+ tz = ERR_PTR(-ENODEV);
out:
of_node_put(np);
return tz;
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 18:09 ` [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails Daniel Lezcano
@ 2022-08-08 18:35 ` Guenter Roeck
2022-08-08 19:05 ` Michael Walle
2022-08-08 20:31 ` Daniel Lezcano
2022-08-08 19:03 ` Michael Walle
1 sibling, 2 replies; 11+ messages in thread
From: Guenter Roeck @ 2022-08-08 18:35 UTC (permalink / raw)
To: Daniel Lezcano, rafael
Cc: michael, dan.carpenter, linux-kernel, linux-pm, Amit Kucheria,
Zhang Rui
On 8/8/22 11:09, Daniel Lezcano wrote:
> The previous version of the OF code was returning -ENODEV if no
> thermal zones description was found or if the lookup of the sensor in
> the thermal zones was not found.
>
> The backend drivers are expecting this return value as an information
> about skipping the sensor initialization and considered as normal.
>
> Fix the return value by replacing -EINVAL by -ENODEV
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
> drivers/thermal/thermal_of.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
> index 368eb58e97cf..4210c18ef7b2 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -329,7 +329,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
> np = of_find_node_by_name(NULL, "thermal-zones");
> if (!np) {
> pr_err("Unable to find thermal zones description\n");
I really don't like that error message: People will see it (and complain)
whenever a sensor registers and there is no thermal zone, even though that
is perfectly normal (see description above).
Guenter
> - return ERR_PTR(-EINVAL);
> + return ERR_PTR(-ENODEV);
> }
>
> /*
> @@ -368,7 +368,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
> }
> }
> }
> - tz = ERR_PTR(-EINVAL);
> + tz = ERR_PTR(-ENODEV);
> out:
> of_node_put(np);
> return tz;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 18:35 ` Guenter Roeck
@ 2022-08-08 19:05 ` Michael Walle
2022-08-08 21:17 ` Daniel Lezcano
2022-08-08 20:31 ` Daniel Lezcano
1 sibling, 1 reply; 11+ messages in thread
From: Michael Walle @ 2022-08-08 19:05 UTC (permalink / raw)
To: Guenter Roeck
Cc: Daniel Lezcano, rafael, dan.carpenter, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui
Am 2022-08-08 20:35, schrieb Guenter Roeck:
> On 8/8/22 11:09, Daniel Lezcano wrote:
>> The previous version of the OF code was returning -ENODEV if no
>> thermal zones description was found or if the lookup of the sensor in
>> the thermal zones was not found.
>>
>> The backend drivers are expecting this return value as an information
>> about skipping the sensor initialization and considered as normal.
>>
>> Fix the return value by replacing -EINVAL by -ENODEV
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>> drivers/thermal/thermal_of.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_of.c
>> b/drivers/thermal/thermal_of.c
>> index 368eb58e97cf..4210c18ef7b2 100644
>> --- a/drivers/thermal/thermal_of.c
>> +++ b/drivers/thermal/thermal_of.c
>> @@ -329,7 +329,7 @@ static struct device_node
>> *of_thermal_zone_find(struct device_node *sensor, int
>> np = of_find_node_by_name(NULL, "thermal-zones");
>> if (!np) {
>> pr_err("Unable to find thermal zones description\n");
>
> I really don't like that error message: People will see it (and
> complain)
> whenever a sensor registers and there is no thermal zone, even though
> that
> is perfectly normal (see description above).
I can second that, and there actually two error messages:
[ 6.156983] thermal_sys: Unable to find thermal zones description
[ 6.163125] thermal_sys: Failed to find thermal zone for hwmon id=0
On the sl28 board with the qoriq_thermal driver:
[ 1.917940] thermal_sys: Failed to find thermal zone for tmu id=2
[ 1.929231] thermal_sys: Failed to find thermal zone for tmu id=3
[ 1.940519] thermal_sys: Failed to find thermal zone for tmu id=4
[ 1.951814] thermal_sys: Failed to find thermal zone for tmu id=5
[ 1.963109] thermal_sys: Failed to find thermal zone for tmu id=6
[ 1.974399] thermal_sys: Failed to find thermal zone for tmu id=7
[ 1.985690] thermal_sys: Failed to find thermal zone for tmu id=8
[ 1.996980] thermal_sys: Failed to find thermal zone for tmu id=9
[ 2.008274] thermal_sys: Failed to find thermal zone for tmu id=10
[ 2.019656] thermal_sys: Failed to find thermal zone for tmu id=11
[ 2.031037] thermal_sys: Failed to find thermal zone for tmu id=12
[ 2.048942] thermal_sys: Failed to find thermal zone for tmu id=13
[ 2.060320] thermal_sys: Failed to find thermal zone for tmu id=14
[ 2.071700] thermal_sys: Failed to find thermal zone for tmu id=15
Btw. the driver seems to always register 16 sensors regardless how
many the actual hardware has (or rather: are described in the DT).
-michael
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 19:05 ` Michael Walle
@ 2022-08-08 21:17 ` Daniel Lezcano
0 siblings, 0 replies; 11+ messages in thread
From: Daniel Lezcano @ 2022-08-08 21:17 UTC (permalink / raw)
To: Michael Walle, Guenter Roeck
Cc: rafael, dan.carpenter, linux-kernel, linux-pm, Amit Kucheria,
Zhang Rui
On 08/08/2022 21:05, Michael Walle wrote:
> Am 2022-08-08 20:35, schrieb Guenter Roeck:
>> On 8/8/22 11:09, Daniel Lezcano wrote:
>>> The previous version of the OF code was returning -ENODEV if no
>>> thermal zones description was found or if the lookup of the sensor in
>>> the thermal zones was not found.
>>>
>>> The backend drivers are expecting this return value as an information
>>> about skipping the sensor initialization and considered as normal.
>>>
>>> Fix the return value by replacing -EINVAL by -ENODEV
>>>
>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>> ---
>>> drivers/thermal/thermal_of.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
>>> index 368eb58e97cf..4210c18ef7b2 100644
>>> --- a/drivers/thermal/thermal_of.c
>>> +++ b/drivers/thermal/thermal_of.c
>>> @@ -329,7 +329,7 @@ static struct device_node
>>> *of_thermal_zone_find(struct device_node *sensor, int
>>> np = of_find_node_by_name(NULL, "thermal-zones");
>>> if (!np) {
>>> pr_err("Unable to find thermal zones description\n");
>>
>> I really don't like that error message: People will see it (and complain)
>> whenever a sensor registers and there is no thermal zone, even though
>> that
>> is perfectly normal (see description above).
>
> I can second that, and there actually two error messages:
>
> [ 6.156983] thermal_sys: Unable to find thermal zones description
> [ 6.163125] thermal_sys: Failed to find thermal zone for hwmon id=0
Yeah, I can check:
np = of_thermal_zone_find(sensor, id);
And print the error if PTR_ERR(np) != ENODEV, otherwise silently return.
> On the sl28 board with the qoriq_thermal driver:
> [ 1.917940] thermal_sys: Failed to find thermal zone for tmu id=2
> [ 1.929231] thermal_sys: Failed to find thermal zone for tmu id=3
> [ 1.940519] thermal_sys: Failed to find thermal zone for tmu id=4
> [ 1.951814] thermal_sys: Failed to find thermal zone for tmu id=5
> [ 1.963109] thermal_sys: Failed to find thermal zone for tmu id=6
> [ 1.974399] thermal_sys: Failed to find thermal zone for tmu id=7
> [ 1.985690] thermal_sys: Failed to find thermal zone for tmu id=8
> [ 1.996980] thermal_sys: Failed to find thermal zone for tmu id=9
> [ 2.008274] thermal_sys: Failed to find thermal zone for tmu id=10
> [ 2.019656] thermal_sys: Failed to find thermal zone for tmu id=11
> [ 2.031037] thermal_sys: Failed to find thermal zone for tmu id=12
> [ 2.048942] thermal_sys: Failed to find thermal zone for tmu id=13
> [ 2.060320] thermal_sys: Failed to find thermal zone for tmu id=14
> [ 2.071700] thermal_sys: Failed to find thermal zone for tmu id=15
>
> Btw. the driver seems to always register 16 sensors regardless how
> many the actual hardware has (or rather: are described in the DT).
Yes, it may be nicer to rely on the compatible string to figure out the
sensors of the platform and call with full knowledge the registering
function. But anyway ...
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 18:35 ` Guenter Roeck
2022-08-08 19:05 ` Michael Walle
@ 2022-08-08 20:31 ` Daniel Lezcano
2022-08-08 21:33 ` Guenter Roeck
1 sibling, 1 reply; 11+ messages in thread
From: Daniel Lezcano @ 2022-08-08 20:31 UTC (permalink / raw)
To: Guenter Roeck, rafael
Cc: michael, dan.carpenter, linux-kernel, linux-pm, Amit Kucheria,
Zhang Rui
On 08/08/2022 20:35, Guenter Roeck wrote:
> On 8/8/22 11:09, Daniel Lezcano wrote:
>> The previous version of the OF code was returning -ENODEV if no
>> thermal zones description was found or if the lookup of the sensor in
>> the thermal zones was not found.
>>
>> The backend drivers are expecting this return value as an information
>> about skipping the sensor initialization and considered as normal.
>>
>> Fix the return value by replacing -EINVAL by -ENODEV
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>> drivers/thermal/thermal_of.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
>> index 368eb58e97cf..4210c18ef7b2 100644
>> --- a/drivers/thermal/thermal_of.c
>> +++ b/drivers/thermal/thermal_of.c
>> @@ -329,7 +329,7 @@ static struct device_node
>> *of_thermal_zone_find(struct device_node *sensor, int
>> np = of_find_node_by_name(NULL, "thermal-zones");
>> if (!np) {
>> pr_err("Unable to find thermal zones description\n");
>
> I really don't like that error message: People will see it (and complain)
> whenever a sensor registers and there is no thermal zone, even though that
> is perfectly normal (see description above).
>
I agree, I'll change the message to:
pr_info("No thermal zones description\n");
sounds good ?
>> - return ERR_PTR(-EINVAL);
>> + return ERR_PTR(-ENODEV);
>> }
>> /*
>> @@ -368,7 +368,7 @@ static struct device_node
>> *of_thermal_zone_find(struct device_node *sensor, int
>> }
>> }
>> }
>> - tz = ERR_PTR(-EINVAL);
>> + tz = ERR_PTR(-ENODEV);
>> out:
>> of_node_put(np);
>> return tz;
>
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 20:31 ` Daniel Lezcano
@ 2022-08-08 21:33 ` Guenter Roeck
0 siblings, 0 replies; 11+ messages in thread
From: Guenter Roeck @ 2022-08-08 21:33 UTC (permalink / raw)
To: Daniel Lezcano, rafael
Cc: michael, dan.carpenter, linux-kernel, linux-pm, Amit Kucheria,
Zhang Rui
On 8/8/22 13:31, Daniel Lezcano wrote:
> On 08/08/2022 20:35, Guenter Roeck wrote:
>> On 8/8/22 11:09, Daniel Lezcano wrote:
>>> The previous version of the OF code was returning -ENODEV if no
>>> thermal zones description was found or if the lookup of the sensor in
>>> the thermal zones was not found.
>>>
>>> The backend drivers are expecting this return value as an information
>>> about skipping the sensor initialization and considered as normal.
>>>
>>> Fix the return value by replacing -EINVAL by -ENODEV
>>>
>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>> ---
>>> drivers/thermal/thermal_of.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
>>> index 368eb58e97cf..4210c18ef7b2 100644
>>> --- a/drivers/thermal/thermal_of.c
>>> +++ b/drivers/thermal/thermal_of.c
>>> @@ -329,7 +329,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
>>> np = of_find_node_by_name(NULL, "thermal-zones");
>>> if (!np) {
>>> pr_err("Unable to find thermal zones description\n");
>>
>> I really don't like that error message: People will see it (and complain)
>> whenever a sensor registers and there is no thermal zone, even though that
>> is perfectly normal (see description above).
>>
>
> I agree, I'll change the message to:
>
> pr_info("No thermal zones description\n");
>
> sounds good ?
>
I think it should be silent or at best pr_debug.
Guenter
>>> - return ERR_PTR(-EINVAL);
>>> + return ERR_PTR(-ENODEV);
>>> }
>>> /*
>>> @@ -368,7 +368,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
>>> }
>>> }
>>> }
>>> - tz = ERR_PTR(-EINVAL);
>>> + tz = ERR_PTR(-ENODEV);
>>> out:
>>> of_node_put(np);
>>> return tz;
>>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails
2022-08-08 18:09 ` [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails Daniel Lezcano
2022-08-08 18:35 ` Guenter Roeck
@ 2022-08-08 19:03 ` Michael Walle
1 sibling, 0 replies; 11+ messages in thread
From: Michael Walle @ 2022-08-08 19:03 UTC (permalink / raw)
To: Daniel Lezcano
Cc: rafael, dan.carpenter, linux, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui
Am 2022-08-08 20:09, schrieb Daniel Lezcano:
> The previous version of the OF code was returning -ENODEV if no
> thermal zones description was found or if the lookup of the sensor in
> the thermal zones was not found.
>
> The backend drivers are expecting this return value as an information
> about skipping the sensor initialization and considered as normal.
>
> Fix the return value by replacing -EINVAL by -ENODEV
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Michael Walle <michael@walle.cc>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] dt-bindings: thermal: Fix missing required property
2022-08-08 18:09 [PATCH 1/4] thermal/of: Fix error code in of_thermal_zone_find() Daniel Lezcano
2022-08-08 18:09 ` [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails Daniel Lezcano
@ 2022-08-08 18:09 ` Daniel Lezcano
2022-08-08 18:09 ` [PATCH 4/4] thermal/of: Fix free after use in thermal_of_unregister() Daniel Lezcano
2 siblings, 0 replies; 11+ messages in thread
From: Daniel Lezcano @ 2022-08-08 18:09 UTC (permalink / raw)
To: daniel.lezcano, rafael
Cc: michael, dan.carpenter, linux, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui, Rob Herring, Krzysztof Kozlowski,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
When the thermal zone description was converted to yaml schema, the
required 'trips' property was forgotten.
The initial text bindings was describing:
"
[ ... ]
* Thermal zone nodes
The thermal zone node is the node containing all the required info
for describing a thermal zone, including its cooling device bindings. The
thermal zone node must contain, apart from its own properties, one sub-node
containing trip nodes and one sub-node containing all the zone cooling maps.
Required properties:
- polling-delay: The maximum number of milliseconds to wait between polls
Type: unsigned when checking this thermal zone.
Size: one cell
- polling-delay-passive: The maximum number of milliseconds to wait
Type: unsigned between polls when performing passive cooling.
Size: one cell
- thermal-sensors: A list of thermal sensor phandles and sensor specifier
Type: list of used while monitoring the thermal zone.
phandles + sensor
specifier
- trips: A sub-node which is a container of only trip point nodes
Type: sub-node required to describe the thermal zone.
Optional property:
- cooling-maps: A sub-node which is a container of only cooling device
Type: sub-node map nodes, used to describe the relation between trips
and cooling devices.
[ ... ]
"
Now the schema describes:
"
[ ... ]
required:
- polling-delay
- polling-delay-passive
- thermal-sensors
[ ... ]
"
Add the missing 'trips' property in the required properties.
Fixed: 1202a442a31fd ("dt-bindings: thermal: Add yaml bindings for thermal zones")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
Documentation/devicetree/bindings/thermal/thermal-zones.yaml | 1 +
1 file changed, 1 insertion(+)
diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 2d34f3ccb257..8d2c6d74b605 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -214,6 +214,7 @@ patternProperties:
- polling-delay
- polling-delay-passive
- thermal-sensors
+ - trips
additionalProperties: false
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] thermal/of: Fix free after use in thermal_of_unregister()
2022-08-08 18:09 [PATCH 1/4] thermal/of: Fix error code in of_thermal_zone_find() Daniel Lezcano
2022-08-08 18:09 ` [PATCH 2/4] thermal/of: Return -ENODEV instead of -EINVAL if registration fails Daniel Lezcano
2022-08-08 18:09 ` [PATCH 3/4] dt-bindings: thermal: Fix missing required property Daniel Lezcano
@ 2022-08-08 18:09 ` Daniel Lezcano
2022-08-08 18:44 ` Michael Walle
2 siblings, 1 reply; 11+ messages in thread
From: Daniel Lezcano @ 2022-08-08 18:09 UTC (permalink / raw)
To: daniel.lezcano, rafael
Cc: michael, dan.carpenter, linux, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui
The thermal zone is freed after being unregistered. The release method
devm_thermal_zone_device_register() calls
-> thermal_of_zone_device_unregister()
This one calls thermal_zone_device_unregister() which frees the
thermal zone. However, thermal_of_zone_device_unregister() does access
this freed pointer to free different resources allocated by the
thermal_of framework which is invalid.
It results in a kernel panic:
[ 1.915140] thermal_sys: Failed to find thermal zone for tmu id=2
[ 1.921279] qoriq_thermal 1f80000.tmu: Failed to register sensors
[ 1.927395] qoriq_thermal: probe of 1f80000.tmu failed with error -22
[ 1.934189] Unable to handle kernel paging request at virtual address 01adadadadadad88
[ 1.942146] Mem abort info:
[ 1.944948] ESR = 0x0000000096000004
[ 1.948708] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.954042] SET = 0, FnV = 0
[ 1.957107] EA = 0, S1PTW = 0
[ 1.960253] FSC = 0x04: level 0 translation fault
[ 1.965147] Data abort info:
[ 1.968030] ISV = 0, ISS = 0x00000004
[ 1.971878] CM = 0, WnR = 0
[ 1.974852] [01adadadadadad88] address between user and kernel address ranges
[ 1.982016] Internal error: Oops: 96000004 [#1] SMP
[ 1.986907] Modules linked in:
[ 1.989969] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.19.0-next-20220808-00080-g1c46f44502e0 #1697
[ 1.999135] Hardware name: Kontron KBox A-230-LS (DT)
[ 2.004199] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2.011185] pc : kfree+0x5c/0x3c0
[ 2.014516] lr : devm_thermal_of_zone_release+0x38/0x60
[ 2.019761] sp : ffff80000a22bad0
[ 2.023081] x29: ffff80000a22bad0 x28: 0000000000000000 x27: ffff800009960464
[ 2.030245] x26: ffff800009a16960 x25: 0000000000000006 x24: ffff800009f09a40
[ 2.037407] x23: ffff800009ab9008 x22: ffff800008d0eea8 x21: 01adadadadadad80
[ 2.044569] x20: 6b6b6b6b6b6b6b6b x19: ffff00200232b800 x18: 00000000fffffffb
[ 2.051731] x17: ffff800008d0eea0 x16: ffff800008d07d44 x15: ffff800008d0d154
[ 2.056647] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[ 2.058893] x14: ffff800008d0cddc x13: ffff8000088d1c2c x12: ffff8000088d5034
[ 2.072597] x11: ffff8000088d46d4 x10: 0000000000000000 x9 : ffff800008d0eea8
[ 2.079759] x8 : ffff002000b1a158 x7 : bbbbbbbbbbbbbbbb x6 : ffff80000a0f53b8
[ 2.086921] x5 : ffff80000a22b960 x4 : 0000000000000000 x3 : 0000000000000000
[ 2.094082] x2 : fffffc0000000000 x1 : ffff002000838040 x0 : 01adb1adadadad80
[ 2.101244] Call trace:
[ 2.103692] kfree+0x5c/0x3c0
[ 2.106666] devm_thermal_of_zone_release+0x38/0x60
[ 2.111561] release_nodes+0x64/0xd0
[ 2.115146] devres_release_all+0xbc/0x350
[ 2.119253] device_unbind_cleanup+0x20/0x70
[ 2.123536] really_probe+0x1a0/0x2e4
[ 2.127208] __driver_probe_device+0x80/0xec
[ 2.131490] driver_probe_device+0x44/0x130
[ 2.135685] __driver_attach+0x104/0x1b4
[ 2.139619] bus_for_each_dev+0x7c/0xe0
[ 2.143465] driver_attach+0x30/0x40
[ 2.147048] bus_add_driver+0x160/0x210
[ 2.150894] driver_register+0x84/0x140
[ 2.154741] __platform_driver_register+0x34/0x40
[ 2.159461] qoriq_tmu_init+0x28/0x34
[ 2.163133] do_one_initcall+0x50/0x250
[ 2.166979] kernel_init_freeable+0x278/0x31c
[ 2.171349] kernel_init+0x30/0x140
[ 2.174847] ret_from_fork+0x10/0x20
[ 2.178433] Code: b25657e2 d34cfc00 d37ae400 8b020015 (f94006a1)
[ 2.184546] ---[ end trace 0000000000000000 ]---
Store the allocated resource pointers before the thermal zone is free
and use them to release the resource after unregistering the thermal
zone.
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
---
drivers/thermal/thermal_of.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 4210c18ef7b2..91ffe8f90a2d 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -596,11 +596,15 @@ static int thermal_of_unbind(struct thermal_zone_device *tz,
*/
void thermal_of_zone_unregister(struct thermal_zone_device *tz)
{
+ struct thermal_trip *trips = tz->trips;
+ struct thermal_zone_params *tzp = tz->tzp;
+ struct thermal_zone_device_ops *ops = tz->ops;
+
thermal_zone_device_disable(tz);
thermal_zone_device_unregister(tz);
- kfree(tz->trips);
- kfree(tz->tzp);
- kfree(tz->ops);
+ kfree(trips);
+ kfree(tzp);
+ kfree(ops);
}
EXPORT_SYMBOL_GPL(thermal_of_zone_unregister);
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] thermal/of: Fix free after use in thermal_of_unregister()
2022-08-08 18:09 ` [PATCH 4/4] thermal/of: Fix free after use in thermal_of_unregister() Daniel Lezcano
@ 2022-08-08 18:44 ` Michael Walle
0 siblings, 0 replies; 11+ messages in thread
From: Michael Walle @ 2022-08-08 18:44 UTC (permalink / raw)
To: Daniel Lezcano
Cc: rafael, dan.carpenter, linux, linux-kernel, linux-pm,
Amit Kucheria, Zhang Rui
Am 2022-08-08 20:09, schrieb Daniel Lezcano:
> The thermal zone is freed after being unregistered. The release method
> devm_thermal_zone_device_register() calls
> -> thermal_of_zone_device_unregister()
>
> This one calls thermal_zone_device_unregister() which frees the
> thermal zone. However, thermal_of_zone_device_unregister() does access
> this freed pointer to free different resources allocated by the
> thermal_of framework which is invalid.
>
> It results in a kernel panic:
>
> [ 1.915140] thermal_sys: Failed to find thermal zone for tmu id=2
> [ 1.921279] qoriq_thermal 1f80000.tmu: Failed to register sensors
> [ 1.927395] qoriq_thermal: probe of 1f80000.tmu failed with error
> -22
> [ 1.934189] Unable to handle kernel paging request at virtual
> address 01adadadadadad88
> [ 1.942146] Mem abort info:
> [ 1.944948] ESR = 0x0000000096000004
> [ 1.948708] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 1.954042] SET = 0, FnV = 0
> [ 1.957107] EA = 0, S1PTW = 0
> [ 1.960253] FSC = 0x04: level 0 translation fault
> [ 1.965147] Data abort info:
> [ 1.968030] ISV = 0, ISS = 0x00000004
> [ 1.971878] CM = 0, WnR = 0
> [ 1.974852] [01adadadadadad88] address between user and kernel
> address ranges
> [ 1.982016] Internal error: Oops: 96000004 [#1] SMP
> [ 1.986907] Modules linked in:
> [ 1.989969] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
> 5.19.0-next-20220808-00080-g1c46f44502e0 #1697
> [ 1.999135] Hardware name: Kontron KBox A-230-LS (DT)
> [ 2.004199] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS
> BTYPE=--)
> [ 2.011185] pc : kfree+0x5c/0x3c0
> [ 2.014516] lr : devm_thermal_of_zone_release+0x38/0x60
> [ 2.019761] sp : ffff80000a22bad0
> [ 2.023081] x29: ffff80000a22bad0 x28: 0000000000000000 x27:
> ffff800009960464
> [ 2.030245] x26: ffff800009a16960 x25: 0000000000000006 x24:
> ffff800009f09a40
> [ 2.037407] x23: ffff800009ab9008 x22: ffff800008d0eea8 x21:
> 01adadadadadad80
> [ 2.044569] x20: 6b6b6b6b6b6b6b6b x19: ffff00200232b800 x18:
> 00000000fffffffb
> [ 2.051731] x17: ffff800008d0eea0 x16: ffff800008d07d44 x15:
> ffff800008d0d154
> [ 2.056647] usb 1-1: new high-speed USB device number 2 using
> xhci-hcd
> [ 2.058893] x14: ffff800008d0cddc x13: ffff8000088d1c2c x12:
> ffff8000088d5034
> [ 2.072597] x11: ffff8000088d46d4 x10: 0000000000000000 x9 :
> ffff800008d0eea8
> [ 2.079759] x8 : ffff002000b1a158 x7 : bbbbbbbbbbbbbbbb x6 :
> ffff80000a0f53b8
> [ 2.086921] x5 : ffff80000a22b960 x4 : 0000000000000000 x3 :
> 0000000000000000
> [ 2.094082] x2 : fffffc0000000000 x1 : ffff002000838040 x0 :
> 01adb1adadadad80
> [ 2.101244] Call trace:
> [ 2.103692] kfree+0x5c/0x3c0
> [ 2.106666] devm_thermal_of_zone_release+0x38/0x60
> [ 2.111561] release_nodes+0x64/0xd0
> [ 2.115146] devres_release_all+0xbc/0x350
> [ 2.119253] device_unbind_cleanup+0x20/0x70
> [ 2.123536] really_probe+0x1a0/0x2e4
> [ 2.127208] __driver_probe_device+0x80/0xec
> [ 2.131490] driver_probe_device+0x44/0x130
> [ 2.135685] __driver_attach+0x104/0x1b4
> [ 2.139619] bus_for_each_dev+0x7c/0xe0
> [ 2.143465] driver_attach+0x30/0x40
> [ 2.147048] bus_add_driver+0x160/0x210
> [ 2.150894] driver_register+0x84/0x140
> [ 2.154741] __platform_driver_register+0x34/0x40
> [ 2.159461] qoriq_tmu_init+0x28/0x34
> [ 2.163133] do_one_initcall+0x50/0x250
> [ 2.166979] kernel_init_freeable+0x278/0x31c
> [ 2.171349] kernel_init+0x30/0x140
> [ 2.174847] ret_from_fork+0x10/0x20
> [ 2.178433] Code: b25657e2 d34cfc00 d37ae400 8b020015 (f94006a1)
> [ 2.184546] ---[ end trace 0000000000000000 ]---
>
> Store the allocated resource pointers before the thermal zone is free
> and use them to release the resource after unregistering the thermal
> zone.
>
> Reported-by: Michael Walle <michael@walle.cc>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
> ---
> drivers/thermal/thermal_of.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/thermal_of.c
> b/drivers/thermal/thermal_of.c
> index 4210c18ef7b2..91ffe8f90a2d 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -596,11 +596,15 @@ static int thermal_of_unbind(struct
> thermal_zone_device *tz,
> */
> void thermal_of_zone_unregister(struct thermal_zone_device *tz)
> {
> + struct thermal_trip *trips = tz->trips;
> + struct thermal_zone_params *tzp = tz->tzp;
> + struct thermal_zone_device_ops *ops = tz->ops;
> +
git am shows a trailing whitespace here
otherwise:
Tested-by: Michael Walle <michael@walle.cc>
(Yes, tested without the previous patches so the error actually happens
;))
-michael
> thermal_zone_device_disable(tz);
> thermal_zone_device_unregister(tz);
> - kfree(tz->trips);
> - kfree(tz->tzp);
> - kfree(tz->ops);
> + kfree(trips);
> + kfree(tzp);
> + kfree(ops);
> }
> EXPORT_SYMBOL_GPL(thermal_of_zone_unregister);
^ permalink raw reply [flat|nested] 11+ messages in thread