* [PATCH] thermal: hwmon: Make the check for critical temp valid consistent
2014-05-21 8:22 ` Aaron Lu
@ 2014-05-21 8:33 ` Aaron Lu
2014-05-21 15:38 ` rmmod thermal, unable to handle kernel NULL pointer dereference Kui Zhang
2014-05-26 12:53 ` Rafael J. Wysocki
2 siblings, 0 replies; 5+ messages in thread
From: Aaron Lu @ 2014-05-21 8:33 UTC (permalink / raw)
To: Zhang Rui
Cc: Kui Zhang, linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
Rafael J. Wysocki, Linux-pm mailing list
On 05/21/2014 04:22 PM, Aaron Lu wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
>> Hello,
>>
>> I get following error when rmmod thermal.
>>
>> rmmod thermal
>> Killed
While dealing with this problem, I found another problem that also
results in a kernel crash on thermal module removal:
From: Aaron Lu <aaron.lu@intel.com>
Date: Wed, 21 May 2014 16:05:38 +0800
Subject: [PATCH] thermal: hwmon: Make the check for critical temp valid consistent
We used the tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, temp)
to decide if we need to create the temp_crit attribute file but we just
check if tz->ops->get_crit_temp exists to decide if we need to remove
that attribute file. Some ACPI thermal zone doesn't have a valid critical
trip point and that would result in removing a non-existent device file
on thermal module unload.
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
drivers/thermal/thermal_hwmon.c | 33 ++++++++++++++++++---------------
1 file changed, 18 insertions(+), 15 deletions(-)
diff --git a/drivers/thermal/thermal_hwmon.c b/drivers/thermal/thermal_hwmon.c
index fdb07199d9c2..1967bee4f076 100644
--- a/drivers/thermal/thermal_hwmon.c
+++ b/drivers/thermal/thermal_hwmon.c
@@ -140,6 +140,12 @@ thermal_hwmon_lookup_temp(const struct thermal_hwmon_device *hwmon,
return NULL;
}
+static bool thermal_zone_crit_temp_valid(struct thermal_zone_device *tz)
+{
+ unsigned long temp;
+ return tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, &temp);
+}
+
int thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
{
struct thermal_hwmon_device *hwmon;
@@ -189,21 +195,18 @@ int thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
if (result)
goto free_temp_mem;
- if (tz->ops->get_crit_temp) {
- unsigned long temperature;
- if (!tz->ops->get_crit_temp(tz, &temperature)) {
- snprintf(temp->temp_crit.name,
- sizeof(temp->temp_crit.name),
+ if (thermal_zone_crit_temp_valid(tz)) {
+ snprintf(temp->temp_crit.name,
+ sizeof(temp->temp_crit.name),
"temp%d_crit", hwmon->count);
- temp->temp_crit.attr.attr.name = temp->temp_crit.name;
- temp->temp_crit.attr.attr.mode = 0444;
- temp->temp_crit.attr.show = temp_crit_show;
- sysfs_attr_init(&temp->temp_crit.attr.attr);
- result = device_create_file(hwmon->device,
- &temp->temp_crit.attr);
- if (result)
- goto unregister_input;
- }
+ temp->temp_crit.attr.attr.name = temp->temp_crit.name;
+ temp->temp_crit.attr.attr.mode = 0444;
+ temp->temp_crit.attr.show = temp_crit_show;
+ sysfs_attr_init(&temp->temp_crit.attr.attr);
+ result = device_create_file(hwmon->device,
+ &temp->temp_crit.attr);
+ if (result)
+ goto unregister_input;
}
mutex_lock(&thermal_hwmon_list_lock);
@@ -250,7 +253,7 @@ void thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)
}
device_remove_file(hwmon->device, &temp->temp_input.attr);
- if (tz->ops->get_crit_temp)
+ if (thermal_zone_crit_temp_valid(tz))
device_remove_file(hwmon->device, &temp->temp_crit.attr);
mutex_lock(&thermal_hwmon_list_lock);
--
1.9.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: rmmod thermal, unable to handle kernel NULL pointer dereference
2014-05-21 8:22 ` Aaron Lu
2014-05-21 8:33 ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
@ 2014-05-21 15:38 ` Kui Zhang
2014-05-26 12:53 ` Rafael J. Wysocki
2 siblings, 0 replies; 5+ messages in thread
From: Kui Zhang @ 2014-05-21 15:38 UTC (permalink / raw)
To: Aaron Lu
Cc: linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
Rafael J. Wysocki, Zhang Rui
Thanks, that fixed it.
On Wed, May 21, 2014 at 1:22 AM, Aaron Lu <aaron.lu@intel.com> wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
>> Hello,
>>
>> I get following error when rmmod thermal.
>>
>> rmmod thermal
>> Killed
>
> Thanks for the report. Here is a fix patch that should solve this
> problem.
>
> From: Aaron Lu <aaron.lu@intel.com>
> Date: Wed, 21 May 2014 16:00:42 +0800
> Subject: [PATCH] ACPI / thermal: fix workqueue destroy order
>
> When the thermal module is to be removed, we should destroy the wq
> acpi_thermal_pm_queue after the ACPI driver's remove callback is
> executed as we will need to flush the workqueue there, or a NULL pointer
> access will be hit.
>
> Reported-by: Kui Zhang <kuizhang@gmail.com>
> Reference: http://www.spinics.net/lists/kernel/msg1747251.html
> Cc: All applicable <stable@vger.kernel.org>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> ---
> drivers/acpi/thermal.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
> index c1e31a41f949..25bbc55dca89 100644
> --- a/drivers/acpi/thermal.c
> +++ b/drivers/acpi/thermal.c
> @@ -1278,8 +1278,8 @@ static int __init acpi_thermal_init(void)
>
> static void __exit acpi_thermal_exit(void)
> {
> - destroy_workqueue(acpi_thermal_pm_queue);
> acpi_bus_unregister_driver(&acpi_thermal_driver);
> + destroy_workqueue(acpi_thermal_pm_queue);
>
> return;
> }
> --
> 1.9.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: rmmod thermal, unable to handle kernel NULL pointer dereference
2014-05-21 8:22 ` Aaron Lu
2014-05-21 8:33 ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
2014-05-21 15:38 ` rmmod thermal, unable to handle kernel NULL pointer dereference Kui Zhang
@ 2014-05-26 12:53 ` Rafael J. Wysocki
2 siblings, 0 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2014-05-26 12:53 UTC (permalink / raw)
To: Aaron Lu
Cc: Kui Zhang, linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
Zhang Rui
On Wednesday, May 21, 2014 04:22:58 PM Aaron Lu wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
> > Hello,
> >
> > I get following error when rmmod thermal.
> >
> > rmmod thermal
> > Killed
>
> Thanks for the report. Here is a fix patch that should solve this
> problem.
>
> From: Aaron Lu <aaron.lu@intel.com>
> Date: Wed, 21 May 2014 16:00:42 +0800
> Subject: [PATCH] ACPI / thermal: fix workqueue destroy order
>
> When the thermal module is to be removed, we should destroy the wq
> acpi_thermal_pm_queue after the ACPI driver's remove callback is
> executed as we will need to flush the workqueue there, or a NULL pointer
> access will be hit.
>
> Reported-by: Kui Zhang <kuizhang@gmail.com>
> Reference: http://www.spinics.net/lists/kernel/msg1747251.html
> Cc: All applicable <stable@vger.kernel.org>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>
I'm going to push this as a fix for 3.15, thanks!
> ---
> drivers/acpi/thermal.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
> index c1e31a41f949..25bbc55dca89 100644
> --- a/drivers/acpi/thermal.c
> +++ b/drivers/acpi/thermal.c
> @@ -1278,8 +1278,8 @@ static int __init acpi_thermal_init(void)
>
> static void __exit acpi_thermal_exit(void)
> {
> - destroy_workqueue(acpi_thermal_pm_queue);
> acpi_bus_unregister_driver(&acpi_thermal_driver);
> + destroy_workqueue(acpi_thermal_pm_queue);
>
> return;
> }
>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
^ permalink raw reply [flat|nested] 5+ messages in thread