public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed
* Re: rmmod thermal, unable to handle kernel NULL pointer dereference
       [not found] <CAKx2Y1mzLesGFYz30fH8xCZTkvNGPUNYkHptYLYq_hVExoh_8g@mail.gmail.com>
@ 2014-05-21  8:22 ` Aaron Lu
  2014-05-21  8:33   ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
                     ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Aaron Lu @ 2014-05-21  8:22 UTC (permalink / raw)
  To: Kui Zhang
  Cc: linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
	Rafael J. Wysocki, Zhang Rui

On 05/21/2014 01:57 PM, Kui Zhang wrote:
> Hello,
> 
> I get following error when rmmod thermal.
> 
> rmmod  thermal
> Killed

Thanks for the report. Here is a fix patch that should solve this
problem.

From: Aaron Lu <aaron.lu@intel.com>
Date: Wed, 21 May 2014 16:00:42 +0800
Subject: [PATCH] ACPI / thermal: fix workqueue destroy order

When the thermal module is to be removed, we should destroy the wq
acpi_thermal_pm_queue after the ACPI driver's remove callback is
executed as we will need to flush the workqueue there, or a NULL pointer
access will be hit.

Reported-by: Kui Zhang <kuizhang@gmail.com>
Reference: http://www.spinics.net/lists/kernel/msg1747251.html
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
 drivers/acpi/thermal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
index c1e31a41f949..25bbc55dca89 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -1278,8 +1278,8 @@ static int __init acpi_thermal_init(void)
 
 static void __exit acpi_thermal_exit(void)
 {
-	destroy_workqueue(acpi_thermal_pm_queue);
 	acpi_bus_unregister_driver(&acpi_thermal_driver);
+	destroy_workqueue(acpi_thermal_pm_queue);
 
 	return;
 }
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH] thermal: hwmon: Make the check for critical temp valid consistent
  2014-05-21  8:22 ` rmmod thermal, unable to handle kernel NULL pointer dereference Aaron Lu
@ 2014-05-21  8:33   ` Aaron Lu
  2014-05-21 15:38   ` rmmod thermal, unable to handle kernel NULL pointer dereference Kui Zhang
  2014-05-26 12:53   ` Rafael J. Wysocki
  2 siblings, 0 replies; 4+ messages in thread
From: Aaron Lu @ 2014-05-21  8:33 UTC (permalink / raw)
  To: Zhang Rui
  Cc: Kui Zhang, linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
	Rafael J. Wysocki, Linux-pm mailing list

On 05/21/2014 04:22 PM, Aaron Lu wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
>> Hello,
>>
>> I get following error when rmmod thermal.
>>
>> rmmod  thermal
>> Killed

While dealing with this problem, I found another problem that also
results in a kernel crash on thermal module removal:


From: Aaron Lu <aaron.lu@intel.com>
Date: Wed, 21 May 2014 16:05:38 +0800
Subject: [PATCH] thermal: hwmon: Make the check for critical temp valid consistent

We used the tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, temp)
to decide if we need to create the temp_crit attribute file but we just
check if tz->ops->get_crit_temp exists to decide if we need to remove
that attribute file. Some ACPI thermal zone doesn't have a valid critical
trip point and that would result in removing a non-existent device file
on thermal module unload.

Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
 drivers/thermal/thermal_hwmon.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/thermal_hwmon.c b/drivers/thermal/thermal_hwmon.c
index fdb07199d9c2..1967bee4f076 100644
--- a/drivers/thermal/thermal_hwmon.c
+++ b/drivers/thermal/thermal_hwmon.c
@@ -140,6 +140,12 @@ thermal_hwmon_lookup_temp(const struct thermal_hwmon_device *hwmon,
 	return NULL;
 }
 
+static bool thermal_zone_crit_temp_valid(struct thermal_zone_device *tz)
+{
+	unsigned long temp;
+	return tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, &temp);
+}
+
 int thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
 {
 	struct thermal_hwmon_device *hwmon;
@@ -189,21 +195,18 @@ int thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
 	if (result)
 		goto free_temp_mem;
 
-	if (tz->ops->get_crit_temp) {
-		unsigned long temperature;
-		if (!tz->ops->get_crit_temp(tz, &temperature)) {
-			snprintf(temp->temp_crit.name,
-				 sizeof(temp->temp_crit.name),
+	if (thermal_zone_crit_temp_valid(tz)) {
+		snprintf(temp->temp_crit.name,
+				sizeof(temp->temp_crit.name),
 				"temp%d_crit", hwmon->count);
-			temp->temp_crit.attr.attr.name = temp->temp_crit.name;
-			temp->temp_crit.attr.attr.mode = 0444;
-			temp->temp_crit.attr.show = temp_crit_show;
-			sysfs_attr_init(&temp->temp_crit.attr.attr);
-			result = device_create_file(hwmon->device,
-						    &temp->temp_crit.attr);
-			if (result)
-				goto unregister_input;
-		}
+		temp->temp_crit.attr.attr.name = temp->temp_crit.name;
+		temp->temp_crit.attr.attr.mode = 0444;
+		temp->temp_crit.attr.show = temp_crit_show;
+		sysfs_attr_init(&temp->temp_crit.attr.attr);
+		result = device_create_file(hwmon->device,
+					    &temp->temp_crit.attr);
+		if (result)
+			goto unregister_input;
 	}
 
 	mutex_lock(&thermal_hwmon_list_lock);
@@ -250,7 +253,7 @@ void thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)
 	}
 
 	device_remove_file(hwmon->device, &temp->temp_input.attr);
-	if (tz->ops->get_crit_temp)
+	if (thermal_zone_crit_temp_valid(tz))
 		device_remove_file(hwmon->device, &temp->temp_crit.attr);
 
 	mutex_lock(&thermal_hwmon_list_lock);
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: rmmod thermal, unable to handle kernel NULL pointer dereference
  2014-05-21  8:22 ` rmmod thermal, unable to handle kernel NULL pointer dereference Aaron Lu
  2014-05-21  8:33   ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
@ 2014-05-21 15:38   ` Kui Zhang
  2014-05-26 12:53   ` Rafael J. Wysocki
  2 siblings, 0 replies; 4+ messages in thread
From: Kui Zhang @ 2014-05-21 15:38 UTC (permalink / raw)
  To: Aaron Lu
  Cc: linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
	Rafael J. Wysocki, Zhang Rui

Thanks, that fixed it.

On Wed, May 21, 2014 at 1:22 AM, Aaron Lu <aaron.lu@intel.com> wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
>> Hello,
>>
>> I get following error when rmmod thermal.
>>
>> rmmod  thermal
>> Killed
>
> Thanks for the report. Here is a fix patch that should solve this
> problem.
>
> From: Aaron Lu <aaron.lu@intel.com>
> Date: Wed, 21 May 2014 16:00:42 +0800
> Subject: [PATCH] ACPI / thermal: fix workqueue destroy order
>
> When the thermal module is to be removed, we should destroy the wq
> acpi_thermal_pm_queue after the ACPI driver's remove callback is
> executed as we will need to flush the workqueue there, or a NULL pointer
> access will be hit.
>
> Reported-by: Kui Zhang <kuizhang@gmail.com>
> Reference: http://www.spinics.net/lists/kernel/msg1747251.html
> Cc: All applicable <stable@vger.kernel.org>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> ---
>  drivers/acpi/thermal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
> index c1e31a41f949..25bbc55dca89 100644
> --- a/drivers/acpi/thermal.c
> +++ b/drivers/acpi/thermal.c
> @@ -1278,8 +1278,8 @@ static int __init acpi_thermal_init(void)
>
>  static void __exit acpi_thermal_exit(void)
>  {
> -       destroy_workqueue(acpi_thermal_pm_queue);
>         acpi_bus_unregister_driver(&acpi_thermal_driver);
> +       destroy_workqueue(acpi_thermal_pm_queue);
>
>         return;
>  }
> --
> 1.9.0
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: rmmod thermal, unable to handle kernel NULL pointer dereference
  2014-05-21  8:22 ` rmmod thermal, unable to handle kernel NULL pointer dereference Aaron Lu
  2014-05-21  8:33   ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
  2014-05-21 15:38   ` rmmod thermal, unable to handle kernel NULL pointer dereference Kui Zhang
@ 2014-05-26 12:53   ` Rafael J. Wysocki
  2 siblings, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2014-05-26 12:53 UTC (permalink / raw)
  To: Aaron Lu
  Cc: Kui Zhang, linux-kernel@vger.kernel.org, ACPI Devel Mailing List,
	Zhang Rui

On Wednesday, May 21, 2014 04:22:58 PM Aaron Lu wrote:
> On 05/21/2014 01:57 PM, Kui Zhang wrote:
> > Hello,
> > 
> > I get following error when rmmod thermal.
> > 
> > rmmod  thermal
> > Killed
> 
> Thanks for the report. Here is a fix patch that should solve this
> problem.
> 
> From: Aaron Lu <aaron.lu@intel.com>
> Date: Wed, 21 May 2014 16:00:42 +0800
> Subject: [PATCH] ACPI / thermal: fix workqueue destroy order
> 
> When the thermal module is to be removed, we should destroy the wq
> acpi_thermal_pm_queue after the ACPI driver's remove callback is
> executed as we will need to flush the workqueue there, or a NULL pointer
> access will be hit.
> 
> Reported-by: Kui Zhang <kuizhang@gmail.com>
> Reference: http://www.spinics.net/lists/kernel/msg1747251.html
> Cc: All applicable <stable@vger.kernel.org>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>

I'm going to push this as a fix for 3.15, thanks!

> ---
>  drivers/acpi/thermal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
> index c1e31a41f949..25bbc55dca89 100644
> --- a/drivers/acpi/thermal.c
> +++ b/drivers/acpi/thermal.c
> @@ -1278,8 +1278,8 @@ static int __init acpi_thermal_init(void)
>  
>  static void __exit acpi_thermal_exit(void)
>  {
> -	destroy_workqueue(acpi_thermal_pm_queue);
>  	acpi_bus_unregister_driver(&acpi_thermal_driver);
> +	destroy_workqueue(acpi_thermal_pm_queue);
>  
>  	return;
>  }
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-05-26 12:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CAKx2Y1mzLesGFYz30fH8xCZTkvNGPUNYkHptYLYq_hVExoh_8g@mail.gmail.com>
2014-05-21  8:22 ` rmmod thermal, unable to handle kernel NULL pointer dereference Aaron Lu
2014-05-21  8:33   ` [PATCH] thermal: hwmon: Make the check for critical temp valid consistent Aaron Lu
2014-05-21 15:38   ` rmmod thermal, unable to handle kernel NULL pointer dereference Kui Zhang
2014-05-26 12:53   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox