From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-by2nam01on0103.outbound.protection.outlook.com ([104.47.34.103]:19764 "EHLO NAM01-BY2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755993AbeCSPrK (ORCPT ); Mon, 19 Mar 2018 11:47:10 -0400 From: Sasha Levin To: "linux-kernel@vger.kernel.org" , "stable@vger.kernel.org" CC: Yi Zeng , Zhang Rui , Sasha Levin Subject: [PATCH AUTOSEL for 4.15 008/124] thermal: power_allocator: fix one race condition issue for thermal_instances list Date: Mon, 19 Mar 2018 15:47:05 +0000 Message-ID: <20180319154645.11350-8-alexander.levin@microsoft.com> References: <20180319154645.11350-1-alexander.levin@microsoft.com> In-Reply-To: <20180319154645.11350-1-alexander.levin@microsoft.com> Content-Language: en-US Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org List-ID: From: Yi Zeng [ Upstream commit a5de11d67dcd268b8d0beb73dc374de5e97f0caf ] When invoking allow_maximum_power and traverse tz->thermal_instances, we should grab thermal_zone_device->lock to avoid race condition. For example, during the system reboot, if the mali GPU device implements device shutdown callback and unregister GPU devfreq cooling device, the deleted list head may be accessed to cause panic, as the following log shows: [ 33.551070] c3 25 (kworker/3:0) Unable to handle kernel paging request a= t virtual address dead000000000070 [ 33.566708] c3 25 (kworker/3:0) pgd =3D ffffffc0ed290000 [ 33.572071] c3 25 (kworker/3:0) [dead000000000070] *pgd=3D00000001ed2920= 03, *pud=3D00000001ed292003, *pmd=3D0000000000000000 [ 33.581515] c3 25 (kworker/3:0) Internal error: Oops: 96000004 [#1] PREE= MPT SMP [ 33.599761] c3 25 (kworker/3:0) CPU: 3 PID: 25 Comm: kworker/3:0 Not tai= nted 4.4.35+ #912 [ 33.614137] c3 25 (kworker/3:0) Workqueue: events_freezable thermal_zone= _device_check [ 33.620245] c3 25 (kworker/3:0) task: ffffffc0f32e4200 ti: ffffffc0f32f0= 000 task.ti: ffffffc0f32f0000 [ 33.629466] c3 25 (kworker/3:0) PC is at power_allocator_throttle+0x7c8/= 0x8a4 [ 33.636609] c3 25 (kworker/3:0) LR is at power_allocator_throttle+0x808/= 0x8a4 [ 33.643742] c3 25 (kworker/3:0) pc : [] lr : [] pstate: 20000145 [ 33.652874] c3 25 (kworker/3:0) sp : ffffffc0f32f3bb0 [ 34.468519] c3 25 (kworker/3:0) Process kworker/3:0 (pid: 25, stack limi= t =3D 0xffffffc0f32f0020) [ 34.477220] c3 25 (kworker/3:0) Stack: (0xffffffc0f32f3bb0 to 0xffffffc0= f32f4000) [ 34.819822] c3 25 (kworker/3:0) Call trace: [ 34.824021] c3 25 (kworker/3:0) Exception stack(0xffffffc0f32f39c0 to 0x= ffffffc0f32f3af0) [ 34.924993] c3 25 (kworker/3:0) [] power_allocator_thr= ottle+0x7c8/0x8a4 [ 34.933184] c3 25 (kworker/3:0) [] handle_thermal_trip= .part.25+0x70/0x224 [ 34.941545] c3 25 (kworker/3:0) [] thermal_zone_device= _update+0xc0/0x20c [ 34.949818] c3 25 (kworker/3:0) [] thermal_zone_device= _check+0x20/0x2c [ 34.957924] c3 25 (kworker/3:0) [] process_one_work+0x= 168/0x458 [ 34.965414] c3 25 (kworker/3:0) [] worker_thread+0x13c= /0x4b4 [ 34.972650] c3 25 (kworker/3:0) [] kthread+0xe8/0xfc [ 34.979187] c3 25 (kworker/3:0) [] ret_from_fork+0x10/= 0x40 [ 34.986244] c3 25 (kworker/3:0) Code: f9405e73 eb1302bf d102e273 54ffc46= 0 (b9402a61) [ 34.994339] c3 25 (kworker/3:0) ---[ end trace 32057901e3b7e1db ]--- Signed-off-by: Yi Zeng Signed-off-by: Zhang Rui Signed-off-by: Sasha Levin --- drivers/thermal/power_allocator.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allo= cator.c index b4d3116cfdaf..3055f9a12a17 100644 --- a/drivers/thermal/power_allocator.c +++ b/drivers/thermal/power_allocator.c @@ -523,6 +523,7 @@ static void allow_maximum_power(struct thermal_zone_dev= ice *tz) struct thermal_instance *instance; struct power_allocator_params *params =3D tz->governor_data; =20 + mutex_lock(&tz->lock); list_for_each_entry(instance, &tz->thermal_instances, tz_node) { if ((instance->trip !=3D params->trip_max_desired_temperature) || (!cdev_is_power_actor(instance->cdev))) @@ -534,6 +535,7 @@ static void allow_maximum_power(struct thermal_zone_dev= ice *tz) mutex_unlock(&instance->cdev->lock); thermal_cdev_update(instance->cdev); } + mutex_unlock(&tz->lock); } =20 /** --=20 2.14.1