From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7AFEC10F26 for ; Thu, 2 Apr 2020 10:09:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9880B20787 for ; Thu, 2 Apr 2020 10:09:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729033AbgDBKJz (ORCPT ); Thu, 2 Apr 2020 06:09:55 -0400 Received: from mx2.suse.de ([195.135.220.15]:41502 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728135AbgDBKJz (ORCPT ); Thu, 2 Apr 2020 06:09:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id EEB66AC0C; Thu, 2 Apr 2020 10:09:52 +0000 (UTC) Date: Thu, 02 Apr 2020 12:09:52 +0200 Message-ID: From: Takashi Iwai To: Zhang Rui Cc: "linux-acpi@vger.kernel.org" , "Rafael J. Wysocki" , Len Brown , "linux-kernel@vger.kernel.org" Subject: Re: OOB access on ACPI processor thermal device via sysfs write In-Reply-To: <0926f44775e91145a83c9eb88a468c64261af20d.camel@intel.com> References: <744357E9AAD1214791ACBA4B0B90926377CEB399@SHSMSX108.ccr.corp.intel.com> <0926f44775e91145a83c9eb88a468c64261af20d.camel@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 Emacs/25.3 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org On Thu, 02 Apr 2020 12:03:30 +0200, Zhang Rui wrote: > > On Thu, 2020-04-02 at 11:03 +0200, Takashi Iwai wrote: > > On Thu, 02 Apr 2020 09:47:50 +0200, > > Zhang, Rui wrote: > > > > > > CC Viresh. > > > > > > Yes, I've received it. > > > > > > To me, there is not a hard rule that the cooling device max_state > > > must be static. > > > We should be able to detect the max_state change and reset the > > > stats table when necessary. > > > > > > I just finished a prototype patch to do so, and will paste it > > > later. > > > > Great, that sounds like a feasible option, indeed. > > > > > Please try the patch below and see if the problem goes away or not. > > >From 7b429674a0e1a6226734c8919b876bb57d946b1d Mon Sep 17 00:00:00 2001 > From: Zhang Rui > Date: Thu, 2 Apr 2020 11:18:44 +0800 > Subject: [RFC PATCH] thermal: update thermal stats table when max cooling > state changed > > The maximum cooling state of a cooling device may be changed at > runtime. Thus the statistics table must be updated to handle the real > maximum cooling states supported. > > This fixes an OOB issue when updating the statistics of the processor > cooling device, because it only supports 1 cooling state before cpufreq > driver loaded. > > Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") > Reported-by: Takashi Iwai > Signed-off-by: Zhang Rui > --- > drivers/thermal/thermal_sysfs.c | 38 +++++++++++++++++++++++++++------ > 1 file changed, 31 insertions(+), 7 deletions(-) > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > index aa99edb4dff7..c69173eb4b24 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -755,6 +755,9 @@ struct cooling_dev_stats { > unsigned int *trans_table; > }; > > +static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev); > +static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev); > + > static void update_time_in_state(struct cooling_dev_stats *stats) > { > ktime_t now = ktime_get(), delta; > @@ -768,8 +771,12 @@ static void update_time_in_state(struct cooling_dev_stats *stats) > void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev, > unsigned long new_state) > { > - struct cooling_dev_stats *stats = cdev->stats; > + struct cooling_dev_stats *stats; > > + if (cooling_device_stats_table_update(cdev)) > + return; > + > + stats = cdev->stats; > spin_lock(&stats->lock); > > if (stats->state == new_state) > @@ -904,24 +911,32 @@ static const struct attribute_group cooling_device_stats_attr_group = { > .name = "stats" > }; > > -static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > +static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev) > { > struct cooling_dev_stats *stats; > unsigned long states; > - int var; > + int var, ret; > > - if (cdev->ops->get_max_state(cdev, &states)) > - return; > + ret = cdev->ops->get_max_state(cdev, &states); > + if (ret) > + return ret; > > states++; /* Total number of states is highest state + 1 */ > > + stats = cdev->stats; > + if (stats) { > + if (stats->max_states == states) > + return 0; > + else > + cooling_device_stats_destroy(cdev); > + } This looks racy. We may have concurrent accesses and it'll lead to another access-after-free. > var = sizeof(*stats); > var += sizeof(*stats->time_in_state) * states; > var += sizeof(*stats->trans_table) * states * states; > - > stats = kzalloc(var, GFP_KERNEL); > if (!stats) > - return; > + return -ENOMEM; ... and this leaves NULL to cdev->stats. Then a later access to sysfs would lead to NULL derference. > stats->time_in_state = (ktime_t *)(stats + 1); > stats->trans_table = (unsigned int *)(stats->time_in_state + states); > @@ -930,6 +945,15 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > stats->max_states = states; > > spin_lock_init(&stats->lock); Also we must not re-initialize spinlock here at each resizing. Rather use spinlock for switching to the new table. thanks, Takashi