Subject: Re: [PATCH] thermal: imx: Fix temperature measurements on i.MX6 after alarm
To: Michal Vokáč, Andrzej Pietrasiewicz, linux-pm@vger.kernel.org, Shawn Guo
Cc: Amit Kucheria, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Petr Beneš, petrben@gmail.com, stable@vger.kernel.org
References: <20211008081137.1948848-1-michal.vokac@ysoft.com>
From: Daniel Lezcano
Date: Sun, 17 Oct 2021 00:23:56 +0200
In-Reply-To: <20211008081137.1948848-1-michal.vokac@ysoft.com>
X-Mailing-List: linux-pm@vger.kernel.org

On 08/10/2021 10:11,
Michal Vokáč wrote:
> From: Petr Beneš
>
> SoC temperature readout may not work after thermal alarm fires interrupt.
> This harms userspace as well as CPU cooling device.
>
> Two issues with the logic involved. First, there is no protection against
> concurrent measurements, hence one can switch the sensor off while
> the other one tries to read temperature later. Second, the interrupt path
> usually fails. At the end the sensor is powered off and thermal IRQ is
> disabled. One has to reenable the thermal zone by the sysfs interface.
>
> Most of troubles come from commit d92ed2c9d3ff ("thermal: imx: Use
> driver's local data to decide whether to run a measurement")

Are these troubles observed and reproduced? Or is it your understanding
from reading the code?

get_temp() and tz enable/disable are protected against races in the core
code via the tz mutex.

> It uses data->irq_enabled as the "local data". Indeed, its value is
> related to the state of the sensor loosely under normal operation and,
> frankly, gets unleashed when the thermal interrupt arrives.
>
> Current patch adds the "local data" (new member sensor_on in
> imx_thermal_data) and sets its value in controlled manner.
>
> Fixes: d92ed2c9d3ff ("thermal: imx: Use driver's local data to decide whether to run a measurement")
> Cc: petrben@gmail.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Petr Beneš
> Signed-off-by: Michal Vokáč
> ---
>  drivers/thermal/imx_thermal.c | 30 ++++++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/thermal/imx_thermal.c b/drivers/thermal/imx_thermal.c
> index 2c7473d86a59..df5658e21828 100644
> --- a/drivers/thermal/imx_thermal.c
> +++ b/drivers/thermal/imx_thermal.c
> @@ -209,6 +209,8 @@ struct imx_thermal_data {
>  	struct clk *thermal_clk;
>  	const struct thermal_soc_data *socdata;
>  	const char *temp_grade;
> +	struct mutex sensor_lock;
> +	bool sensor_on;
>  };
>
>  static void imx_set_panic_temp(struct imx_thermal_data *data,
> @@ -252,11 +254,12 @@ static int imx_get_temp(struct thermal_zone_device *tz, int *temp)
>  	const struct thermal_soc_data *soc_data = data->socdata;
>  	struct regmap *map = data->tempmon;
>  	unsigned int n_meas;
> -	bool wait, run_measurement;
> +	bool wait;
>  	u32 val;
>
> -	run_measurement = !data->irq_enabled;
> -	if (!run_measurement) {
> +	mutex_lock(&data->sensor_lock);
> +
> +	if (data->sensor_on) {
>  		/* Check if a measurement is currently in progress */
>  		regmap_read(map, soc_data->temp_data, &val);
>  		wait = !(val & soc_data->temp_valid_mask);
> @@ -283,13 +286,15 @@ static int imx_get_temp(struct thermal_zone_device *tz, int *temp)
>
>  	regmap_read(map, soc_data->temp_data, &val);
>
> -	if (run_measurement) {
> +	if (!data->sensor_on) {
>  		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
>  			     soc_data->measure_temp_mask);
>  		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
>  			     soc_data->power_down_mask);
>  	}
>
> +	mutex_unlock(&data->sensor_lock);
> +
>  	if ((val & soc_data->temp_valid_mask) == 0) {
>  		dev_dbg(&tz->device, "temp measurement never finished\n");
>  		return -EAGAIN;
> @@ -339,20 +344,26 @@ static int imx_change_mode(struct thermal_zone_device *tz,
>  	const struct thermal_soc_data *soc_data = data->socdata;
>
>  	if (mode == THERMAL_DEVICE_ENABLED) {
> +		mutex_lock(&data->sensor_lock);
>  		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
>  			     soc_data->power_down_mask);
>  		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
>  			     soc_data->measure_temp_mask);
> +		data->sensor_on = true;
> +		mutex_unlock(&data->sensor_lock);
>
>  		if (!data->irq_enabled) {
>  			data->irq_enabled = true;
>  			enable_irq(data->irq);
>  		}
>  	} else {
> +		mutex_lock(&data->sensor_lock);
>  		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
>  			     soc_data->measure_temp_mask);
>  		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
>  			     soc_data->power_down_mask);
> +		data->sensor_on = false;
> +		mutex_unlock(&data->sensor_lock);
>
>  		if (data->irq_enabled) {
>  			disable_irq(data->irq);
> @@ -728,6 +739,8 @@ static int imx_thermal_probe(struct platform_device *pdev)
>  	}
>
>  	/* Make sure sensor is in known good state for measurements */
> +	mutex_init(&data->sensor_lock);
> +	mutex_lock(&data->sensor_lock);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_CLR,
>  		     data->socdata->power_down_mask);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_CLR,
> @@ -739,6 +752,8 @@ static int imx_thermal_probe(struct platform_device *pdev)
>  		     IMX6_MISC0_REFTOP_SELBIASOFF);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_SET,
>  		     data->socdata->power_down_mask);
> +	data->sensor_on = false;
> +	mutex_unlock(&data->sensor_lock);
>
>  	ret = imx_thermal_register_legacy_cooling(data);
>  	if (ret)
> @@ -796,10 +811,13 @@ static int imx_thermal_probe(struct platform_device *pdev)
>  	if (data->socdata->version == TEMPMON_IMX6SX)
>  		imx_set_panic_temp(data, data->temp_critical);
>
> +	mutex_lock(&data->sensor_lock);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_CLR,
>  		     data->socdata->power_down_mask);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_SET,
>  		     data->socdata->measure_temp_mask);
> +	data->sensor_on = true;
> +	mutex_unlock(&data->sensor_lock);
>
>  	data->irq_enabled = true;
>  	ret = thermal_zone_device_enable(data->tz);
> @@ -832,8 +850,12 @@ static int imx_thermal_remove(struct platform_device *pdev)
>  	struct regmap *map = data->tempmon;
>
>  	/* Disable measurements */
> +	mutex_lock(&data->sensor_lock);
>  	regmap_write(map, data->socdata->sensor_ctrl + REG_SET,
>  		     data->socdata->power_down_mask);
> +	data->sensor_on = false;
> +	mutex_unlock(&data->sensor_lock);
> +
>  	if (!IS_ERR(data->thermal_clk))
>  		clk_disable_unprepare(data->thermal_clk);
>
>

-- 
Linaro.org │ Open source software for ARM SoCs
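
For readers following the thread, the state mismatch described in the quoted commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not driver code: struct sensor_model and the functions model_enable(), model_alarm(), old_get_temp(), new_get_temp() and model_selfcheck() are hypothetical names. The alarm handler clearing irq_enabled and the power-up step of the one-shot path are assumed rather than shown in the diff. Locking and register access are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the driver state (not the real struct). */
struct sensor_model {
	bool powered;     /* sensor hardware is powered up */
	bool irq_enabled; /* alarm IRQ state as tracked by the driver */
	bool sensor_on;   /* flag from the patch: sensor runs continuously */
};

/* Zone enabled: sensor powered and measuring, alarm IRQ armed. */
void model_enable(struct sensor_model *s)
{
	s->powered = true;
	s->sensor_on = true;
	s->irq_enabled = true;
}

/* Assumption: the alarm handler masks the IRQ and clears irq_enabled. */
void model_alarm(struct sensor_model *s)
{
	s->irq_enabled = false;
}

/* Pre-patch decision: irq_enabled selects the one-shot path. */
void old_get_temp(struct sensor_model *s)
{
	bool run_measurement = !s->irq_enabled;

	if (run_measurement)
		s->powered = true;  /* assumed power-up for a one-shot read */
	/* ... the temperature register would be read here ... */
	if (run_measurement)
		s->powered = false; /* power down afterwards, as in the diff */
}

/* Post-patch decision: only sensor_on selects the one-shot path. */
void new_get_temp(struct sensor_model *s)
{
	if (!s->sensor_on)
		s->powered = true;
	/* ... the temperature register would be read here ... */
	if (!s->sensor_on)
		s->powered = false;
}

/* Walk both variants through enable -> alarm -> get_temp. */
void model_selfcheck(void)
{
	struct sensor_model a, b;

	model_enable(&a);
	model_alarm(&a);
	old_get_temp(&a);
	assert(!a.powered); /* sensor left off although the zone is enabled */

	model_enable(&b);
	model_alarm(&b);
	new_get_temp(&b);
	assert(b.powered);  /* sensor keeps running */
}
```

Calling model_selfcheck() from a test main() exercises both paths. Whether the real driver reaches the first state in practice is exactly the question raised above; the model only shows that the two flags can disagree once the alarm path has run.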