public inbox for linux-pm@vger.kernel.org
From: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
To: Priyansh Jain <priyansh.jain@oss.qualcomm.com>,
	Amit Kucheria <amitk@kernel.org>,
	Thara Gopinath <thara.gopinath@gmail.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@kernel.org>,
	Zhang Rui <rui.zhang@intel.com>,
	Lukasz Luba <lukasz.luba@arm.com>
Cc: linux-pm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-kernel@vger.kernel.org, manaf.pallikunhi@oss.qualcomm.com
Subject: Re: [PATCH 1/2] thermal: qcom: tsens: atomic temperature read with hardware-guided retries
Date: Mon, 4 May 2026 19:29:36 +0200
Message-ID: <bfecf67e-faf2-4889-b29a-2d4d5cd0d1a6@oss.qualcomm.com>
In-Reply-To: <20260430054422.2461150-2-priyansh.jain@oss.qualcomm.com>

On 4/30/26 07:44, Priyansh Jain wrote:
> The existing TSENS temperature read logic polls the valid bit and then
> reads the temperature register. When temperature reads are triggered
> at very short intervals, this can race with hardware updates and allow
> the temperature field to be read while it is still being updated.
> 
> In this case, the valid bit may already be asserted even though the
> temperature value is transitioning, resulting in an incorrect reading.
> 
> Hardware programming guidelines require the temperature value and the
> valid bit to be sampled atomically in the same read transaction. A
> reading is considered valid only if the valid bit is observed set in
> that same sample.
> 
> The guidelines further specify that software should attempt the
> temperature read up to three times to account for transient update
> windows. If none of the attempts observe a valid sample, a stable
> fallback value must be returned: if the first and second samples match,
> the second value is returned; otherwise, if the second and third
> samples match, the third value is returned.
> 
> Update the TSENS sensor read logic to implement atomic sampling along
> with the recommended retry-and-compare fallback behavior. This removes
> the race window and ensures deterministic temperature values in
> accordance with hardware requirements.
> 
> Signed-off-by: Priyansh Jain <priyansh.jain@oss.qualcomm.com>
> ---
>   drivers/thermal/qcom/tsens-v1.c |   6 +-
>   drivers/thermal/qcom/tsens-v2.c |   6 +-
>   drivers/thermal/qcom/tsens.c    | 118 +++++++++++++++++++++-----------
>   drivers/thermal/qcom/tsens.h    |  22 ++----
>   4 files changed, 91 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/thermal/qcom/tsens-v1.c b/drivers/thermal/qcom/tsens-v1.c
> index faa5d00788ca..2e0a01348c48 100644
> --- a/drivers/thermal/qcom/tsens-v1.c
> +++ b/drivers/thermal/qcom/tsens-v1.c
> @@ -77,6 +77,9 @@ static struct tsens_features tsens_v1_feat = {
>   	.max_sensors	= 11,
>   	.trip_min_temp	= -40000,
>   	.trip_max_temp	= 120000,
> +	.valid_bit = BIT(14),
> +	.last_temp_mask = 0x3FF,

This is GENMASK(9, 0)

> +	.last_temp_resolution = 9,

Please comply with the SSOT (single source of truth) principle: compute
the mask in the init function with:

	->last_temp_mask = GENMASK(9, 0);

and remove the initialization here.
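
One possible reading of the request above, as a minimal userspace sketch:
keep only the resolution in the feature table and derive the mask from it
once at init time. The names genmask32(), struct features_model and
features_model_init() are hypothetical stand-ins, not the driver's API,
and the sketch assumes the resolution is the index of the top temperature
bit (9 gives 0x3FF, 11 gives 0xFFF, matching the values in the patch).

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
static inline uint32_t genmask32(unsigned int h, unsigned int l)
{
	assert(l <= h && h < 32);
	return (UINT32_MAX >> (31 - h)) & (UINT32_MAX << l);
}

/* Hypothetical, trimmed-down model of struct tsens_features. */
struct features_model {
	unsigned int last_temp_resolution;	/* index of the top temperature bit */
	uint32_t last_temp_mask;		/* derived at init, never hand-written */
};

/* Hypothetical init-time helper: the mask has a single source, the resolution. */
static void features_model_init(struct features_model *f)
{
	assert(f);
	f->last_temp_mask = genmask32(f->last_temp_resolution, 0);
}
```

With this layout a new IP version only has to state its resolution, and
the mask cannot drift out of sync with it.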

>   };
>   
>   static struct tsens_features tsens_v1_no_rpm_feat = {
> @@ -132,8 +135,7 @@ static const struct reg_field tsens_v1_regfields[MAX_REGFIELDS] = {
>   	/* NO CRITICAL INTERRUPT SUPPORT on v1 */
>   
>   	/* Sn_STATUS */
> -	REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP,    TM_Sn_STATUS_OFF,  0,  9),
> -	REG_FIELD_FOR_EACH_SENSOR11(VALID,        TM_Sn_STATUS_OFF, 14, 14),
> +	REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP,    TM_Sn_STATUS_OFF,  0,  14),
>   	/* xxx_STATUS bits: 1 == threshold violated */
>   	REG_FIELD_FOR_EACH_SENSOR11(MIN_STATUS,   TM_Sn_STATUS_OFF, 10, 10),
>   	REG_FIELD_FOR_EACH_SENSOR11(LOWER_STATUS, TM_Sn_STATUS_OFF, 11, 11),
> diff --git a/drivers/thermal/qcom/tsens-v2.c b/drivers/thermal/qcom/tsens-v2.c
> index 8d9698ea3ec4..814147735ba5 100644
> --- a/drivers/thermal/qcom/tsens-v2.c
> +++ b/drivers/thermal/qcom/tsens-v2.c
> @@ -56,6 +56,9 @@ static struct tsens_features tsens_v2_feat = {
>   	.max_sensors	= 16,
>   	.trip_min_temp	= -40000,
>   	.trip_max_temp	= 120000,
> +	.valid_bit = BIT(21),
> +	.last_temp_mask = 0xFFF,
> +	.last_temp_resolution = 11,

Ditto

>   };
>   
>   static struct tsens_features ipq8074_feat = {
> @@ -125,8 +128,7 @@ static const struct reg_field tsens_v2_regfields[MAX_REGFIELDS] = {
>   	[WDOG_BARK_COUNT]  = REG_FIELD(TM_WDOG_LOG_OFF,             0,  7),
>   
>   	/* Sn_STATUS */
> -	REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP,       TM_Sn_STATUS_OFF,  0,  11),
> -	REG_FIELD_FOR_EACH_SENSOR16(VALID,           TM_Sn_STATUS_OFF, 21,  21),
> +	REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP,       TM_Sn_STATUS_OFF,  0,  21),
>   	/* xxx_STATUS bits: 1 == threshold violated */
>   	REG_FIELD_FOR_EACH_SENSOR16(MIN_STATUS,      TM_Sn_STATUS_OFF, 16,  16),
>   	REG_FIELD_FOR_EACH_SENSOR16(LOWER_STATUS,    TM_Sn_STATUS_OFF, 17,  17),
> diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
> index a2422ebee816..15392a17ef41 100644
> --- a/drivers/thermal/qcom/tsens.c
> +++ b/drivers/thermal/qcom/tsens.c
> @@ -315,10 +315,66 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
>   	return degc;
>   }
>   
> +static inline enum tsens_ver tsens_version(struct tsens_priv *priv)
> +{
> +	return priv->feat->ver_major;
> +}

I agree that accessor functions are good practice, but here it results
in duplicating an existing function, so the benefit is debatable.

> +/**
> + * tsens_read_temp - To read temperature from hw in deciCelsius.
> + * @s:     Pointer to sensor struct
> + * @field: Index into regmap_field array pointing to temperature data
> + * @temp: temperature in deciCelsius to be read from hardware
> + *
> + * This function handles temperature returned in ADC code or deciCelsius
> + * depending on IP version.
> + *
> + * Return: 0 on success, a negative errno will be returned in error cases
> + */
> +static int tsens_read_temp(const struct tsens_sensor *s, int field, int *temp)
> +{
> +	struct tsens_priv *priv = s->priv;
> +	int temp_val[3] = {0};
> +	unsigned int status = 0;
> +	int ret = 0, i;
> +	int max_retry = 3;

Please avoid literals. Add a macro for the maximum number of retries. As
the value 3 is not arbitrary but documented, add a short comment saying
it is a hardware requirement.

> +	ret = regmap_field_read(priv->rf[field], &status);
> +	if (ret)
> +		return ret;
> +
> +	/* VER_0 doesn't have VALID bit */
> +	if (tsens_version(priv) == VER_0) {
> +		*temp = status;
> +		return ret;
> +	}

Please use a callback for v0 and v1. Set it at probe time, so the
version does not have to be checked at every read.
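
A rough userspace sketch of the callback idea, assuming the split is
between IP versions without a VALID bit and versions with one. All names
here (struct sensor_model, read_temp_fn, sensor_model_probe()) are
hypothetical and only model the shape of the change; the retry logic is
left out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct sensor_model;

/* Read callback: returns 0 and fills *temp, or a negative errno. */
typedef int (*read_temp_fn)(const struct sensor_model *s, int *temp);

/* Hypothetical model of one sensor; status stands in for Sn_STATUS. */
struct sensor_model {
	uint32_t status;
	uint32_t valid_bit;
	uint32_t temp_mask;
	read_temp_fn read_temp;	/* chosen once at probe time */
};

/* Variant for hardware without a VALID bit: the field is the temperature. */
static int read_temp_no_valid(const struct sensor_model *s, int *temp)
{
	assert(s && temp);
	*temp = (int)s->status;
	return 0;
}

/* Variant for hardware with a VALID bit sampled in the same read. */
static int read_temp_with_valid(const struct sensor_model *s, int *temp)
{
	assert(s && temp);
	if (!(s->status & s->valid_bit))
		return -EAGAIN;	/* a real driver would retry here */
	*temp = (int)(s->status & s->temp_mask);
	return 0;
}

/* The version check happens here only, not on every read. */
static void sensor_model_probe(struct sensor_model *s, int has_valid_bit)
{
	assert(s);
	s->read_temp = has_valid_bit ? read_temp_with_valid : read_temp_no_valid;
}
```

Callers then invoke s->read_temp(s, &temp) without looking at the
version again.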

> +	for (i = 0; i < max_retry; i++) {
> +		temp_val[i] = status & priv->feat->last_temp_mask;
> +		if (status & priv->feat->valid_bit) {
> +			*temp = temp_val[i];
> +			return ret;
> +		}
> +		ret = regmap_field_read(priv->rf[field], &status);
> +		if (ret)
> +			return ret;

It looks like more than max_retry reads are happening: one before the
loop, then three inside the loop, so four in total.
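
The count can be checked with a small userspace model, assuming the loop
in the quoted patch tests the valid bit and otherwise reads again. The
fake register and MODEL_MAX_RETRY are hypothetical; only the control flow
mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_RETRY 3

/* Hypothetical fake register that counts how often it is read. */
struct fake_reg {
	uint32_t value;
	unsigned int reads;
};

static uint32_t fake_reg_read(struct fake_reg *r)
{
	assert(r);
	r->reads++;
	return r->value;
}

/*
 * Same shape as the quoted code: one read before the loop, then one
 * more read at the end of every iteration that saw no valid sample.
 */
static unsigned int count_reads_patch_flow(struct fake_reg *r, uint32_t valid_bit)
{
	uint32_t status = fake_reg_read(r);
	int i;

	for (i = 0; i < MODEL_MAX_RETRY; i++) {
		if (status & valid_bit)
			break;
		status = fake_reg_read(r);
	}
	return r->reads;
}
```

When no sample is ever valid, the model performs four reads and the
result of the fourth is never examined.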

> +	}
> +
> +	if (temp_val[0] == temp_val[1])
> +		*temp = temp_val[1];
> +	else if (temp_val[1] == temp_val[2])
> +		*temp = temp_val[2];
> +	else
> +		return -EAGAIN;

We have a, b and c.

if a == b, then return b
else if b == c, then return c
else return -EAGAIN

In other words, the fallback succeeds when two consecutive reads return
the same value. IMO that could be simplified to:

int prev = INT_MAX;

/*
  * An explanation ...
  */

for (i = 0; i < max_retry; i++) {

	int value;

	ret = regmap_field_read(priv->rf[field], &status);
	if (ret)
		return ret;

	/*
	 * FIELD_GET() needs a compile-time constant mask, so use
	 * plain masking on the runtime values.
	 */
	value = status & priv->feat->last_temp_mask;

	/* Valid sample, or same value as the previous read */
	if ((status & priv->feat->valid_bit) || value == prev) {
		*temp = value;
		return 0;
	}

	prev = value;
}

return -EAGAIN;

(Not tested)
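
To exercise the idea outside the kernel, here is a minimal userspace
sketch of the same loop against a scripted register. struct scripted_reg,
scripted_read() and model_read_temp() are hypothetical test scaffolding,
not driver code, and the bit positions in any caller are assumptions
taken from the v1 values in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

#define MODEL_MAX_READS 3	/* hardware guideline: at most three attempts */

/* Hypothetical scripted register: hands out samples[0], samples[1], ... */
struct scripted_reg {
	const uint32_t *samples;
	unsigned int count;
	unsigned int pos;
};

static int scripted_read(struct scripted_reg *r, uint32_t *status)
{
	assert(r && status);
	if (r->pos >= r->count)
		return -EIO;
	*status = r->samples[r->pos++];
	return 0;
}

/* Accept a sample if it is valid or if it repeats the previous sample. */
static int model_read_temp(struct scripted_reg *r, uint32_t valid_bit,
			   uint32_t temp_mask, int *temp)
{
	int prev = INT_MAX;	/* sentinel outside a 10- or 12-bit sample range */
	int i;

	for (i = 0; i < MODEL_MAX_READS; i++) {
		uint32_t status;
		int value, ret;

		ret = scripted_read(r, &status);
		if (ret)
			return ret;

		/* Value and valid bit come from the same sample. */
		value = (int)(status & temp_mask);
		if ((status & valid_bit) || value == prev) {
			*temp = value;
			return 0;
		}
		prev = value;
	}
	return -EAGAIN;
}
```

One behavioural difference from the quoted patch is worth keeping in
mind: this loop stops at the second read when the first two invalid
samples match, whereas the patch would still take a third read and
prefer it if it were valid.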




