From: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
To: Priyansh Jain <priyansh.jain@oss.qualcomm.com>,
Amit Kucheria <amitk@kernel.org>,
Thara Gopinath <thara.gopinath@gmail.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Daniel Lezcano <daniel.lezcano@kernel.org>,
Zhang Rui <rui.zhang@intel.com>,
Lukasz Luba <lukasz.luba@arm.com>
Cc: linux-pm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
linux-kernel@vger.kernel.org, manaf.pallikunhi@oss.qualcomm.com
Subject: Re: [PATCH 1/2] thermal: qcom: tsens: atomic temperature read with hardware-guided retries
Date: Mon, 4 May 2026 19:29:36 +0200
Message-ID: <bfecf67e-faf2-4889-b29a-2d4d5cd0d1a6@oss.qualcomm.com>
In-Reply-To: <20260430054422.2461150-2-priyansh.jain@oss.qualcomm.com>

On 4/30/26 07:44, Priyansh Jain wrote:
> The existing TSENS temperature read logic polls the valid bit and then
> reads the temperature register. When temperature reads are triggered
> at very short intervals, this can race with hardware updates and allow
> the temperature field to be read while it is still being updated.
>
> In this case, the valid bit may already be asserted even though the
> temperature value is transitioning, resulting in an incorrect reading.
>
> Hardware programming guidelines require the temperature value and the
> valid bit to be sampled atomically in the same read transaction. A
> reading is considered valid only if the valid bit is observed set in
> that same sample.
>
> The guidelines further specify that software should attempt the
> temperature read up to three times to account for transient update
> windows. If none of the attempts observe a valid sample, a stable
> fallback value must be returned: if the first and second samples match,
> the second value is returned; otherwise, if the second and third
> samples match, the third value is returned.
>
> Update the TSENS sensor read logic to implement atomic sampling along
> with the recommended retry-and-compare fallback behavior. This removes
> the race window and ensures deterministic temperature values in
> accordance with hardware requirements.
>
> Signed-off-by: Priyansh Jain <priyansh.jain@oss.qualcomm.com>
> ---
> drivers/thermal/qcom/tsens-v1.c | 6 +-
> drivers/thermal/qcom/tsens-v2.c | 6 +-
> drivers/thermal/qcom/tsens.c | 118 +++++++++++++++++++++-----------
> drivers/thermal/qcom/tsens.h | 22 ++----
> 4 files changed, 91 insertions(+), 61 deletions(-)
>
> diff --git a/drivers/thermal/qcom/tsens-v1.c b/drivers/thermal/qcom/tsens-v1.c
> index faa5d00788ca..2e0a01348c48 100644
> --- a/drivers/thermal/qcom/tsens-v1.c
> +++ b/drivers/thermal/qcom/tsens-v1.c
> @@ -77,6 +77,9 @@ static struct tsens_features tsens_v1_feat = {
> .max_sensors = 11,
> .trip_min_temp = -40000,
> .trip_max_temp = 120000,
> + .valid_bit = BIT(14),
> + .last_temp_mask = 0x3FF,
This is GENMASK(9, 0)
> + .last_temp_resolution = 9,
Please keep a single source of truth (SSOT): compute the mask in the
init function with:
	->last_temp_mask = GENMASK(9, 0);
and remove the initialization here.
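To illustrate the idea, here is a minimal userspace sketch. It assumes
that last_temp_resolution is the index of the top LAST_TEMP bit, which
is only inferred from the values in the patch (9 with 0x3FF, 11 with
0xFFF). GENMASK_U32, struct tsens_features_sketch and
tsens_sketch_init_mask() are hypothetical names standing in for the
kernel's GENMASK() and the driver's own structures.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l) on 32-bit values. */
#define GENMASK_U32(h, l) \
	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Hypothetical, trimmed-down version of struct tsens_features. */
struct tsens_features_sketch {
	unsigned int last_temp_resolution;	/* assumed: index of the top LAST_TEMP bit */
	uint32_t last_temp_mask;		/* derived at init, never hand-written */
};

/* Hypothetical init-time helper: derive the mask from the resolution. */
static void tsens_sketch_init_mask(struct tsens_features_sketch *feat)
{
	feat->last_temp_mask = GENMASK_U32(feat->last_temp_resolution, 0);
}
```

With this shape, the per-version feature tables would only carry the
resolution, and the mask could not drift out of sync with it.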
> };
>
> static struct tsens_features tsens_v1_no_rpm_feat = {
> @@ -132,8 +135,7 @@ static const struct reg_field tsens_v1_regfields[MAX_REGFIELDS] = {
> /* NO CRITICAL INTERRUPT SUPPORT on v1 */
>
> /* Sn_STATUS */
> - REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 9),
> - REG_FIELD_FOR_EACH_SENSOR11(VALID, TM_Sn_STATUS_OFF, 14, 14),
> + REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 14),
> /* xxx_STATUS bits: 1 == threshold violated */
> REG_FIELD_FOR_EACH_SENSOR11(MIN_STATUS, TM_Sn_STATUS_OFF, 10, 10),
> REG_FIELD_FOR_EACH_SENSOR11(LOWER_STATUS, TM_Sn_STATUS_OFF, 11, 11),
> diff --git a/drivers/thermal/qcom/tsens-v2.c b/drivers/thermal/qcom/tsens-v2.c
> index 8d9698ea3ec4..814147735ba5 100644
> --- a/drivers/thermal/qcom/tsens-v2.c
> +++ b/drivers/thermal/qcom/tsens-v2.c
> @@ -56,6 +56,9 @@ static struct tsens_features tsens_v2_feat = {
> .max_sensors = 16,
> .trip_min_temp = -40000,
> .trip_max_temp = 120000,
> + .valid_bit = BIT(21),
> + .last_temp_mask = 0xFFF,
> + .last_temp_resolution = 11,
Ditto
> };
>
> static struct tsens_features ipq8074_feat = {
> @@ -125,8 +128,7 @@ static const struct reg_field tsens_v2_regfields[MAX_REGFIELDS] = {
> [WDOG_BARK_COUNT] = REG_FIELD(TM_WDOG_LOG_OFF, 0, 7),
>
> /* Sn_STATUS */
> - REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 11),
> - REG_FIELD_FOR_EACH_SENSOR16(VALID, TM_Sn_STATUS_OFF, 21, 21),
> + REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 21),
> /* xxx_STATUS bits: 1 == threshold violated */
> REG_FIELD_FOR_EACH_SENSOR16(MIN_STATUS, TM_Sn_STATUS_OFF, 16, 16),
> REG_FIELD_FOR_EACH_SENSOR16(LOWER_STATUS, TM_Sn_STATUS_OFF, 17, 17),
> diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
> index a2422ebee816..15392a17ef41 100644
> --- a/drivers/thermal/qcom/tsens.c
> +++ b/drivers/thermal/qcom/tsens.c
> @@ -315,10 +315,66 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
> return degc;
> }
>
> +static inline enum tsens_ver tsens_version(struct tsens_priv *priv)
> +{
> + return priv->feat->ver_major;
> +}
I agree that adding accessor functions is good practice, but here it
results in duplicating the function, so the benefit is debatable.
> +/**
> + * tsens_read_temp - To read temperature from hw in deciCelsius.
> + * @s: Pointer to sensor struct
> + * @field: Index into regmap_field array pointing to temperature data
> + * @temp: temperature in deciCelsius to be read from hardware
> + *
> + * This function handles temperature returned in ADC code or deciCelsius
> + * depending on IP version.
> + *
> + * Return: 0 on success, a negative errno will be returned in error cases
> + */
> +static int tsens_read_temp(const struct tsens_sensor *s, int field, int *temp)
> +{
> + struct tsens_priv *priv = s->priv;
> + int temp_val[3] = {0};
> + unsigned int status = 0;
> + int ret = 0, i;
> + int max_retry = 3;
Please avoid literals. Add a macro for the maximum number of retries. As
the value 3 is not arbitrary but documented, add a short comment saying
that it is a hardware requirement.
> + ret = regmap_field_read(priv->rf[field], &status);
> + if (ret)
> + return ret;
> +
> + /* VER_0 doesn't have VALID bit */
> + if (tsens_version(priv) == VER_0) {
> + *temp = status;
> + return ret;
> + }
Please use a callback for v0 and v1. Set it at probe time, so the
version does not have to be checked at every read.
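A rough userspace sketch of that pattern follows. All names here
(struct sketch_priv, sketch_probe(), the two read routines) are
hypothetical and not taken from the driver; the split is made on
"has a VALID bit or not", which is an assumption about how the
callbacks would be divided.

```c
#include <errno.h>
#include <stdint.h>

struct sketch_priv;

/* Per-version read routine, chosen once instead of on every read. */
typedef int (*sketch_read_fn)(const struct sketch_priv *priv,
			      uint32_t status, int *temp);

/* Hypothetical, trimmed-down private data. */
struct sketch_priv {
	int has_valid_bit;	/* 0 for hardware without a VALID bit */
	uint32_t valid_bit;
	uint32_t temp_mask;
	sketch_read_fn read_temp;
};

/* No VALID bit: the raw status is the temperature code. */
static int sketch_read_no_valid(const struct sketch_priv *priv,
				uint32_t status, int *temp)
{
	(void)priv;
	*temp = (int)status;
	return 0;
}

/* VALID bit present: accept the sample only if it is flagged valid. */
static int sketch_read_with_valid(const struct sketch_priv *priv,
				  uint32_t status, int *temp)
{
	if (!(status & priv->valid_bit))
		return -EAGAIN;
	*temp = (int)(status & priv->temp_mask);
	return 0;
}

/* Probe-time selection: the version check happens exactly once. */
static void sketch_probe(struct sketch_priv *priv)
{
	priv->read_temp = priv->has_valid_bit ? sketch_read_with_valid
					      : sketch_read_no_valid;
}
```

The read path then simply calls priv->read_temp() and contains no
version test at all.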
> + for (i = 0; i < max_retry; i++) {
> + temp_val[i] = status & priv->feat->last_temp_mask;
> + if (status & priv->feat->valid_bit) {
> + *temp = temp_val[i];
> + return ret;
> + }
> + ret = regmap_field_read(priv->rf[field], &status);
> + if (ret)
> + return ret;
It looks like more than max_retry reads are happening: one before the
loop, then three inside the loop, so four in total.
> + }
> +
> + if (temp_val[0] == temp_val[1])
> + *temp = temp_val[1];
> + else if (temp_val[1] == temp_val[2])
> + *temp = temp_val[2];
> + else
> + return -EAGAIN;
We have a, b and c.
if a == b, then return b
else if b == c, then return c
else return -EAGAIN
In other words, we accept two consecutive reads that return the same
value. IMO that could be simplified to:
	int prev = INT_MAX;	/* never equals a masked sample */

	/*
	 * An explanation ...
	 */
	for (i = 0; i < max_retry; i++) {
		int value, valid;

		ret = regmap_field_read(priv->rf[field], &status);
		if (ret)
			return ret;

		/* The masks are runtime values, so FIELD_GET() cannot be used */
		value = status & priv->feat->last_temp_mask;
		valid = !!(status & priv->feat->valid_bit);

		/* Accept a valid sample, or two identical consecutive ones */
		if (valid || value == prev) {
			*temp = value;
			return 0;
		}

		prev = value;
	}

	return -EAGAIN;
(Not tested)
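For reference, the same loop can be exercised in userspace with a fake
status register. This is only a sketch under stated assumptions: the
names (sketch_reg, sketch_reg_read(), sketch_read_temp(),
SKETCH_MAX_READS) are hypothetical, the bit layout mimics the v1
values from the patch (valid bit 14, 10-bit temperature), and the
register is replaced by a replayed list of samples.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware guideline quoted in the commit message: at most three reads. */
#define SKETCH_MAX_READS	3

#define SKETCH_VALID_BIT	(UINT32_C(1) << 14)
#define SKETCH_TEMP_MASK	UINT32_C(0x3FF)

/* Stand-in for the status register: replays a fixed list of samples. */
struct sketch_reg {
	const uint32_t *samples;
	size_t count;
	size_t pos;
};

static int sketch_reg_read(struct sketch_reg *reg, uint32_t *status)
{
	if (reg->pos >= reg->count)
		return -EIO;
	*status = reg->samples[reg->pos++];
	return 0;
}

/*
 * Temperature and valid bit come from the same sample. A sample is
 * accepted if it is flagged valid, or if it repeats the previous one.
 */
static int sketch_read_temp(struct sketch_reg *reg, int *temp)
{
	int prev = INT_MAX;	/* never equals a masked sample */
	int i;

	for (i = 0; i < SKETCH_MAX_READS; i++) {
		uint32_t status;
		int value, ret;

		ret = sketch_reg_read(reg, &status);
		if (ret)
			return ret;

		value = (int)(status & SKETCH_TEMP_MASK);
		if ((status & SKETCH_VALID_BIT) || value == prev) {
			*temp = value;
			return 0;
		}
		prev = value;
	}

	return -EAGAIN;
}
```

Feeding it the samples { 0x100, 0x101, 0x101 } (none valid) yields
0x101 after exactly three reads, while three different invalid samples
end in -EAGAIN, again after three reads and not four.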