public inbox for linux-clk@vger.kernel.org
 help / color / mirror / Atom feed
From: Claudiu Beznea <claudiu.beznea@tuxon.dev>
To: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: rafael@kernel.org, rui.zhang@intel.com, lukasz.luba@arm.com,
	robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	geert+renesas@glider.be, magnus.damm@gmail.com,
	mturquette@baylibre.com, sboyd@kernel.org,
	p.zabel@pengutronix.de, ulf.hansson@linaro.org,
	linux-pm@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-renesas-soc@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-clk@vger.kernel.org,
	Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Subject: Re: [PATCH 2/6] thermal: of: Export non-devres helper to register/unregister thermal zone
Date: Fri, 31 Jan 2025 01:16:21 +0200	[thread overview]
Message-ID: <e7374f91-a65d-4882-8a9b-de478582e09e@tuxon.dev> (raw)
In-Reply-To: <7d1bf72b-183a-429d-9a0c-10e1936a9abe@linaro.org>

Hi, Daniel,

On 31.01.2025 00:33, Daniel Lezcano wrote:
> On 30/01/2025 21:53, Claudiu Beznea wrote:
>> Hi, Daniel,
>>
>> On 30.01.2025 19:24, Daniel Lezcano wrote:
>>> On 30/01/2025 11:30, Claudiu Beznea wrote:
>>>>
>>>>
>>>> On 30.01.2025 12:07, Daniel Lezcano wrote:
>>>>> On Thu, Jan 30, 2025 at 11:08:03AM +0200, Claudiu Beznea wrote:
>>>>>> Hi, Daniel,
>>>
>>> [ ... ]
>>>
>>>>>>> Would the IP need some cycles to capture the temperature accurately
>>>>>>> after the
>>>>>>> clock is enabled ?
>>>>>>
>>>>>> There is nothing about this mentioned about this in the HW manual of the
>>>>>> RZ/G3S SoC. The only points mentioned are as described in the driver
>>>>>> code:
>>>>>> - wait at least 3us after each IIO channel read
>>>>>> - wait at least 30us after enabling the sensor
>>>>>> - wait at least 50us after setting OE bit in TSU_SM
>>>>>>
>>>>>> For this I chose to have it implemented as proposed.
>>>>>
>>>>> IMO, disabling/enabling the clock between two reads through the pm
>>>>> runtime may
>>>>> not be a good thing, especially if the system enters a thermal situation
>>>>> where
>>>>> it has to mitigate.
>>>>>
>>>>> Without any testing capturing the temperatures and compare between the
>>>>> always-on
>>>>> and on/off, it is hard to say if it is true or not. Up to you to test
>>>>> that or
>>>>> not. If you think it is fine, then let's go with it.
>>>>
>>>> I tested it with and w/o the runtime PM and on/off support (so, everything
>>>> ON from the probe) and the reported temperature values were similar.
>>>
>>>
>>> Did you remove the roundup to 0.5°C ?
>>
>> I did the testing as suggested and, this time, collected results and
>> compared side by side. I read the temperature for 10 minutes, 60 seconds
>> after the Linux prompt showed up. There is, indeed, a slight difference b/w
>> the 2 cases.
>>
>> When the runtime PM doesn't touch the clocks on read the reported
>> temperature varies b/w 53-54 degrees while when the runtime PM
>> enables/disables the clocks a single read reported 55 degrees, the rest
>> reported 54 degrees.
>>
>> I plotted the results side by side here:
>> https://i2.paste.pics/f07eaeddc2ccc3c6695fe5056b52f4a2.png?
>> trs=0a0eaab99bb59ebcb10051eb298f437c7cd50c16437a87392aebc16cd9013e18&rand=vWXm2VTrbt
>>
>> Please let me know how do you consider it.
> 

After sending this to you I figured it out that precision is lost somewhere
so I re-tested it with the following diff (multiplied parts of the equation
with 1000):

diff --git a/drivers/thermal/renesas/rzg3s_thermal.c
b/drivers/thermal/renesas/rzg3s_thermal.c
index 6719f9ca05eb..84e18ff69d7c 100644
--- a/drivers/thermal/renesas/rzg3s_thermal.c
+++ b/drivers/thermal/renesas/rzg3s_thermal.c
@@ -83,7 +83,7 @@ static int rzg3s_thermal_get_temp(struct
thermal_zone_device *tz, int *temp)
        }

        ret = 0;
-       ts_code_ave = DIV_ROUND_CLOSEST(ts_code_ave, TSU_READ_STEPS);
+       ts_code_ave = DIV_ROUND_CLOSEST(MCELSIUS(ts_code_ave), TSU_READ_STEPS);

        /*
         * According to the HW manual (section 40.4.4 Procedure for
Measuring the Temperature)
@@ -91,11 +91,8 @@ static int rzg3s_thermal_get_temp(struct
thermal_zone_device *tz, int *temp)
         *
         * Tj = (ts_code_ave - priv->calib0) * 165 / (priv->calib0 -
priv->calib1) - 40
         */
-       *temp = DIV_ROUND_CLOSEST((ts_code_ave - priv->calib1) * 165,
-                                 (priv->calib0 - priv->calib1)) - 40;
-
-       /* Report it in mili degrees Celsius and round it up to 0.5 degrees
Celsius. */
-       *temp = roundup(MCELSIUS(*temp), 500);
+       *temp = DIV_ROUND_CLOSEST((u64)(ts_code_ave -
MCELSIUS(priv->calib1)) * MCELSIUS(165),
+                                 MCELSIUS(priv->calib0 - priv->calib1)) -
MCELSIUS(40);

 rpm_put:
        pm_runtime_mark_last_busy(dev);

With this, the results seems similar b/w runtime PM and no runtime PM cases.

The tests were executed after the board was off for few hours. The
first test was with runtime PM suspend/resume on each read. Then the board
was rebooted and re-run the test w/o runtime PM suspend/resume on reads.

Figure with results is here:
https://i2.paste.pics/5f353a4f04b07b4bead3086624aba23f.png?trs=0a0eaab99bb59ebcb10051eb298f437c7cd50c16437a87392aebc16cd9013e18&rand=5n34QNjWID


> Thanks for taking the time to provide a figure
> 
> Testing thermal can be painful because it should be done under certain
> conditions.
> 
> I guess there was no particular work load on the system when running the
> tests.

No load, indeed.

> 
> At the first glance, it seems, without the pm runtime, the measurement is
> more precise as it catches more thermal changes. But the test does not give
> information about the thermal behavior under stress. And one second
> sampling is too long to really figure it out.
> 
> In the kernel source tree, there is a tool to read the temperature in an
> optimized manner, you may want to use it to read the temperature at a
> higher rate. It is located in tools/thermal/thermometer
> 
> Compiling is a bit fuzzy ATM, so until it is fixed, here are the steps:
> 
> (you should install libconfig-dev and libnl-3-dev packages).
> 
> cd $LINUX_DIR/tools/thermal/lib
> make
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$LINUX_DIR/tools/thermal/lib
> 
> cd $LINUX_DIR/tools
> make thermometer
> 
> 
> 
> Then change directory:
> 
> cd $LINUX_DIR/tools/thermal/thermometer
> 
> 
> Run the tool:
> 
> ./thermometer -o out -c t.conf -l DEBUG -- <my_command>
> 
> 
> The content of the configuration file t.conf is:
> 
> thermal-zones = (
>           {    name = "cpu[0_9].*-thermal";
>         polling = 100; }
>       )
> 
> All the captured data will be in the 'out' directory
> 
> For 'my_command', I suggest to use a script containing:
> 
> sleep 10; dhrystone -t 1 -r 120; sleep 10
> 
> If you need the dhrystone binary, let me know.
> 
> The thermal zone device tree configuration should be changed to use a 65°C
> passive trip point instead of 100°C (and the kernel setup with the step
> wise governor as default).
> 
> The resulting figure from the temperature should show a flat temperature
> figure during 10 seconds, then the temperature increasing until reaching
> the temperature threshold of 65°C, the temperature stabilizing around it,
> then followed by a temperature decreasing when the test finishes.
> 
> If the temperature does not reach the limit, decrease the trip point
> temperature or increase the dhrystone duration (the -r 120 option)
> 
> At this point, you should the test with and without pm runtime but in order
> to have consistent results, you should wait ~20 minutes between two tests.
> 
> The shape of the figures will give the immediate information about how the
> mitigation vs thermal sensor vs cooling device behave.
> 
> Additionally, you can enable the thermal DEBUGFS option and add the
> collected information statistics from /sys/kernel/debug/thermal/*** in the
> results.
> 
> 
> Hope that helps

Thank you for all these details. I'll have a look on it but starting with
Monday as I won't have access to setup in the following days.

Thank you,
Claudiu

> 
> 
> 
> 
> 
> 


  reply	other threads:[~2025-01-30 23:16 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-03 16:37 [PATCH 0/6] thermal: renesas: Add support for RZ/G3S Claudiu
2025-01-03 16:38 ` [PATCH 1/6] clk: renesas: r9a08g045: Add clocks, resets and power domain support for the TSU IP Claudiu
2025-01-10 14:16   ` Geert Uytterhoeven
2025-01-03 16:38 ` [PATCH 2/6] thermal: of: Export non-devres helper to register/unregister thermal zone Claudiu
2025-01-09 17:34   ` Daniel Lezcano
2025-01-15 15:42     ` Ulf Hansson
2025-02-04 14:33       ` Jonathan Cameron
2025-02-05  8:33         ` Claudiu Beznea
2025-02-05 11:39           ` Jonathan Cameron
2025-01-29 17:24   ` Daniel Lezcano
2025-01-30  9:08     ` Claudiu Beznea
2025-01-30 10:07       ` Daniel Lezcano
2025-01-30 10:30         ` Claudiu Beznea
2025-01-30 10:50           ` Claudiu Beznea
2025-01-30 17:24           ` Daniel Lezcano
2025-01-30 17:32             ` Claudiu Beznea
2025-01-30 20:53             ` Claudiu Beznea
2025-01-30 22:33               ` Daniel Lezcano
2025-01-30 23:16                 ` Claudiu Beznea [this message]
2025-02-03 16:30                   ` Claudiu Beznea
2025-01-30 10:33         ` Biju Das
2025-01-30 17:31           ` Daniel Lezcano
2025-01-30 17:47             ` Biju Das
2025-01-03 16:38 ` [PATCH 3/6] dt-bindings: thermal: r9a08g045-tsu: Document the TSU unit Claudiu
2025-01-08 23:43   ` Rob Herring (Arm)
2025-01-22 10:12   ` Geert Uytterhoeven
2025-01-03 16:38 ` [PATCH 4/6] thermal: renesas: rzg3s: Add thermal driver for the Renesas RZ/G3S SoC Claudiu
2025-01-22 10:29   ` Geert Uytterhoeven
2025-01-25 12:18     ` Jonathan Cameron
2025-01-27  8:32       ` Claudiu Beznea
2025-01-27  8:54         ` Geert Uytterhoeven
2025-01-27  9:11           ` Biju Das
2025-01-27  9:15             ` Claudiu Beznea
2025-01-27  9:17               ` Biju Das
2025-01-03 16:38 ` [PATCH 5/6] arm64: dts: renesas: r9a08g045: Add TSU node Claudiu
2025-01-22 10:31   ` Geert Uytterhoeven
2025-01-30 17:53   ` Daniel Lezcano
2025-01-03 16:38 ` [PATCH 6/6] arm64: defconfig: Enable RZ/G3S thermal Claudiu
2025-01-22 10:33   ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e7374f91-a65d-4882-8a9b-de478582e09e@tuxon.dev \
    --to=claudiu.beznea@tuxon.dev \
    --cc=claudiu.beznea.uj@bp.renesas.com \
    --cc=conor+dt@kernel.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=devicetree@vger.kernel.org \
    --cc=geert+renesas@glider.be \
    --cc=krzk+dt@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-clk@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=lukasz.luba@arm.com \
    --cc=magnus.damm@gmail.com \
    --cc=mturquette@baylibre.com \
    --cc=p.zabel@pengutronix.de \
    --cc=rafael@kernel.org \
    --cc=robh@kernel.org \
    --cc=rui.zhang@intel.com \
    --cc=sboyd@kernel.org \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox