public inbox for ltp@lists.linux.it
 help / color / mirror / Atom feed
From: Petr Vorel <pvorel@suse.cz>
To: "Wysocki, Rafael J" <rafael.j.wysocki@intel.com>
Cc: "Dubel, Helena Anna" <helena.anna.dubel@intel.com>,
	"Ossowski, Tomasz" <tomasz.ossowski@intel.com>,
	"Niestepski, Daniel" <daniel.niestepski@intel.com>,
	"ltp@lists.linux.it" <ltp@lists.linux.it>
Subject: Re: [LTP] [PATCH v3] thermal: add new test group
Date: Fri, 30 Jan 2026 00:24:54 +0100	[thread overview]
Message-ID: <20260129232454.GA110393@pevik> (raw)
In-Reply-To: <79fc9cb3-ea1f-40ab-ba6f-e7a7dd4f23eb@intel.com>


> On 1/23/2026 7:28 PM, Petr Vorel wrote:
> > Hi Piotr,

> > > > > Then it
> > > > > + * decreases the threshold for sending a thermal interrupt to just
> > > > > above
> > > > > + * the current temperature and runs a workload on the CPU.
> > > > First, why test needs to run for 30 sec and then sleep for 10 sec?
> > Maybe the most important of my questions / points.

> > > Here the point is to use a decreasing timeout. The test starts with 10s
> > > cooldown to make sure that even pre-production CPU's, which might have
> > > their thermal protections disabled, cool down properly. Once sleep time
> > > reaches 0, the conclusion is that either there was not enough workload
> > > or somehow interrupts are not triggered after all.
> > Why 30 sec and then sleep for 10 sec? Is it really needed to do it this way?

> Of course not.


> > Aren't these times depending on the tested machine? Some of them will fail due
> > time not running enough,

> That's unexpected with the numbers that are used, so something is amiss if
> it fails (and so it should fail).

I tested on very old (~15 years) Thinkpad, quite powerful 3 years old Thinkpad
and some random machine kind of between these two. All detect the threshold with
less than 10% of time (heating CPU runtime 1s and sleep time 1s).

You go with 30s to smaller values. Wouldn't be faster to go the opposite
(start with small values and increase)? The diff below runs successfully on all
3 machines. What am I missing?

> > other will waste time (if they get interrupt e.g. in 10 sec).

> That very well may happen, but is it a big deal?

We try to cut down the test runtime, because LTP test collection is huge
and runtime for it is many hours [1]. For example, we have many CVE tests which
detect race condition. Instead of running each test for "safe long time enough"
which could be e.g. several minutes for many of them we have way to shorten the
time (see include/tst_fuzzy_sync.h).

[1] https://linux-test-project.readthedocs.io/en/latest/developers/ground_rules.html#why-is-sleep-in-tests-bad-then

> > The usual approach would be to have the timeout safe enough for any type
> > of hardware but proactively check the temperature and stop testing once it's
> > done.

> We want to create conditions in which the temperature should rise and if it
> doesn't, then there is a problem.

Sure.

> That said, the temperature can of course be checked more proactively, at
> least in principle, like say run cpu_workload() for 1s, check the
> temperature, repeat that several times, then cool down etc.

Yeah, that's kind of my proposal above.

Also, all of my 3 machines have only 1x x86_pkg_temp type, but I suppose there
are devices with more (I was not able to figure that out from
drivers/thermal/intel/x86_pkg_temp_thermal.c but otherwise the test would not
try to test them all). But why it is important to test them all? Isn't it enough
just to test a single one?

Kind regards,
Petr

> BR, Rafael


+++ testcases/kernel/thermal/thermal_interrupt_events.c
@@ -117,8 +117,8 @@ static void *cpu_workload(double run_time)
 static void test_zone(int i)
 {
 			char path[NAME_MAX], temp_path[NAME_MAX];
-			int sleep_time = SLEEP, temp_high, temp;
-			double run_time = RUNTIME;
+			int sleep_time = 1, temp_high, temp;
+			double run_time = 1;
 
 			snprintf(path, NAME_MAX, "/sys/class/thermal/thermal_zone%d/", i);
 			strncpy(temp_path, path, NAME_MAX);
@@ -138,7 +138,7 @@ static void test_zone(int i)
 			SAFE_FILE_SCANF(trip_path, "%d", &trip);
 			SAFE_FILE_PRINTF(trip_path, "%d", temp_high);
 
-			while (sleep_time > 0) {
+			while (sleep_time < SLEEP) {
 				tst_res(TDEBUG, "Running for %f seconds, then sleeping for %d seconds", run_time, sleep_time);
 
 				for (int j = 0; j < nproc; j++) {
@@ -155,8 +155,8 @@ static void test_zone(int i)
 
 				if (temp > temp_high)
 					break;
-				sleep(sleep_time--);
-				run_time -= 3;
+				sleep(sleep_time++);
+				run_time += 3;
 			}
 
 			if (temp <= temp_high)

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

  reply	other threads:[~2026-01-29 23:25 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-21 13:41 [LTP] [PATCH v3] thermal: add new test group Piotr Kubaj
2026-01-22 14:07 ` Petr Vorel
2026-01-23 13:12   ` Kubaj, Piotr
2026-01-23 18:28     ` Petr Vorel
2026-01-29 21:40       ` Wysocki, Rafael J
2026-01-29 23:24         ` Petr Vorel [this message]
2026-02-03 14:38           ` Wysocki, Rafael J
  -- strict thread matches above, loose matches on Subject: below --
2025-11-19 11:45 Piotr Kubaj
2025-11-20 16:30 ` Petr Vorel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260129232454.GA110393@pevik \
    --to=pvorel@suse.cz \
    --cc=daniel.niestepski@intel.com \
    --cc=helena.anna.dubel@intel.com \
    --cc=ltp@lists.linux.it \
    --cc=rafael.j.wysocki@intel.com \
    --cc=tomasz.ossowski@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox