[LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs

public inbox for ltp@lists.linux.it
 help / color / mirror / Atom feed

* [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
@ 2024-12-22  7:22 Li Wang
  2024-12-22  7:22 ` [LTP] [PATCH 2/2] starvation: skip test on slow kernel Li Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Li Wang @ 2024-12-22  7:22 UTC (permalink / raw)
  To: ltp

This refines the handling of timeouts for tests running on
systems with slow kernel configurations (kconfigs).

Previously, the max_runtime was multiplied globally when
slow kconfigs were detected, which inadvertently prolonged
the runtime of all tests using max_runtime for control.

This patch corrects that behavior by applying the multiplication
specifically to timeouts, ensuring it only affects the intended
operations without impacting other tests.

Fixes: 2da30df24 ("lib: multiply the max_runtime if detect slow kconfigs")
Signed-off-by: Li Wang <liwang@redhat.com>
---
 lib/tst_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/tst_test.c b/lib/tst_test.c
index 205fc8326..feffc9f86 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)
 
 	parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);
 
-	if (tst_has_slow_kconfig())
-		max_runtime *= 4;
-
 	return max_runtime * runtime_mul;
 }
 
@@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int timeout)
 	if (timeout < 1)
 		tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);
 
+	if (tst_has_slow_kconfig())
+		timeout *= 4;
+
 	return timeout * timeout_mul;
 }
 
-- 
2.47.1


-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [LTP] [PATCH 2/2] starvation: skip test on slow kernel
  2024-12-22  7:22 [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Li Wang
@ 2024-12-22  7:22 ` Li Wang
  2025-01-02 12:56   ` Petr Vorel
  2025-01-02 12:43 ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Petr Vorel
  2025-01-15 22:41 ` [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs) Petr Vorel
  2 siblings, 1 reply; 40+ messages in thread
From: Li Wang @ 2024-12-22  7:22 UTC (permalink / raw)
  To: ltp

Systems with slow kernel configurations may not meet
the performance requirements necessary for the starvation
test to produce valid results.

Skipping the test ensures that it runs only on systems
where its results are meaningful.

Signed-off-by: Li Wang <liwang@redhat.com>
---
 testcases/kernel/sched/cfs-scheduler/starvation.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/testcases/kernel/sched/cfs-scheduler/starvation.c b/testcases/kernel/sched/cfs-scheduler/starvation.c
index e707e0865..0fd53a0ca 100644
--- a/testcases/kernel/sched/cfs-scheduler/starvation.c
+++ b/testcases/kernel/sched/cfs-scheduler/starvation.c
@@ -21,6 +21,7 @@
 #include <sched.h>
 
 #include "tst_test.h"
+#include "tst_kconfig.h"
 #include "tst_safe_clocks.h"
 #include "tst_timer.h"
 
@@ -108,6 +109,9 @@ static void setup(void)
 	else
 		timeout = callibrate() / 1000;
 
+	if (tst_has_slow_kconfig())
+		tst_brk(TCONF, "Skip test due to slow kernel configuration");
+
 	tst_set_max_runtime(timeout);
 }
 
-- 
2.47.1


-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
  2024-12-22  7:22 [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Li Wang
  2024-12-22  7:22 ` [LTP] [PATCH 2/2] starvation: skip test on slow kernel Li Wang
@ 2025-01-02 12:43 ` Petr Vorel
  2025-01-03  5:00   ` Li Wang
  2025-01-15 22:41 ` [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs) Petr Vorel
  2 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-02 12:43 UTC (permalink / raw)
  To: Li Wang; +Cc: Martin Doucha, ltp

Hi Li, all,

[ Cc others to get broather feedback ]

> This refines the handling of timeouts for tests running on
> systems with slow kernel configurations (kconfigs).

> Previously, the max_runtime was multiplied globally when
> slow kconfigs were detected, which inadvertently prolonged
> the runtime of all tests using max_runtime for control.

> This patch corrects that behavior by applying the multiplication
> specifically to timeouts, ensuring it only affects the intended
> operations without impacting other tests.

> Fixes: 2da30df24 ("lib: multiply the max_runtime if detect slow kconfigs")

Thanks for handling this, I overlooked it on 27th, thus review it now.

Multiplying whole timeout instead of max_runtime helps to hide longer timeout
from the test which uses detection via tst_remaining_runtime(). I.e. previously
it behaved on slow config as LTP_RUNTIME_MUL=4, now as LTP_TIMEOUT_MUL=4.
Good idea. IMHO good enough (Martin previously suggested [1] to add a new
tst_test flag to identify tests which exit when runtime expires).

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Some measurements on my Tumbleweed VM, which is detected as slow due CONFIG_LATENCYTOP:

TEST                                              | 2da30df24~ | 2da30df24  | this patch
--------------------------------------------------|-------------------------------------
swapping01.c (calls tst_remaining_runtime())      | 0h 10m 30s | 0h 40m 30s | 0h 12m 00s
tst_fuzzy_sync01.c (calls tst_remaining_runtime())| 0h 03m 00s | 0h 10m 30s | 0h 04m 30s
tst_cgroup02.c (default timeout 0h 00m 30s)       | 0h 00m 30s | 0h 00m 30s | 0h 02m 00s
test_runtime01.c (.max_runtime = 4, calls         | 0h 00m 34s | 0h 00m 46s | 0h 02m 04s
tst_remaining_runtime())
starvation.c (calls tst_remaining_runtime() only  | 0h 01m 05s | 0h 02m 50s | 0h 02m 34s
to detect failure)                                |

=> Tests which call tst_remaining_runtime() runs slightly longer, but IMHO
that's OK. Other tests (regardless if with the default runtime or these which
set .max_runtime) run 4* longer as expected.

Tested-by: Petr Vorel <pvorel@suse.cz>

Kind regards,
Petr

[1] https://lore.kernel.org/ltp/b6da77f3-45d6-4eed-b2d3-90ad20c63e50@suse.cz/

> Signed-off-by: Li Wang <liwang@redhat.com>
> ---
>  lib/tst_test.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index 205fc8326..feffc9f86 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

>  	parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> -	if (tst_has_slow_kconfig())
> -		max_runtime *= 4;
> -
>  	return max_runtime * runtime_mul;
>  }

> @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int timeout)
>  	if (timeout < 1)
>  		tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> +	if (tst_has_slow_kconfig())
> +		timeout *= 4;
> +
>  	return timeout * timeout_mul;
>  }

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 2/2] starvation: skip test on slow kernel
  2024-12-22  7:22 ` [LTP] [PATCH 2/2] starvation: skip test on slow kernel Li Wang
@ 2025-01-02 12:56   ` Petr Vorel
  2025-01-02 14:31     ` Petr Vorel
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-02 12:56 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp

Hi Li,

> Systems with slow kernel configurations may not meet
> the performance requirements necessary for the starvation
> test to produce valid results.

> Skipping the test ensures that it runs only on systems
> where its results are meaningful.
...
> +	if (tst_has_slow_kconfig())
> +		tst_brk(TCONF, "Skip test due to slow kernel configuration");
> +

Reviewed-by: Petr Vorel <pvorel@suse.cz>

I suppose you have strong reason for this (e.g. it's failing on some slow
machine). I'm testing it on emulated riscv64 to see whether it's needed to be
skipped. Because due CONFIG_LATENCYTOP it will always be skipped on Tumbleweed.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 2/2] starvation: skip test on slow kernel
  2025-01-02 12:56   ` Petr Vorel
@ 2025-01-02 14:31     ` Petr Vorel
  2025-01-03  7:53       ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-02 14:31 UTC (permalink / raw)
  To: Li Wang, ltp

> Hi Li,

> > Systems with slow kernel configurations may not meet
> > the performance requirements necessary for the starvation
> > test to produce valid results.

> > Skipping the test ensures that it runs only on systems
> > where its results are meaningful.
> ...
> > +	if (tst_has_slow_kconfig())
> > +		tst_brk(TCONF, "Skip test due to slow kernel configuration");
> > +

> Reviewed-by: Petr Vorel <pvorel@suse.cz>

> I suppose you have strong reason for this (e.g. it's failing on some slow
> machine). I'm testing it on emulated riscv64 to see whether it's needed to be
> skipped. Because due CONFIG_LATENCYTOP it will always be skipped on Tumbleweed.

OK, I was able to reproduce TFAIL on emulated riscv64, which runs ~ 4m 30s, with
timeout increased to 0h 05m 46s:

tst_tmpdir.c:316: TINFO: Using /tmp/LTP_sta3UIB6E as tmpdir (tmpfs filesystem)
tst_test.c:1893: TINFO: LTP version: 20240930
tst_test.c:1897: TINFO: Tested kernel: 6.12.6-1-default #1 SMP PREEMPT_DYNAMIC Thu Dec 19 17:23:25 UTC 2024 (fb072de) riscv64
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which might slow the execution
tst_test.c:1728: TINFO: Timeout per run is 0h 02m 00s
starvation.c:98: TINFO: Setting affinity to CPU 0
starvation.c:52: TINFO: CPU did 120000000 loops in 226566us
tst_test.c:1736: TINFO: Updating max runtime to 0h 03m 46s
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which might slow the execution
tst_test.c:1728: TINFO: Timeout per run is 0h 05m 46s
starvation.c:148: TFAIL: Scheduller starvation reproduced.

But it works on emulated x86_64.

tst_tmpdir.c:316: TINFO: Using /tmp/LTP_staIt0g73 as tmpdir (tmpfs filesystem)
tst_test.c:1893: TINFO: LTP version: 20240930-148-g4539bfbc7
tst_test.c:1897: TINFO: Tested kernel: 6.13.0-rc4-1.g4a5c6a6-default #1 SMP PREEMPT_DYNAMIC Sun Dec 22 22:11:35 UTC 2024 (4a5c6a6) x86_64
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which might slow the execution
tst_test.c:1728: TINFO: Timeout per run is 0h 02m 00s
starvation.c:98: TINFO: Setting affinity to CPU 0
starvation.c:52: TINFO: CPU did 120000000 loops in 34189us
tst_test.c:1736: TINFO: Updating max runtime to 0h 00m 34s
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which might slow the execution
tst_test.c:1728: TINFO: Timeout per run is 0h 02m 34s
starvation.c:150: TPASS: Haven't reproduced scheduller starvation.

I wonder if we can quit with TCONF only on certain runtime increase. We would
also need to take -l (number of loops into an account).

Kind regards,
Petr

> Kind regards,
> Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
  2025-01-02 12:43 ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Petr Vorel
@ 2025-01-03  5:00   ` Li Wang
  2025-01-03  7:06     ` Petr Vorel
  2025-01-03 15:48     ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD Cyril Hrubis
  0 siblings, 2 replies; 40+ messages in thread
From: Li Wang @ 2025-01-03  5:00 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

On Thu, Jan 2, 2025 at 8:43 PM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li, all,
>
> [ Cc others to get broather feedback ]
>
> > This refines the handling of timeouts for tests running on
> > systems with slow kernel configurations (kconfigs).
>
> > Previously, the max_runtime was multiplied globally when
> > slow kconfigs were detected, which inadvertently prolonged
> > the runtime of all tests using max_runtime for control.
>
> > This patch corrects that behavior by applying the multiplication
> > specifically to timeouts, ensuring it only affects the intended
> > operations without impacting other tests.
>
> > Fixes: 2da30df24 ("lib: multiply the max_runtime if detect slow
> kconfigs")
>
> Thanks for handling this, I overlooked it on 27th, thus review it now.
>
> Multiplying whole timeout instead of max_runtime helps to hide longer
> timeout
> from the test which uses detection via tst_remaining_runtime(). I.e.
> previously
> it behaved on slow config as LTP_RUNTIME_MUL=4, now as LTP_TIMEOUT_MUL=4.
>

Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
to avoid increasing the actual execution time of the test, but also to give
 the system more time to wait for the test to complete the final work.

Original:
  |  -- timeout -- | -- max_runtime -- |

Previous:
  |  -- timeout -- | -------- max_runtime * 4 -------- |

Now:
  |  -------- timeout * 4 -------- | -- max_runtime --  |




> Good idea. IMHO good enough (Martin previously suggested [1] to add a new
> tst_test flag to identify tests which exit when runtime expires).
>

Introduce a new tst_test flag to split the max_runtime into two parts can
resolve it as well, but the disadvantage might make people hard to
understand the LTP time controlling, if go with timeout, max_runtime,
max_exetime
I think 'simple+uselful' is beautiful unless we need to complex it in the
future.



>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>
>
> Some measurements on my Tumbleweed VM, which is detected as slow due
> CONFIG_LATENCYTOP:
>
> TEST                                              | 2da30df24~ |
> 2da30df24  | this patch
>
> --------------------------------------------------|-------------------------------------
> swapping01.c (calls tst_remaining_runtime())      | 0h 10m 30s | 0h 40m
> 30s | 0h 12m 00s
> tst_fuzzy_sync01.c (calls tst_remaining_runtime())| 0h 03m 00s | 0h 10m
> 30s | 0h 04m 30s
> tst_cgroup02.c (default timeout 0h 00m 30s)       | 0h 00m 30s | 0h 00m
> 30s | 0h 02m 00s
> test_runtime01.c (.max_runtime = 4, calls         | 0h 00m 34s | 0h 00m
> 46s | 0h 02m 04s
> tst_remaining_runtime())
> starvation.c (calls tst_remaining_runtime() only  | 0h 01m 05s | 0h 02m
> 50s | 0h 02m 34s
> to detect failure)                                |
>
> => Tests which call tst_remaining_runtime() runs slightly longer, but IMHO
> that's OK. Other tests (regardless if with the default runtime or these
> which
> set .max_runtime) run 4* longer as expected.
>
> Tested-by: Petr Vorel <pvorel@suse.cz>
>

The longer time is not because call tst_remaining_runtime(), it just comes
from
the ' timeout *= 4' while detecting slow configs, as you can see, the
original default
timeout is 30s, and with multiple 4 it become 120s (which is +2mins), all
the test with
this patch shows that 2 more mins there.

But, that does not mean the test executing time is really adding 2 minutes,
it just
means having that timeout value. We need to use `time ./swapping01` to
evaluate the real test time, and I didn't find any more delay with this
method :).

Thanks for the comparison, actually I did some tests for RHEL and got a
good result.


> Kind regards,
> Petr
>
> [1]
> https://lore.kernel.org/ltp/b6da77f3-45d6-4eed-b2d3-90ad20c63e50@suse.cz/
>
> > Signed-off-by: Li Wang <liwang@redhat.com>
> > ---
> >  lib/tst_test.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
>
> > diff --git a/lib/tst_test.c b/lib/tst_test.c
> > index 205fc8326..feffc9f86 100644
> > --- a/lib/tst_test.c
> > +++ b/lib/tst_test.c
> > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)
>
> >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);
>
> > -     if (tst_has_slow_kconfig())
> > -             max_runtime *= 4;
> > -
> >       return max_runtime * runtime_mul;
> >  }
>
> > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> timeout)
> >       if (timeout < 1)
> >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);
>
> > +     if (tst_has_slow_kconfig())
> > +             timeout *= 4;
> > +
> >       return timeout * timeout_mul;
> >  }
>
>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
  2025-01-03  5:00   ` Li Wang
@ 2025-01-03  7:06     ` Petr Vorel
  2025-01-03  7:33       ` Petr Vorel
  2025-01-03 15:48     ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD Cyril Hrubis
  1 sibling, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-03  7:06 UTC (permalink / raw)
  To: Li Wang; +Cc: Martin Doucha, ltp

> On Thu, Jan 2, 2025 at 8:43 PM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li, all,

> > [ Cc others to get broather feedback ]

> > > This refines the handling of timeouts for tests running on
> > > systems with slow kernel configurations (kconfigs).

> > > Previously, the max_runtime was multiplied globally when
> > > slow kconfigs were detected, which inadvertently prolonged
> > > the runtime of all tests using max_runtime for control.

> > > This patch corrects that behavior by applying the multiplication
> > > specifically to timeouts, ensuring it only affects the intended
> > > operations without impacting other tests.

> > > Fixes: 2da30df24 ("lib: multiply the max_runtime if detect slow
> > kconfigs")

> > Thanks for handling this, I overlooked it on 27th, thus review it now.

> > Multiplying whole timeout instead of max_runtime helps to hide longer
> > timeout
> > from the test which uses detection via tst_remaining_runtime(). I.e.
> > previously
> > it behaved on slow config as LTP_RUNTIME_MUL=4, now as LTP_TIMEOUT_MUL=4.


> Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
> to avoid increasing the actual execution time of the test, but also to give
>  the system more time to wait for the test to complete the final work.

> Original:
>   |  -- timeout -- | -- max_runtime -- |

> Previous:
>   |  -- timeout -- | -------- max_runtime * 4 -------- |

> Now:
>   |  -------- timeout * 4 -------- | -- max_runtime --  |

Later it'd be nice to document this simple timeline (also with LTP_RUNTIME_MUL
and LTP_TIMEOUT_MUL) in sphinx docs (/** */). Or, it could be in lib/README.md,
but I would like to convert also this page to sphinx.

> > Good idea. IMHO good enough (Martin previously suggested [1] to add a new
> > tst_test flag to identify tests which exit when runtime expires).


> Introduce a new tst_test flag to split the max_runtime into two parts can
> resolve it as well, but the disadvantage might make people hard to
> understand the LTP time controlling, if go with timeout, max_runtime,
> max_exetime
> I think 'simple+uselful' is beautiful unless we need to complex it in the
> future.

+1, I fully agree.

> > Reviewed-by: Petr Vorel <pvorel@suse.cz>

> > Some measurements on my Tumbleweed VM, which is detected as slow due
> > CONFIG_LATENCYTOP:

> > TEST                                              | 2da30df24~ |
> > 2da30df24  | this patch

> > --------------------------------------------------|-------------------------------------
> > swapping01.c (calls tst_remaining_runtime())      | 0h 10m 30s | 0h 40m
> > 30s | 0h 12m 00s
> > tst_fuzzy_sync01.c (calls tst_remaining_runtime())| 0h 03m 00s | 0h 10m
> > 30s | 0h 04m 30s
> > tst_cgroup02.c (default timeout 0h 00m 30s)       | 0h 00m 30s | 0h 00m
> > 30s | 0h 02m 00s
> > test_runtime01.c (.max_runtime = 4, calls         | 0h 00m 34s | 0h 00m
> > 46s | 0h 02m 04s
> > tst_remaining_runtime())
> > starvation.c (calls tst_remaining_runtime() only  | 0h 01m 05s | 0h 02m
> > 50s | 0h 02m 34s
> > to detect failure)                                |

> > => Tests which call tst_remaining_runtime() runs slightly longer, but IMHO
> > that's OK. Other tests (regardless if with the default runtime or these
> > which
> > set .max_runtime) run 4* longer as expected.

> > Tested-by: Petr Vorel <pvorel@suse.cz>


> The longer time is not because call tst_remaining_runtime(), it just comes
> from
> the ' timeout *= 4' while detecting slow configs, as you can see, the
> original default
> timeout is 30s, and with multiple 4 it become 120s (which is +2mins), all
> the test with
> this patch shows that 2 more mins there.

> But, that does not mean the test executing time is really adding 2 minutes,
> it just
> means having that timeout value. We need to use `time ./swapping01` to
> evaluate the real test time, and I didn't find any more delay with this
> method :).

Yes, I noticed that (measuring just test_runtime01.c, where it's nicely
visible).

> Thanks for the comparison, actually I did some tests for RHEL and got a
> good result.

I'm OK with whole result. I'd be happier if we could avoid TCONF of starvation,
but let's discuss this on that patch.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
  2025-01-03  7:06     ` Petr Vorel
@ 2025-01-03  7:33       ` Petr Vorel
  2025-01-03  7:58         ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-03  7:33 UTC (permalink / raw)
  To: Li Wang, ltp, Martin Doucha, Cyril Hrubis, Avinesh Kumar,
	Jan Stancek

Hi Li, all,

thinking about it twice, I vote for merging the patchset.
It's an improvement to the current state. And we can solve starvation later.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 2/2] starvation: skip test on slow kernel
  2025-01-02 14:31     ` Petr Vorel
@ 2025-01-03  7:53       ` Li Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Li Wang @ 2025-01-03  7:53 UTC (permalink / raw)
  To: Petr Vorel; +Cc: ltp

On Thu, Jan 2, 2025 at 10:31 PM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li,
>
> > > Systems with slow kernel configurations may not meet
> > > the performance requirements necessary for the starvation
> > > test to produce valid results.
>
> > > Skipping the test ensures that it runs only on systems
> > > where its results are meaningful.
> > ...
> > > +   if (tst_has_slow_kconfig())
> > > +           tst_brk(TCONF, "Skip test due to slow kernel
> configuration");
> > > +
>
> > Reviewed-by: Petr Vorel <pvorel@suse.cz>
>
> > I suppose you have strong reason for this (e.g. it's failing on some slow
> > machine). I'm testing it on emulated riscv64 to see whether it's needed
> to be
> > skipped. Because due CONFIG_LATENCYTOP it will always be skipped on
> Tumbleweed.
>
> OK, I was able to reproduce TFAIL on emulated riscv64, which runs ~ 4m
> 30s, with
> timeout increased to 0h 05m 46s:
>
> tst_tmpdir.c:316: TINFO: Using /tmp/LTP_sta3UIB6E as tmpdir (tmpfs
> filesystem)
> tst_test.c:1893: TINFO: LTP version: 20240930
> tst_test.c:1897: TINFO: Tested kernel: 6.12.6-1-default #1 SMP
> PREEMPT_DYNAMIC Thu Dec 19 17:23:25 UTC 2024 (fb072de) riscv64
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> might slow the execution
> tst_test.c:1728: TINFO: Timeout per run is 0h 02m 00s
> starvation.c:98: TINFO: Setting affinity to CPU 0
> starvation.c:52: TINFO: CPU did 120000000 loops in 226566us
> tst_test.c:1736: TINFO: Updating max runtime to 0h 03m 46s
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> might slow the execution
> tst_test.c:1728: TINFO: Timeout per run is 0h 05m 46s
> starvation.c:148: TFAIL: Scheduller starvation reproduced.
>
> But it works on emulated x86_64.
>
> tst_tmpdir.c:316: TINFO: Using /tmp/LTP_staIt0g73 as tmpdir (tmpfs
> filesystem)
> tst_test.c:1893: TINFO: LTP version: 20240930-148-g4539bfbc7
> tst_test.c:1897: TINFO: Tested kernel: 6.13.0-rc4-1.g4a5c6a6-default #1
> SMP PREEMPT_DYNAMIC Sun Dec 22 22:11:35 UTC 2024 (4a5c6a6) x86_64
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> might slow the execution
> tst_test.c:1728: TINFO: Timeout per run is 0h 02m 00s
> starvation.c:98: TINFO: Setting affinity to CPU 0
> starvation.c:52: TINFO: CPU did 120000000 loops in 34189us
> tst_test.c:1736: TINFO: Updating max runtime to 0h 00m 34s
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> might slow the execution
> tst_test.c:1728: TINFO: Timeout per run is 0h 02m 34s
> starvation.c:150: TPASS: Haven't reproduced scheduller starvation.
>
> I wonder if we can quit with TCONF only on certain runtime increase. We
> would
> also need to take -l (number of loops into an account).
>

The simplest way is to extend 'max_runtime' when detecting slow configs
for only starvation.c test.

Just multiply the argument of tst_set_max_runtime(timeout) for starvation.c.

I didn't do that because Philip (our Scheduler expert) told me that not much
sense to run this on debug-kernel.
(I don't know that Tumbleweed enables CONFIG_LATENCYTOP by default)


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs
  2025-01-03  7:33       ` Petr Vorel
@ 2025-01-03  7:58         ` Li Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Li Wang @ 2025-01-03  7:58 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

On Fri, Jan 3, 2025 at 3:33 PM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li, all,
>
> thinking about it twice, I vote for merging the patchset.
> It's an improvement to the current state. And we can solve starvation
> later.
>

Yes, the patch only extends the timeout value and does not impact the real
execution time of LTP.

starvation.c is a dedicated test we can treat separately (if people still
want to run it on slow kernel).

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-03  5:00   ` Li Wang
  2025-01-03  7:06     ` Petr Vorel
@ 2025-01-03 15:48     ` Cyril Hrubis
  2025-01-04  3:38       ` Li Wang
  1 sibling, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-03 15:48 UTC (permalink / raw)
  To: Li Wang; +Cc: Martin Doucha, ltp

Hi!
> Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
> to avoid increasing the actual execution time of the test, but also to give
>  the system more time to wait for the test to complete the final work.
> 
> Original:
>   |  -- timeout -- | -- max_runtime -- |
> 
> Previous:
>   |  -- timeout -- | -------- max_runtime * 4 -------- |
> 
> Now:
>   |  -------- timeout * 4 -------- | -- max_runtime --  |

The problems I see here:

There are tests, I think this covers mostly the IO stress tests, where
the max runtime may be significantly larger than the timeout, so
multiplying only the timeout may not be enough there.

I wanted to eventually move to a shorter default timeout, e.g. 10s once
we have enough max_runtime anotation in the testcases.

So in the end we may eventually need both max_runtime and runtime in the
tst_test structure.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-03 15:48     ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD Cyril Hrubis
@ 2025-01-04  3:38       ` Li Wang
  2025-01-06  9:35         ` Petr Vorel
  2025-01-06 12:10         ` Cyril Hrubis
  0 siblings, 2 replies; 40+ messages in thread
From: Li Wang @ 2025-01-04  3:38 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

On Fri, Jan 3, 2025 at 11:48 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> Hi!
> > Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
> > to avoid increasing the actual execution time of the test, but also to
> give
> >  the system more time to wait for the test to complete the final work.
> >
> > Original:
> >   |  -- timeout -- | -- max_runtime -- |
> >
> > Previous:
> >   |  -- timeout -- | -------- max_runtime * 4 -------- |
> >
> > Now:
> >   |  -------- timeout * 4 -------- | -- max_runtime --  |
>
> The problems I see here:
>
> There are tests, I think this covers mostly the IO stress tests, where
> the max runtime may be significantly larger than the timeout, so
> multiplying only the timeout may not be enough there.
>

I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
adding a new field for those few stress tests.

And with the previous method, the multiple 4 max_runtime for 1800s
is 2hours per test up limit, I can't imagine how long we will get eventually
in the whole test time.

Maybe another way is to create a separate function in a header
like aio_common.h (or in high-level dir) for handling that significantly
larger runtime tests.

BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
tst_set_max_runtime() in somehow.



> I wanted to eventually move to a shorter default timeout, e.g. 10s once
> we have enough max_runtime anotation in the testcases.
>

That's fine, but most tests do not have .max_runtime. I guess what
we should consider now is making them have enough timeout time
on the slow kernel, isn't it?


$ git grep .max_runtime | wc -l
91

$ git grep .max_runtime
cve/cve-2014-0196.c:    .max_runtime = 60,
cve/cve-2015-3290.c:    .max_runtime = 180,
cve/cve-2016-7117.c:    .max_runtime = 60,
cve/cve-2017-2671.c:    .max_runtime = 40,
kernel/controllers/cgroup/cgroup_core03.c:      .max_runtime = 20,
kernel/crypto/af_alg02.c:       .max_runtime = 20,
kernel/crypto/af_alg07.c:       .max_runtime = 150,
kernel/crypto/pcrypt_aead01.c:  .max_runtime = 300,
kernel/fs/fs_fill/fs_fill.c:    .max_runtime = 300,
kernel/fs/fsplough/fsplough.c:  tst_set_max_runtime(bufsize * loop_count /
(8 * 1024 * 1024));
kernel/fs/fsx-linux/fsx-linux.c:        .max_runtime = 1800,
kernel/fs/read_all/read_all.c:          tst_res(TINFO, "Worker timeout set
to 10%% of max_runtime: %dms",
kernel/fs/read_all/read_all.c:  .max_runtime = 100,
kernel/io/ltp-aiodio/aio-stress.c:      .max_runtime = 1800,
kernel/io/ltp-aiodio/aiocp.c:   .max_runtime = 1800,
kernel/io/ltp-aiodio/aiodio_append.c:   .max_runtime = 1800,
kernel/io/ltp-aiodio/aiodio_sparse.c:   .max_runtime = 1800,
kernel/io/ltp-aiodio/dio_append.c:      .max_runtime = 1800,
kernel/io/ltp-aiodio/dio_read.c:        .max_runtime = 1800,
kernel/io/ltp-aiodio/dio_sparse.c:      .max_runtime = 1800,
kernel/io/ltp-aiodio/dio_truncate.c:    .max_runtime = 1800,
kernel/mem/ksm/ksm02.c:         tst_set_max_runtime(32 * (size /
DEFAULT_MEMSIZE));
kernel/mem/ksm/ksm02.c: .max_runtime = 32,
kernel/mem/ksm/ksm04.c:         tst_set_max_runtime(32 * (size /
DEFAULT_MEMSIZE));
kernel/mem/ksm/ksm04.c: .max_runtime = 32,
kernel/mem/mmapstress/mmapstress01.c:   .max_runtime = 12,
kernel/mem/mtest01/mtest01.c:   .max_runtime = 300,
kernel/mem/mtest06/mmap1.c:     .max_runtime = 180,
kernel/mem/mtest06/mmap3.c:     .max_runtime = 60,
kernel/mem/mtest07/mallocstress.c:      .max_runtime = 600,
kernel/mem/oom/oom01.c: .max_runtime = TST_UNLIMITED_RUNTIME,
kernel/mem/oom/oom02.c: .max_runtime = TST_UNLIMITED_RUNTIME,
kernel/mem/oom/oom03.c: .max_runtime = TST_UNLIMITED_RUNTIME,
kernel/mem/oom/oom04.c: .max_runtime = TST_UNLIMITED_RUNTIME,
kernel/mem/oom/oom05.c: .max_runtime = TST_UNLIMITED_RUNTIME,
kernel/mem/swapping/swapping01.c:       .max_runtime = 600,
kernel/mem/thp/thp04.c: .max_runtime = 150,
kernel/mem/tunable/min_free_kbytes.c:   .max_runtime =
TST_UNLIMITED_RUNTIME,
kernel/pty/pty03.c:     .max_runtime = 30,
kernel/pty/pty05.c:     .max_runtime = 150,
kernel/pty/pty06.c:     .max_runtime = 150,
kernel/pty/pty07.c:     .max_runtime = 150,
kernel/sched/cfs-scheduler/cfs_bandwidth01.c:   .max_runtime = 20,
kernel/sched/cfs-scheduler/starvation.c:
 tst_set_max_runtime(timeout);
kernel/security/dirtyc0w_shmem/dirtyc0w_shmem.c:        .max_runtime = 120,
kernel/security/kallsyms/kallsyms.c:    .max_runtime = 60,
kernel/sound/snd_seq01.c:       .max_runtime = 60,
kernel/sound/snd_timer01.c:     .max_runtime = 150,
kernel/syscalls/bind/bind06.c:  .max_runtime = 300,
kernel/syscalls/copy_file_range/copy_file_range01.c:    .max_runtime = 5
kernel/syscalls/fallocate/fallocate06.c:        .max_runtime = 120,
kernel/syscalls/fork/fork13.c:  .max_runtime = 600,
kernel/syscalls/fsync/fsync02.c:        .max_runtime = 300,
kernel/syscalls/gettimeofday/gettimeofday02.c:  .max_runtime = 10,
kernel/syscalls/inotify/inotify06.c:    .max_runtime = 600,
kernel/syscalls/inotify/inotify09.c:    .max_runtime = 150,
kernel/syscalls/inotify/inotify12.c:    .max_runtime = 10,
kernel/syscalls/ioctl/ioctl_sg01.c:     .max_runtime = 3600,
kernel/syscalls/ipc/msgstress/msgstress01.c:    .max_runtime = 180,
kernel/syscalls/ipc/semget/semget05.c:  tst_set_max_runtime(maxsems / 200);
kernel/syscalls/ipc/shmctl/shmctl05.c:  .max_runtime = 10,
kernel/syscalls/keyctl/keyctl02.c:      .max_runtime = 60,
kernel/syscalls/landlock/landlock04.c:  .max_runtime = 360,
kernel/syscalls/madvise/madvise06.c:    .max_runtime = 60,
kernel/syscalls/madvise/madvise11.c:    .max_runtime = 30,
kernel/syscalls/migrate_pages/migrate_pages03.c:        .max_runtime = 300,
kernel/syscalls/move_pages/move_pages12.c:      .max_runtime = 240,
kernel/syscalls/nice/nice05.c:  .max_runtime = 3,
kernel/syscalls/perf_event_open/perf_event_open02.c:    .max_runtime = 72
kernel/syscalls/perf_event_open/perf_event_open03.c:    .max_runtime = 300,
kernel/syscalls/preadv2/preadv203.c:    .max_runtime = 60,
kernel/syscalls/readahead/readahead02.c:         * speed, sometime test
timeout when the default max_runtime is used up.
kernel/syscalls/readahead/readahead02.c:
 tst_set_max_runtime(test.max_runtime + (usec + usec_ra) / 1000000);
kernel/syscalls/readahead/readahead02.c:
 tst_set_max_runtime(1 + testfile_size / (DEFAULT_FILESIZE/32));
kernel/syscalls/readahead/readahead02.c:        .max_runtime = 30,
kernel/syscalls/request_key/request_key03.c:    .max_runtime = 20,
kernel/syscalls/sendfile/sendfile09.c:  .max_runtime = 120,
kernel/syscalls/sendmsg/sendmsg03.c:    .max_runtime = 150,
kernel/syscalls/set_mempolicy/set_mempolicy01.c:
 tst_set_max_runtime(test.max_runtime * (1 << nodes->cnt/16));
kernel/syscalls/set_mempolicy/set_mempolicy01.c:        .max_runtime = 600,
kernel/syscalls/setsockopt/setsockopt06.c:      .max_runtime = 270,
kernel/syscalls/setsockopt/setsockopt07.c:      .max_runtime = 150,
kernel/syscalls/swapoff/swapoff01.c:    .max_runtime = 60,
kernel/syscalls/swapon/swapon01.c:      .max_runtime = 60,
kernel/syscalls/timerfd/timerfd_settime02.c:    .max_runtime = 150,
kernel/syscalls/writev/writev03.c:      .max_runtime = 75,
network/can/cve/can_bcm01.c:    .max_runtime = 30,
network/netstress/netstress.c:  .max_runtime = 300,
network/nfs/nfs_stress/nfs05_make_tree.c:       .max_runtime = 300,
network/packet/fanout01.c:      .max_runtime = 180,
network/sockets/vsock01.c:      .max_runtime = 60,


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-04  3:38       ` Li Wang
@ 2025-01-06  9:35         ` Petr Vorel
  2025-01-06 12:10         ` Cyril Hrubis
  1 sibling, 0 replies; 40+ messages in thread
From: Petr Vorel @ 2025-01-06  9:35 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

Hi all,

[ Cc Andrea due aiodio ]

> On Fri, Jan 3, 2025 at 11:48 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> > Hi!
> > > Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
> > > to avoid increasing the actual execution time of the test, but also to
> > give
> > >  the system more time to wait for the test to complete the final work.

> > > Original:
> > >   |  -- timeout -- | -- max_runtime -- |

> > > Previous:
> > >   |  -- timeout -- | -------- max_runtime * 4 -------- |

> > > Now:
> > >   |  -------- timeout * 4 -------- | -- max_runtime --  |

> > The problems I see here:

> > There are tests, I think this covers mostly the IO stress tests, where
> > the max runtime may be significantly larger than the timeout, so
> > multiplying only the timeout may not be enough there.


> I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
> only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
> adding a new field for those few stress tests.

> And with the previous method, the multiple 4 max_runtime for 1800s
> is 2hours per test up limit, I can't imagine how long we will get eventually
> in the whole test time.

Agree. This solution is quite simple. I would also hesitate to add new field for
few tests.

It's questionable whether .max_runtime = 1800 is reasonable for aiodio tests at
all (see below). I'll try to have look and send a patch to shorten it.  For some
tests it'd be nice to have some calculation instead of blindly expecting they
run 30 min (because we have many runs of these tests and vast majority runs
slowly).

OTOH it's questionable how much time we should spent with these tests,
see Jan Kara [1]:

	So the performance of this test is irrelevant because combining buffered
	reads with direct IO writes was always in "better don't do it" territory.
	Definitely not if you care about performance.

NOTE: Jan's comment is only for aiodio_sparse.c and dio_sparse.c, but I suppose
this is related to all LTP aiodio tests (at least dio_append.c does also
buffered reads with O_DIRECT, dio_read.c combines buffered writes with O_DIRECT).

[1] https://lore.kernel.org/ltp/20240722090012.mlvkaenuxar2x3vr@quack3/

> Maybe another way is to create a separate function in a header
> like aio_common.h (or in high-level dir) for handling that significantly
> larger runtime tests.

> BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
> tst_set_max_runtime() in somehow.



> > I wanted to eventually move to a shorter default timeout, e.g. 10s once
> > we have enough max_runtime anotation in the testcases.


> That's fine, but most tests do not have .max_runtime. I guess what
> we should consider now is making them have enough timeout time
> on the slow kernel, isn't it?


> $ git grep .max_runtime | wc -l
> 91

> $ git grep .max_runtime
> cve/cve-2014-0196.c:    .max_runtime = 60,
> cve/cve-2015-3290.c:    .max_runtime = 180,
> cve/cve-2016-7117.c:    .max_runtime = 60,
> cve/cve-2017-2671.c:    .max_runtime = 40,
> kernel/controllers/cgroup/cgroup_core03.c:      .max_runtime = 20,
> kernel/crypto/af_alg02.c:       .max_runtime = 20,
> kernel/crypto/af_alg07.c:       .max_runtime = 150,
> kernel/crypto/pcrypt_aead01.c:  .max_runtime = 300,
> kernel/fs/fs_fill/fs_fill.c:    .max_runtime = 300,
> kernel/fs/fsplough/fsplough.c:  tst_set_max_runtime(bufsize * loop_count /
> (8 * 1024 * 1024));
> kernel/fs/fsx-linux/fsx-linux.c:        .max_runtime = 1800,

Just looking into some aiodio tests.

We have 22 variants of these tests (none without params). The longest is fsx22
(-N 100000) due much higher -N than other tests and the default (1000).
The highest runtime I see on this test is the current mainline (6.12): 1m 31s.

> kernel/fs/read_all/read_all.c:          tst_res(TINFO, "Worker timeout set
> to 10%% of max_runtime: %dms",
> kernel/fs/read_all/read_all.c:  .max_runtime = 100,
> kernel/io/ltp-aiodio/aio-stress.c:      .max_runtime = 1800,
> kernel/io/ltp-aiodio/aiocp.c:   .max_runtime = 1800,
> kernel/io/ltp-aiodio/aiodio_append.c:   .max_runtime = 1800,

For example aiodio_append runs on 1 sec on Tumbleweed.

> kernel/io/ltp-aiodio/aiodio_sparse.c:   .max_runtime = 1800,

This is a valid one, it runs without parameters 9m 39s (ADSP000).

But we have several parameters (it runs 38x with a different parameters)
Most of them run below 7s.

These are running longer, obviously combination of high -w and -n:
ADSP037 44s (-w 18192k -n 512)
ADSP038 1m 44s (-w 18192k -n 1000)

We simplified our life with blindly saying everything runs 30 min,
but vast majority don't.

Also this is a long run on the current mainline kernel (slow down in 6.6 [1]).
It runs on older kernels ~1 or 2 min.

> kernel/io/ltp-aiodio/dio_append.c:      .max_runtime = 1800,

This runs ~ 32 sec on Tumbleweed.

> kernel/io/ltp-aiodio/dio_read.c:        .max_runtime = 1800,
> kernel/io/ltp-aiodio/dio_sparse.c:      .max_runtime = 1800,

The only long running is ADSP039 (dio_sparse without params): 9m 37s 

> kernel/io/ltp-aiodio/dio_truncate.c:    .max_runtime = 1800,
> kernel/mem/ksm/ksm02.c:         tst_set_max_runtime(32 * (size /
> DEFAULT_MEMSIZE));
> kernel/mem/ksm/ksm02.c: .max_runtime = 32,
> kernel/mem/ksm/ksm04.c:         tst_set_max_runtime(32 * (size /
> DEFAULT_MEMSIZE));
> kernel/mem/ksm/ksm04.c: .max_runtime = 32,
> kernel/mem/mmapstress/mmapstress01.c:   .max_runtime = 12,
> kernel/mem/mtest01/mtest01.c:   .max_runtime = 300,
> kernel/mem/mtest06/mmap1.c:     .max_runtime = 180,
> kernel/mem/mtest06/mmap3.c:     .max_runtime = 60,
> kernel/mem/mtest07/mallocstress.c:      .max_runtime = 600,
> kernel/mem/oom/oom01.c: .max_runtime = TST_UNLIMITED_RUNTIME,
> kernel/mem/oom/oom02.c: .max_runtime = TST_UNLIMITED_RUNTIME,
> kernel/mem/oom/oom03.c: .max_runtime = TST_UNLIMITED_RUNTIME,
> kernel/mem/oom/oom04.c: .max_runtime = TST_UNLIMITED_RUNTIME,
> kernel/mem/oom/oom05.c: .max_runtime = TST_UNLIMITED_RUNTIME,
> kernel/mem/swapping/swapping01.c:       .max_runtime = 600,
> kernel/mem/thp/thp04.c: .max_runtime = 150,
> kernel/mem/tunable/min_free_kbytes.c:   .max_runtime =
> TST_UNLIMITED_RUNTIME,
> kernel/pty/pty03.c:     .max_runtime = 30,
> kernel/pty/pty05.c:     .max_runtime = 150,
> kernel/pty/pty06.c:     .max_runtime = 150,
> kernel/pty/pty07.c:     .max_runtime = 150,
> kernel/sched/cfs-scheduler/cfs_bandwidth01.c:   .max_runtime = 20,
> kernel/sched/cfs-scheduler/starvation.c:
>  tst_set_max_runtime(timeout);
> kernel/security/dirtyc0w_shmem/dirtyc0w_shmem.c:        .max_runtime = 120,
> kernel/security/kallsyms/kallsyms.c:    .max_runtime = 60,
> kernel/sound/snd_seq01.c:       .max_runtime = 60,
> kernel/sound/snd_timer01.c:     .max_runtime = 150,
> kernel/syscalls/bind/bind06.c:  .max_runtime = 300,
> kernel/syscalls/copy_file_range/copy_file_range01.c:    .max_runtime = 5
> kernel/syscalls/fallocate/fallocate06.c:        .max_runtime = 120,
> kernel/syscalls/fork/fork13.c:  .max_runtime = 600,
> kernel/syscalls/fsync/fsync02.c:        .max_runtime = 300,
> kernel/syscalls/gettimeofday/gettimeofday02.c:  .max_runtime = 10,
> kernel/syscalls/inotify/inotify06.c:    .max_runtime = 600,
> kernel/syscalls/inotify/inotify09.c:    .max_runtime = 150,
> kernel/syscalls/inotify/inotify12.c:    .max_runtime = 10,
> kernel/syscalls/ioctl/ioctl_sg01.c:     .max_runtime = 3600,

FYI Martin set 1 hour due long setup on systems with big RAM.
Real runtime is 1 sec on 8GB RAM.

Kind regards,
Petr

> kernel/syscalls/ipc/msgstress/msgstress01.c:    .max_runtime = 180,
> kernel/syscalls/ipc/semget/semget05.c:  tst_set_max_runtime(maxsems / 200);
> kernel/syscalls/ipc/shmctl/shmctl05.c:  .max_runtime = 10,
> kernel/syscalls/keyctl/keyctl02.c:      .max_runtime = 60,
> kernel/syscalls/landlock/landlock04.c:  .max_runtime = 360,
> kernel/syscalls/madvise/madvise06.c:    .max_runtime = 60,
> kernel/syscalls/madvise/madvise11.c:    .max_runtime = 30,
> kernel/syscalls/migrate_pages/migrate_pages03.c:        .max_runtime = 300,
> kernel/syscalls/move_pages/move_pages12.c:      .max_runtime = 240,
> kernel/syscalls/nice/nice05.c:  .max_runtime = 3,
> kernel/syscalls/perf_event_open/perf_event_open02.c:    .max_runtime = 72
> kernel/syscalls/perf_event_open/perf_event_open03.c:    .max_runtime = 300,
> kernel/syscalls/preadv2/preadv203.c:    .max_runtime = 60,
> kernel/syscalls/readahead/readahead02.c:         * speed, sometime test
> timeout when the default max_runtime is used up.
> kernel/syscalls/readahead/readahead02.c:
>  tst_set_max_runtime(test.max_runtime + (usec + usec_ra) / 1000000);
> kernel/syscalls/readahead/readahead02.c:
>  tst_set_max_runtime(1 + testfile_size / (DEFAULT_FILESIZE/32));
> kernel/syscalls/readahead/readahead02.c:        .max_runtime = 30,
> kernel/syscalls/request_key/request_key03.c:    .max_runtime = 20,
> kernel/syscalls/sendfile/sendfile09.c:  .max_runtime = 120,
> kernel/syscalls/sendmsg/sendmsg03.c:    .max_runtime = 150,
> kernel/syscalls/set_mempolicy/set_mempolicy01.c:
>  tst_set_max_runtime(test.max_runtime * (1 << nodes->cnt/16));
> kernel/syscalls/set_mempolicy/set_mempolicy01.c:        .max_runtime = 600,
> kernel/syscalls/setsockopt/setsockopt06.c:      .max_runtime = 270,
> kernel/syscalls/setsockopt/setsockopt07.c:      .max_runtime = 150,
> kernel/syscalls/swapoff/swapoff01.c:    .max_runtime = 60,
> kernel/syscalls/swapon/swapon01.c:      .max_runtime = 60,
> kernel/syscalls/timerfd/timerfd_settime02.c:    .max_runtime = 150,
> kernel/syscalls/writev/writev03.c:      .max_runtime = 75,
> network/can/cve/can_bcm01.c:    .max_runtime = 30,
> network/netstress/netstress.c:  .max_runtime = 300,
> network/nfs/nfs_stress/nfs05_make_tree.c:       .max_runtime = 300,
> network/packet/fanout01.c:      .max_runtime = 180,
> network/sockets/vsock01.c:      .max_runtime = 60,

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-04  3:38       ` Li Wang
  2025-01-06  9:35         ` Petr Vorel
@ 2025-01-06 12:10         ` Cyril Hrubis
  2025-01-06 12:52           ` Petr Vorel
  2025-01-06 16:03           ` Martin Doucha
  1 sibling, 2 replies; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-06 12:10 UTC (permalink / raw)
  To: Li Wang; +Cc: Martin Doucha, ltp

Hi!
> I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
> only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
> adding a new field for those few stress tests.
> 
> And with the previous method, the multiple 4 max_runtime for 1800s
> is 2hours per test up limit, I can't imagine how long we will get eventually
> in the whole test time.
> 
> Maybe another way is to create a separate function in a header
> like aio_common.h (or in high-level dir) for handling that significantly
> larger runtime tests.
> 
> BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
> tst_set_max_runtime() in somehow.

I stil think that misusing max_runtime, which is supposed to be upper
bound for the actual test runtime was a mistake.

Maybe we should have called the max_runtime a timeout and add runtime
for tests that needs it. That way we would have timeout compromising of
two parts, one would be the 30s that is used for all tests and second
part from the tst_test structure. And then the sum of these two would be
multiplied by the timeout multipliers. Then we would have a runtime,
which would be used only by tests that call tst_remaining_runtime().

The overall test timeout would be then:

(default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL

What do you think?

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 12:10         ` Cyril Hrubis
@ 2025-01-06 12:52           ` Petr Vorel
  2025-01-06 13:39             ` Cyril Hrubis
  2025-01-06 16:03           ` Martin Doucha
  1 sibling, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-06 12:52 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

> Hi!
> > I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
> > only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
> > adding a new field for those few stress tests.

> > And with the previous method, the multiple 4 max_runtime for 1800s
> > is 2hours per test up limit, I can't imagine how long we will get eventually
> > in the whole test time.

> > Maybe another way is to create a separate function in a header
> > like aio_common.h (or in high-level dir) for handling that significantly
> > larger runtime tests.

> > BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
> > tst_set_max_runtime() in somehow.

> I stil think that misusing max_runtime, which is supposed to be upper
> bound for the actual test runtime was a mistake.

Do you want to separate timeout for setup() and for actual test run?
Which one would be prolonged in case of "slow" kernels? (e.g. this patch).

Or you want to fix -iN?

Also for docs purposes it might be useful to list long running tests.
Fortunately there are just few tests which calls tst_set_max_runtime()
for dynamically set timeout.

> Maybe we should have called the max_runtime a timeout and add runtime
> for tests that needs it. That way we would have timeout compromising of
> two parts, one would be the 30s that is used for all tests and second
> part from the tst_test structure. And then the sum of these two would be
> multiplied by the timeout multipliers. Then we would have a runtime,
> which would be used only by tests that call tst_remaining_runtime().

> The overall test timeout would be then:

> (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL

> What do you think?

Timeout is for setup function, right? e.g. for ioctl_sg01? If yes, timeout is
too generic, IMHO many people will think that it's a general test timeout.
I would think about general name.

The above formula should be written in the docs in the separate description and
this section should be linked in TIMEOUT_MUL and RUNTIME_MUL description [1].

Maybe part of library README.md [2], which itself should be also moved to sphinx
docs.

Kind regards,
Petr

[1] https://linux-test-project.readthedocs.io/en/latest/users/setup_tests.html#library-environment-variables
[2] https://github.com/linux-test-project/ltp/blob/master/lib/README.md

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 12:52           ` Petr Vorel
@ 2025-01-06 13:39             ` Cyril Hrubis
  2025-01-06 15:36               ` Petr Vorel
  0 siblings, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-06 13:39 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

Hi!
> > > I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
> > > only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
> > > adding a new field for those few stress tests.
> 
> > > And with the previous method, the multiple 4 max_runtime for 1800s
> > > is 2hours per test up limit, I can't imagine how long we will get eventually
> > > in the whole test time.
> 
> > > Maybe another way is to create a separate function in a header
> > > like aio_common.h (or in high-level dir) for handling that significantly
> > > larger runtime tests.
> 
> > > BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
> > > tst_set_max_runtime() in somehow.
> 
> > I stil think that misusing max_runtime, which is supposed to be upper
> > bound for the actual test runtime was a mistake.
> 
> Do you want to separate timeout for setup() and for actual test run?
> Which one would be prolonged in case of "slow" kernels? (e.g. this patch).

Currently the test timeout applies both for the setup, single test
iteration and cleanup. So the short answer is both.

> Or you want to fix -iN?

After each test iteration the child signals the parent that the
iteration has been finished and that restarts the timeout timer.

The test overall timeout is computed in set_timeout() and currently is
computed as:

	results->timeout = tst_multiply_timeout(timeout) + results->max_runtime;


And the tst_multiply_tiemout() would then be changed into:

	return (DEFAULT_TIMEOUT + tst_test->timeout) * timeout_mul;

> Also for docs purposes it might be useful to list long running tests.
> Fortunately there are just few tests which calls tst_set_max_runtime()
> for dynamically set timeout.

That would be actually easier, because we could do:

- add special value TST_RUNTIME_TIMEOUT
- allow test to change timeout only if TST_RUNTIME_TIMEOUT was set in
  tst_test

Then all long running test would have either tst_test->timeout or
tst_test->runtime set.

Technically we would need two special timeout values
TST_UNLIMITED_TIMEOUT and TST_RUNTIME_TIMEOUT.

> > Maybe we should have called the max_runtime a timeout and add runtime
> > for tests that needs it. That way we would have timeout compromising of
> > two parts, one would be the 30s that is used for all tests and second
> > part from the tst_test structure. And then the sum of these two would be
> > multiplied by the timeout multipliers. Then we would have a runtime,
> > which would be used only by tests that call tst_remaining_runtime().
> 
> > The overall test timeout would be then:
> 
> > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL
> 
> > What do you think?
> 
> Timeout is for setup function, right? e.g. for ioctl_sg01? If yes, timeout is
> too generic, IMHO many people will think that it's a general test timeout.
> I would think about general name.

Not only, the default 30s timeout is for the whole testrun for tests
that are quick enough. We have a lot of tests that run just for less
than 1s even on small embedded boards.

> The above formula should be written in the docs in the separate description and
> this section should be linked in TIMEOUT_MUL and RUNTIME_MUL description [1].

Yes.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 13:39             ` Cyril Hrubis
@ 2025-01-06 15:36               ` Petr Vorel
  2025-01-06 16:19                 ` Cyril Hrubis
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-06 15:36 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

> Hi!
> > > > I did a quick grep that some ltp-aiodio tests set it to 1800 sec, which
> > > > only 8/91 occupation in the LTP testcases/, I'm not sure if it's worth
> > > > adding a new field for those few stress tests.

> > > > And with the previous method, the multiple 4 max_runtime for 1800s
> > > > is 2hours per test up limit, I can't imagine how long we will get eventually
> > > > in the whole test time.

> > > > Maybe another way is to create a separate function in a header
> > > > like aio_common.h (or in high-level dir) for handling that significantly
> > > > larger runtime tests.

> > > > BTW, we have TST_UNLIMITED_RUNTIME choice or, invoke
> > > > tst_set_max_runtime() in somehow.

> > > I stil think that misusing max_runtime, which is supposed to be upper
> > > bound for the actual test runtime was a mistake.

> > Do you want to separate timeout for setup() and for actual test run?
> > Which one would be prolonged in case of "slow" kernels? (e.g. this patch).

> Currently the test timeout applies both for the setup, single test
> iteration and cleanup. So the short answer is both.

> > Or you want to fix -iN?

> After each test iteration the child signals the parent that the
> iteration has been finished and that restarts the timeout timer.

OK, -iN is already solved. Obviously, otherwise testing -i2000 would not work.

> The test overall timeout is computed in set_timeout() and currently is
> computed as:

> 	results->timeout = tst_multiply_timeout(timeout) + results->max_runtime;


> And the tst_multiply_tiemout() would then be changed into:

> 	return (DEFAULT_TIMEOUT + tst_test->timeout) * timeout_mul;

> > Also for docs purposes it might be useful to list long running tests.
> > Fortunately there are just few tests which calls tst_set_max_runtime()
> > for dynamically set timeout.

> That would be actually easier, because we could do:

> - add special value TST_RUNTIME_TIMEOUT
> - allow test to change timeout only if TST_RUNTIME_TIMEOUT was set in
>   tst_test
And if value does not get changed it's the default value. Also name is a bit
confusing (you suggest to have members "timeout", "runtime"), which suggest the
special value is related to both.

OK, flag which would allow us to see that time will be changed.

> Then all long running test would have either tst_test->timeout or
> tst_test->runtime set.

> Technically we would need two special timeout values
> TST_UNLIMITED_TIMEOUT and TST_RUNTIME_TIMEOUT.

OK, TST_UNLIMITED_TIMEOUT is equivalent of:

#define TST_UNLIMITED_RUNTIME (-1)

Maybe have just single definition TST_UNLIMITED, which could be used in both
tst_test->timeout and tst_test->runtime? But that's just an implementation
detail.

> > > Maybe we should have called the max_runtime a timeout and add runtime
> > > for tests that needs it. That way we would have timeout compromising of
> > > two parts, one would be the 30s that is used for all tests and second
> > > part from the tst_test structure. And then the sum of these two would be
> > > multiplied by the timeout multipliers. Then we would have a runtime,
> > > which would be used only by tests that call tst_remaining_runtime().

OK, the point of whole change is to separate some general timeout (30 sec) from
runtime (used for tst_remaining_runtime()), right?

> > > The overall test timeout would be then:

> > > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL

(default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL is meant for setup or
cleanup and library overhead, tst_test->runtime * RUNTIME_MUL for running test
function?

> > > What do you think?

> > Timeout is for setup function, right? e.g. for ioctl_sg01? If yes, timeout is
> > too generic, IMHO many people will think that it's a general test timeout.
> > I would think about general name.

> Not only, the default 30s timeout is for the whole testrun for tests
> that are quick enough. We have a lot of tests that run just for less
> than 1s even on small embedded boards.

And yes, with properly set data, 30s could be carefully lowered down a bit.

Kind regards,
Petr

> > The above formula should be written in the docs in the separate description and
> > this section should be linked in TIMEOUT_MUL and RUNTIME_MUL description [1].

> Yes.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 12:10         ` Cyril Hrubis
  2025-01-06 12:52           ` Petr Vorel
@ 2025-01-06 16:03           ` Martin Doucha
  2025-01-06 16:21             ` Cyril Hrubis
  1 sibling, 1 reply; 40+ messages in thread
From: Martin Doucha @ 2025-01-06 16:03 UTC (permalink / raw)
  To: Cyril Hrubis, Li Wang; +Cc: ltp

On 06. 01. 25 13:10, Cyril Hrubis wrote:
> I stil think that misusing max_runtime, which is supposed to be upper
> bound for the actual test runtime was a mistake.
> 
> Maybe we should have called the max_runtime a timeout and add runtime
> for tests that needs it. That way we would have timeout compromising of
> two parts, one would be the 30s that is used for all tests and second
> part from the tst_test structure. And then the sum of these two would be
> multiplied by the timeout multipliers. Then we would have a runtime,
> which would be used only by tests that call tst_remaining_runtime().
> 
> The overall test timeout would be then:
> 
> (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL
> 
> What do you think?

Hi,
sorry but I still don't follow the logic in the math above. I agree that 
"runtime" should control test iteration and "timeout" should be a hard 
limit for test execution. But then it doesn't make sense to add these 
two numbers and RUNTIME_MUL would be pointless. Instead, the total 
timeout (for single testcase/filesystem test) should be calculated like 
this:

default_30s_timeout * TIMEOUT_MUL + MAX(MAX(1, tst_test->timeout) * 
TIMEOUT_MUL, tst_test->runtime)

If you want to force a different runtime value, it should be done 
through the -I command line parameter. We could also replace the 
"duration" logic in testrun() with tst_remaining_runtime() which will 
allow looping tests for fixed amount of time by default just by setting 
the tst_test->runtime attribute, without any loop code inside the test 
function itself.

-- 
Martin Doucha   mdoucha@suse.cz
SW Quality Engineer
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 15:36               ` Petr Vorel
@ 2025-01-06 16:19                 ` Cyril Hrubis
  2025-01-07  5:37                   ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-06 16:19 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

Hi!
> And if value does not get changed it's the default value. Also name is a bit
> confusing (you suggest to have members "timeout", "runtime"), which suggest the
> special value is related to both.
> 
> OK, flag which would allow us to see that time will be changed.
> 
> > Then all long running test would have either tst_test->timeout or
> > tst_test->runtime set.
> 
> > Technically we would need two special timeout values
> > TST_UNLIMITED_TIMEOUT and TST_RUNTIME_TIMEOUT.
> 
> OK, TST_UNLIMITED_TIMEOUT is equivalent of:
> 
> #define TST_UNLIMITED_RUNTIME (-1)
> 
> Maybe have just single definition TST_UNLIMITED, which could be used in both
> tst_test->timeout and tst_test->runtime? But that's just an implementation
> detail.

The UNLIMITED_RUNTIME would not be needed anymore. Because runtime will
mean _only_ for how long will a few tests spend in the main loop.
Infinite loop does not make any sense. The tst_runtime will be basically
renamed to timeout and TST_UNLIMITED_RUNTIME becomes
TST_UNLIMITED_TIMEOUT.

> > > > Maybe we should have called the max_runtime a timeout and add runtime
> > > > for tests that needs it. That way we would have timeout compromising of
> > > > two parts, one would be the 30s that is used for all tests and second
> > > > part from the tst_test structure. And then the sum of these two would be
> > > > multiplied by the timeout multipliers. Then we would have a runtime,
> > > > which would be used only by tests that call tst_remaining_runtime().
> 
> OK, the point of whole change is to separate some general timeout (30 sec) from
> runtime (used for tst_remaining_runtime()), right?

The point is to separate timeout, which is a guess on upper bound on the
time the test will spend executing from a runtime which is the exact
time a few tests will spend looping in the test function.

> > > > The overall test timeout would be then:
> 
> > > > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL
> 
> (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL is meant for setup or
> cleanup and library overhead, tst_test->runtime * RUNTIME_MUL for running test
> function?

No, it's the other way around. The timeout is a guessed upper bound for
a test execution time. It's for everything the test does and in most of
the cases only the default timeout (which is kind of safety measure) is
sufficient. Then there are tests that do some work that is not time
bound, e.g. I/O. In that case we set the timeout in the tst_test
structure to some value we measured in practice and that plus the
default timeout will become the overall test timeout.

Then we have a few testcases that do a loop in the test function that
takes exact time. In that case we know that we spend exactly runtime in
the test function, but there is a setup and cleanup as well. So we add
the default timeout to the runtime to get the overall timeout. There may
be also a case where the test setup for such test takes some time, in
that case we would set both the timeout and runtime in the tst_test
structure. The timeout would be upper bound for the test setup and
runtime would be exactly for how long will the test function loop.

Mainly this would make sure that if timeout part of the overall test
time limit gets multiplied, because we are running on a slow system, the
runtime will not. Because we could control the timeout and runtime
separately.

> > Not only, the default 30s timeout is for the whole testrun for tests
> > that are quick enough. We have a lot of tests that run just for less
> > than 1s even on small embedded boards.
> 
> And yes, with properly set data, 30s could be carefully lowered down a bit.

That was my long term plan.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 16:03           ` Martin Doucha
@ 2025-01-06 16:21             ` Cyril Hrubis
  2025-01-06 16:49               ` Martin Doucha
  0 siblings, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-06 16:21 UTC (permalink / raw)
  To: Martin Doucha; +Cc: ltp

Hi!
> If you want to force a different runtime value, it should be done 
> through the -I command line parameter. We could also replace the 
> "duration" logic in testrun() with tst_remaining_runtime() which will 
> allow looping tests for fixed amount of time by default just by setting 
> the tst_test->runtime attribute, without any loop code inside the test 
> function itself.

The problem we have is that we currently use max_runtime for two
different purposes. See my other emails where I explain how would I like
to decouple these two different things that are currently bundled
together in max_runtime.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 16:21             ` Cyril Hrubis
@ 2025-01-06 16:49               ` Martin Doucha
  0 siblings, 0 replies; 40+ messages in thread
From: Martin Doucha @ 2025-01-06 16:49 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: ltp

On 06. 01. 25 17:21, Cyril Hrubis wrote:
> The problem we have is that we currently use max_runtime for two
> different purposes. See my other emails where I explain how would I like
> to decouple these two different things that are currently bundled
> together in max_runtime.

I got that. I'm just pointing out that your current proposal still keeps 
them somewhat coupled. The only relationship between runtime and timeout 
should be that tst_test->runtime will automatically set a lower bound 
for tst_test->timeout * TIMEOUT_MUL. Nothing more.

-- 
Martin Doucha   mdoucha@suse.cz
SW Quality Engineer
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-06 16:19                 ` Cyril Hrubis
@ 2025-01-07  5:37                   ` Li Wang
  2025-01-07  6:28                     ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Li Wang @ 2025-01-07  5:37 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

On Tue, Jan 7, 2025 at 12:29 AM Cyril Hrubis <chrubis@suse.cz> wrote:

> Hi!
> > And if value does not get changed it's the default value. Also name is a
> bit
> > confusing (you suggest to have members "timeout", "runtime"), which
> suggest the
> > special value is related to both.
> >
> > OK, flag which would allow us to see that time will be changed.
> >
> > > Then all long running test would have either tst_test->timeout or
> > > tst_test->runtime set.
> >
> > > Technically we would need two special timeout values
> > > TST_UNLIMITED_TIMEOUT and TST_RUNTIME_TIMEOUT.
> >
> > OK, TST_UNLIMITED_TIMEOUT is equivalent of:
> >
> > #define TST_UNLIMITED_RUNTIME (-1)
> >
> > Maybe have just single definition TST_UNLIMITED, which could be used in
> both
> > tst_test->timeout and tst_test->runtime? But that's just an
> implementation
> > detail.
>
> The UNLIMITED_RUNTIME would not be needed anymore. Because runtime will
> mean _only_ for how long will a few tests spend in the main loop.
> Infinite loop does not make any sense. The tst_runtime will be basically
> renamed to timeout and TST_UNLIMITED_RUNTIME becomes
> TST_UNLIMITED_TIMEOUT.
>
> > > > > Maybe we should have called the max_runtime a timeout and add
> runtime
> > > > > for tests that needs it. That way we would have timeout
> compromising of
> > > > > two parts, one would be the 30s that is used for all tests and
> second
> > > > > part from the tst_test structure. And then the sum of these two
> would be
> > > > > multiplied by the timeout multipliers. Then we would have a
> runtime,
> > > > > which would be used only by tests that call
> tst_remaining_runtime().
> >
> > OK, the point of whole change is to separate some general timeout (30
> sec) from
> > runtime (used for tst_remaining_runtime()), right?
>
> The point is to separate timeout, which is a guess on upper bound on the
> time the test will spend executing from a runtime which is the exact
> time a few tests will spend looping in the test function.
>
> > > > > The overall test timeout would be then:
> >
> > > > > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL +
> tst_test->runtime * RUNTIME_MUL
> >
> > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL is meant for
> setup or
> > cleanup and library overhead, tst_test->runtime * RUNTIME_MUL for
> running test
> > function?
>
> No, it's the other way around. The timeout is a guessed upper bound for
> a test execution time. It's for everything the test does and in most of
> the cases only the default timeout (which is kind of safety measure) is
> sufficient. Then there are tests that do some work that is not time
> bound, e.g. I/O. In that case we set the timeout in the tst_test
> structure to some value we measured in practice and that plus the
> default timeout will become the overall test timeout.
>
> Then we have a few testcases that do a loop in the test function that
> takes exact time. In that case we know that we spend exactly runtime in
> the test function, but there is a setup and cleanup as well. So we add
> the default timeout to the runtime to get the overall timeout. There may
> be also a case where the test setup for such test takes some time, in
> that case we would set both the timeout and runtime in the tst_test
> structure. The timeout would be upper bound for the test setup and
> runtime would be exactly for how long will the test function loop.
>

Fair enough, I agree with that.

Thus we will have tst_test->timeout used for setup time control, and
tst_test->runtime means the exact executable time of the main function,
tst_remaining_runtime() only rely on tst_test->runtime to count.

As you pointed out above:
 (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL +
tst_test->runtime * RUNTIME_MUL

But, questions come back to the item, which part should be extended
when detecting slow kernel configs? the TIMEOUT_MUL?
If so it looks only renaming something based on my patch no essential
changes.



>
> Mainly this would make sure that if timeout part of the overall test
> time limit gets multiplied, because we are running on a slow system, the
> runtime will not. Because we could control the timeout and runtime
> separately.
>
> > > Not only, the default 30s timeout is for the whole testrun for tests
> > > that are quick enough. We have a lot of tests that run just for less
> > > than 1s even on small embedded boards.
> >
> > And yes, with properly set data, 30s could be carefully lowered down a
> bit.
>
> That was my long term plan.
>
> --
> Cyril Hrubis
> chrubis@suse.cz
>
>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-07  5:37                   ` Li Wang
@ 2025-01-07  6:28                     ` Li Wang
  2025-01-07 12:42                       ` Cyril Hrubis
  0 siblings, 1 reply; 40+ messages in thread
From: Li Wang @ 2025-01-07  6:28 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

On Tue, Jan 7, 2025 at 1:37 PM Li Wang <liwang@redhat.com> wrote:

>
>
> On Tue, Jan 7, 2025 at 12:29 AM Cyril Hrubis <chrubis@suse.cz> wrote:
>
>> Hi!
>> > And if value does not get changed it's the default value. Also name is
>> a bit
>> > confusing (you suggest to have members "timeout", "runtime"), which
>> suggest the
>> > special value is related to both.
>> >
>> > OK, flag which would allow us to see that time will be changed.
>> >
>> > > Then all long running test would have either tst_test->timeout or
>> > > tst_test->runtime set.
>> >
>> > > Technically we would need two special timeout values
>> > > TST_UNLIMITED_TIMEOUT and TST_RUNTIME_TIMEOUT.
>> >
>> > OK, TST_UNLIMITED_TIMEOUT is equivalent of:
>> >
>> > #define TST_UNLIMITED_RUNTIME (-1)
>> >
>> > Maybe have just single definition TST_UNLIMITED, which could be used in
>> both
>> > tst_test->timeout and tst_test->runtime? But that's just an
>> implementation
>> > detail.
>>
>> The UNLIMITED_RUNTIME would not be needed anymore. Because runtime will
>> mean _only_ for how long will a few tests spend in the main loop.
>> Infinite loop does not make any sense. The tst_runtime will be basically
>> renamed to timeout and TST_UNLIMITED_RUNTIME becomes
>> TST_UNLIMITED_TIMEOUT.
>>
>> > > > > Maybe we should have called the max_runtime a timeout and add
>> runtime
>> > > > > for tests that needs it. That way we would have timeout
>> compromising of
>> > > > > two parts, one would be the 30s that is used for all tests and
>> second
>> > > > > part from the tst_test structure. And then the sum of these two
>> would be
>> > > > > multiplied by the timeout multipliers. Then we would have a
>> runtime,
>> > > > > which would be used only by tests that call
>> tst_remaining_runtime().
>> >
>> > OK, the point of whole change is to separate some general timeout (30
>> sec) from
>> > runtime (used for tst_remaining_runtime()), right?
>>
>> The point is to separate timeout, which is a guess on upper bound on the
>> time the test will spend executing from a runtime which is the exact
>> time a few tests will spend looping in the test function.
>>
>> > > > > The overall test timeout would be then:
>> >
>> > > > > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL +
>> tst_test->runtime * RUNTIME_MUL
>> >
>> > (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL is meant for
>> setup or
>> > cleanup and library overhead, tst_test->runtime * RUNTIME_MUL for
>> running test
>> > function?
>>
>> No, it's the other way around. The timeout is a guessed upper bound for
>> a test execution time. It's for everything the test does and in most of
>> the cases only the default timeout (which is kind of safety measure) is
>> sufficient. Then there are tests that do some work that is not time
>> bound, e.g. I/O. In that case we set the timeout in the tst_test
>> structure to some value we measured in practice and that plus the
>> default timeout will become the overall test timeout.
>>
>> Then we have a few testcases that do a loop in the test function that
>> takes exact time. In that case we know that we spend exactly runtime in
>> the test function, but there is a setup and cleanup as well. So we add
>> the default timeout to the runtime to get the overall timeout. There may
>> be also a case where the test setup for such test takes some time, in
>> that case we would set both the timeout and runtime in the tst_test
>> structure. The timeout would be upper bound for the test setup and
>> runtime would be exactly for how long will the test function loop.
>>
>
> Fair enough, I agree with that.
>
> Thus we will have tst_test->timeout used for setup time control, and
> tst_test->runtime means the exact executable time of the main function,
> tst_remaining_runtime() only rely on tst_test->runtime to count.
>
> As you pointed out above:
>  (default_30s_timeout + tst_test->timeout) * TIMEOUT_MUL +
> tst_test->runtime * RUNTIME_MUL
>
> But, questions come back to the item, which part should be extended
> when detecting slow kernel configs? the TIMEOUT_MUL?
> If so it looks only renaming something based on my patch no essential
> changes.
>

>
>
>>
>> Mainly this would make sure that if timeout part of the overall test
>> time limit gets multiplied, because we are running on a slow system, the
>> runtime will not. Because we could control the timeout and runtime
>> separately.
>>
>

Ah, you mean we multiply the overall test time limit results->timeout,
right?
e.g.

    results->timeout =  (default_30s_timeout + tst_test->timeout) *
TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL;
    if (tst_has_slow_kconfig())
            results->timeout *= 4;




>
>> > > Not only, the default 30s timeout is for the whole testrun for tests
>> > > that are quick enough. We have a lot of tests that run just for less
>> > > than 1s even on small embedded boards.
>> >
>> > And yes, with properly set data, 30s could be carefully lowered down a
>> bit.
>>
>> That was my long term plan.
>>
>> --
>> Cyril Hrubis
>> chrubis@suse.cz
>>
>>
>
> --
> Regards,
> Li Wang
>


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-07  6:28                     ` Li Wang
@ 2025-01-07 12:42                       ` Cyril Hrubis
  2025-01-07 16:49                         ` Petr Vorel
  0 siblings, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-07 12:42 UTC (permalink / raw)
  To: Li Wang; +Cc: Martin Doucha, ltp

Hi!
> Ah, you mean we multiply the overall test time limit results->timeout,
> right?
> e.g.
> 
>     results->timeout =  (default_30s_timeout + tst_test->timeout) *
> TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL;
>     if (tst_has_slow_kconfig())
>             results->timeout *= 4;

That would work too, but since the runtime will be always constant (once
only the test that call tst_remaning_runtime() use runtime and the rest
of the tests is moved to timeout) we may as well multiply the timeout
part.

However this detail does not matter that much, the most imporatant part
is the clear separation of the guessed upper bound and the actual
runtime that is used to controll how long should the loop in the test
spin.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-07 12:42                       ` Cyril Hrubis
@ 2025-01-07 16:49                         ` Petr Vorel
  2025-01-08  1:47                           ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-07 16:49 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Martin Doucha, ltp

> Hi!
> > Ah, you mean we multiply the overall test time limit results->timeout,
> > right?
> > e.g.

> >     results->timeout =  (default_30s_timeout + tst_test->timeout) *
> > TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL;
> >     if (tst_has_slow_kconfig())
> >             results->timeout *= 4;

> That would work too, but since the runtime will be always constant (once
> only the test that call tst_remaning_runtime() use runtime and the rest
> of the tests is moved to timeout) we may as well multiply the timeout
> part.

> However this detail does not matter that much, the most imporatant part
> is the clear separation of the guessed upper bound and the actual
> runtime that is used to controll how long should the loop in the test
> spin.

Thanks this for explanation.

BTW do you plan to send a patch soon? If not I would prefer to temporarily
revert 2da30df24e (or just *= 4), which causes some tests to timeout.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-07 16:49                         ` Petr Vorel
@ 2025-01-08  1:47                           ` Li Wang
  2025-01-08  2:51                             ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Li Wang @ 2025-01-08  1:47 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

On Wed, Jan 8, 2025 at 12:49 AM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi!
> > > Ah, you mean we multiply the overall test time limit results->timeout,
> > > right?
> > > e.g.
>
> > >     results->timeout =  (default_30s_timeout + tst_test->timeout) *
> > > TIMEOUT_MUL + tst_test->runtime * RUNTIME_MUL;
> > >     if (tst_has_slow_kconfig())
> > >             results->timeout *= 4;
>
> > That would work too, but since the runtime will be always constant (once
> > only the test that call tst_remaning_runtime() use runtime and the rest
> > of the tests is moved to timeout) we may as well multiply the timeout
> > part.
>
> > However this detail does not matter that much, the most imporatant part
> > is the clear separation of the guessed upper bound and the actual
> > runtime that is used to controll how long should the loop in the test
> > spin.
>
> Thanks this for explanation.
>
> BTW do you plan to send a patch soon? If not I would prefer to temporarily
> revert 2da30df24e (or just *= 4), which causes some tests to timeout.
>

Petr, Cyril, all,

I am now working on the new patch, but it's not guaranteed the new one
will be ready for the coming release.

It's fine to apply this one, at least it addresses the side effects and
extend
timeout time for most tests.



-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD
  2025-01-08  1:47                           ` Li Wang
@ 2025-01-08  2:51                             ` Li Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Li Wang @ 2025-01-08  2:51 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Martin Doucha, ltp

Hi All,

I boldly merge this patch because it could temporarily resolve the
side effects and help extend the time (on slow kernel) for most tests.

And, based on the above discussion I will fully work on the new
timeout+runtime method improvement this week.
Hope we can get that down for the coming release.

Thanks everyone for the suggestions!

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2024-12-22  7:22 [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Li Wang
  2024-12-22  7:22 ` [LTP] [PATCH 2/2] starvation: skip test on slow kernel Li Wang
  2025-01-02 12:43 ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Petr Vorel
@ 2025-01-15 22:41 ` Petr Vorel
  2025-01-16  6:54   ` Li Wang
  2 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-15 22:41 UTC (permalink / raw)
  To: Li Wang, Cyril Hrubis; +Cc: Martin Doucha, ltp

Hi Li, Cyril, all,

...
> +++ b/lib/tst_test.c
> @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

>  	parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> -	if (tst_has_slow_kconfig())
> -		max_runtime *= 4;
> -
>  	return max_runtime * runtime_mul;
>  }

> @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int timeout)
>  	if (timeout < 1)
>  		tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> +	if (tst_has_slow_kconfig())
> +		timeout *= 4;

FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if detect slow
kconfigs") caused a regression on *all* tests which use tst_net.sh.

First, it detects slow config on everything which uses struct tst_test,
which are unfortunately some tools at testcases/lib/:

$ git grep -l "struct tst_test" testcases/lib/*.c
testcases/lib/tst_device.c // not obvious reason, might be removed
testcases/lib/tst_get_free_pids.c // force messages to be printed from new library
testcases/lib/tst_ns_create.c // .forks_child = 1, Needed by SAFE_CLONE
testcases/lib/tst_ns_exec.c  // .forks_child = 1, Needed by SAFE_CLONE
testcases/lib/tst_run_shell.c // not obvious reason, might be removed

Besides unimportant fact that slow config detection is an unnecessary slowdown
on these tools, the problem is that it prints messages to stderr, which causes
breakage.

Previously tst_ns_exec.c run just command passed by -c parameter:

$ ./tst_ns_exec 14536 net,mnt sh -c " cat /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
0

Now it prints TINFO:
$ ./tst_ns_exec 14536 net,mnt sh -c " cat /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which might slow the execution
0

tst_rhost_run() in tst_net.sh runs binary on remote host:

	output=$($rcmd "$sh_cmd" $out 2>&1 || echo 'RTERR')

Redirect stderr to stdout (2>&1) is likely needed for some tests. But
tst_rhost_run() output is often parsed including therefore certain tools in
testcases/lib/ have to print only expected output:

init_ltp_netspace()
{
	...
		pid="$(ROD tst_ns_create net,mnt)"
	...
	export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"

I will probably solve it by adding yet another parameter to tst_rhost_run(),
which ignores stderr and use it for tst_ns_{create,exec}. But maybe there is
another solution (or another problem).

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-15 22:41 ` [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs) Petr Vorel
@ 2025-01-16  6:54   ` Li Wang
  2025-01-16  8:35     ` Petr Vorel
                       ` (3 more replies)
  0 siblings, 4 replies; 40+ messages in thread
From: Li Wang @ 2025-01-16  6:54 UTC (permalink / raw)
  To: Petr Vorel; +Cc: ltp, Martin Doucha

On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li, Cyril, all,
>
> ...
> > +++ b/lib/tst_test.c
> > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)
>
> >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);
>
> > -     if (tst_has_slow_kconfig())
> > -             max_runtime *= 4;
> > -
> >       return max_runtime * runtime_mul;
> >  }
>
> > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> timeout)
> >       if (timeout < 1)
> >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);
>
> > +     if (tst_has_slow_kconfig())
> > +             timeout *= 4;
>
> FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> detect slow
> kconfigs") caused a regression on *all* tests which use tst_net.sh.
>
> First, it detects slow config on everything which uses struct tst_test,
> which are unfortunately some tools at testcases/lib/:
>
> $ git grep -l "struct tst_test" testcases/lib/*.c
> testcases/lib/tst_device.c // not obvious reason, might be removed
> testcases/lib/tst_get_free_pids.c // force messages to be printed from new
> library
> testcases/lib/tst_ns_create.c // .forks_child = 1, Needed by SAFE_CLONE
> testcases/lib/tst_ns_exec.c  // .forks_child = 1, Needed by SAFE_CLONE
> testcases/lib/tst_run_shell.c // not obvious reason, might be removed
>
> Besides unimportant fact that slow config detection is an unnecessary
> slowdown
> on these tools, the problem is that it prints messages to stderr, which
> causes
> breakage.
>
> Previously tst_ns_exec.c run just command passed by -c parameter:
>
> $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> 0
>
> Now it prints TINFO:
> $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> might slow the execution
> 0
>
> tst_rhost_run() in tst_net.sh runs binary on remote host:
>
>         output=$($rcmd "$sh_cmd" $out 2>&1 || echo 'RTERR')
>
> Redirect stderr to stdout (2>&1) is likely needed for some tests. But
> tst_rhost_run() output is often parsed including therefore certain tools in
> testcases/lib/ have to print only expected output:
>
> init_ltp_netspace()
> {
>         ...
>                 pid="$(ROD tst_ns_create net,mnt)"
>         ...
>         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"
>
> I will probably solve it by adding yet another parameter to
> tst_rhost_run(),
> which ignores stderr and use it for tst_ns_{create,exec}. But maybe there
> is
> another solution (or another problem).
>

Or, another way is just to set 'tst_test->timeout == TST_UNLIMITED_TIMEOUT'
in those testcase/lib tools.


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16  6:54   ` Li Wang
@ 2025-01-16  8:35     ` Petr Vorel
  2025-01-16  8:38     ` Petr Vorel
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Petr Vorel @ 2025-01-16  8:35 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

> On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li, Cyril, all,

> > ...
> > > +++ b/lib/tst_test.c
> > > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

> > >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> > > -     if (tst_has_slow_kconfig())
> > > -             max_runtime *= 4;
> > > -
> > >       return max_runtime * runtime_mul;
> > >  }

> > > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> > timeout)
> > >       if (timeout < 1)
> > >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> > > +     if (tst_has_slow_kconfig())
> > > +             timeout *= 4;

> > FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> > detect slow
> > kconfigs") caused a regression on *all* tests which use tst_net.sh.

> > First, it detects slow config on everything which uses struct tst_test,
> > which are unfortunately some tools at testcases/lib/:

> > $ git grep -l "struct tst_test" testcases/lib/*.c
> > testcases/lib/tst_device.c // not obvious reason, might be removed
> > testcases/lib/tst_get_free_pids.c // force messages to be printed from new
> > library
> > testcases/lib/tst_ns_create.c // .forks_child = 1, Needed by SAFE_CLONE
> > testcases/lib/tst_ns_exec.c  // .forks_child = 1, Needed by SAFE_CLONE
> > testcases/lib/tst_run_shell.c // not obvious reason, might be removed

> > Besides unimportant fact that slow config detection is an unnecessary
> > slowdown
> > on these tools, the problem is that it prints messages to stderr, which
> > causes
> > breakage.

> > Previously tst_ns_exec.c run just command passed by -c parameter:

> > $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> > /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> > 0

> > Now it prints TINFO:
> > $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> > /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> > tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> > tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> > might slow the execution
> > 0

> > tst_rhost_run() in tst_net.sh runs binary on remote host:

> >         output=$($rcmd "$sh_cmd" $out 2>&1 || echo 'RTERR')

> > Redirect stderr to stdout (2>&1) is likely needed for some tests. But
> > tst_rhost_run() output is often parsed including therefore certain tools in
> > testcases/lib/ have to print only expected output:

> > init_ltp_netspace()
> > {
> >         ...
> >                 pid="$(ROD tst_ns_create net,mnt)"
> >         ...
> >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"

> > I will probably solve it by adding yet another parameter to
> > tst_rhost_run(),
> > which ignores stderr and use it for tst_ns_{create,exec}. But maybe there
> > is
> > another solution (or another problem).


> Or, another way is just to set 'tst_test->timeout == TST_UNLIMITED_TIMEOUT'
> in those testcase/lib tools.

This would be an easiest fix, but OTOH it's another hack which depends on the
implementation (I would prefer to get rid of the need to use struct tst_test
for these tools instead).

Anyway, I'll send a patch to fix tst_net.sh, but ideas how to improve and unify
testcases/lib are welcome.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16  6:54   ` Li Wang
  2025-01-16  8:35     ` Petr Vorel
@ 2025-01-16  8:38     ` Petr Vorel
  2025-01-16 12:10       ` Li Wang
  2025-01-16 12:56     ` Petr Vorel
  2025-01-20  9:07     ` [LTP] [REGRESSION] pidns05 timeout " Petr Vorel
  3 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-16  8:38 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

...
> > init_ltp_netspace()
> > {
> >         ...
> >                 pid="$(ROD tst_ns_create net,mnt)"
> >         ...
> >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"

> > I will probably solve it by adding yet another parameter to
> > tst_rhost_run(),
> > which ignores stderr and use it for tst_ns_{create,exec}. But maybe there
> > is
> > another solution (or another problem).


> Or, another way is just to set 'tst_test->timeout == TST_UNLIMITED_TIMEOUT'
> in those testcase/lib tools.

Or, we could have yet another flag TST_SKIP_SLOW_DETECTION which could be used
for these tools.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16  8:38     ` Petr Vorel
@ 2025-01-16 12:10       ` Li Wang
  2025-01-16 12:13         ` Li Wang
  2025-01-16 13:13         ` Petr Vorel
  0 siblings, 2 replies; 40+ messages in thread
From: Li Wang @ 2025-01-16 12:10 UTC (permalink / raw)
  To: Petr Vorel; +Cc: ltp, Martin Doucha

On Thu, Jan 16, 2025 at 4:38 PM Petr Vorel <pvorel@suse.cz> wrote:

> ...
> > > init_ltp_netspace()
> > > {
> > >         ...
> > >                 pid="$(ROD tst_ns_create net,mnt)"
> > >         ...
> > >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"
>
> > > I will probably solve it by adding yet another parameter to
> > > tst_rhost_run(),
> > > which ignores stderr and use it for tst_ns_{create,exec}. But maybe
> there
> > > is
> > > another solution (or another problem).
>
>
> > Or, another way is just to set 'tst_test->timeout ==
> TST_UNLIMITED_TIMEOUT'
> > in those testcase/lib tools.
>
> Or, we could have yet another flag TST_SKIP_SLOW_DETECTION which could be
> used
> for these tools.
>

Sounds better. But maybe name it TST_NO_SLOW_CONFIG_CHECK ?

diff --git a/lib/tst_test.c b/lib/tst_test.c
index b204ad975..1e9504f29 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1702,8 +1702,10 @@ unsigned int tst_multiply_timeout(unsigned int
timeout)
        if (timeout < 1)
                tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

+#ifndef TST_NO_SLOW_CONFIG_CHECK
        if (tst_has_slow_kconfig())
                timeout *= 4;
+#endif

        return timeout * timeout_mul;
 }
diff --git a/testcases/lib/tst_ns_exec.c b/testcases/lib/tst_ns_exec.c
index 6a8e39339..a1b780ab7 100644
--- a/testcases/lib/tst_ns_exec.c
+++ b/testcases/lib/tst_ns_exec.c
@@ -14,6 +14,7 @@
  */

 #define TST_NO_DEFAULT_MAIN
+#define TST_NO_SLOW_CONFIG_CHECK

 #include <stdio.h>
 #include <sys/wait.h>


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16 12:10       ` Li Wang
@ 2025-01-16 12:13         ` Li Wang
  2025-01-16 13:13         ` Petr Vorel
  1 sibling, 0 replies; 40+ messages in thread
From: Li Wang @ 2025-01-16 12:13 UTC (permalink / raw)
  To: Petr Vorel; +Cc: ltp, Martin Doucha

On Thu, Jan 16, 2025 at 8:10 PM Li Wang <liwang@redhat.com> wrote:

>
>
> On Thu, Jan 16, 2025 at 4:38 PM Petr Vorel <pvorel@suse.cz> wrote:
>
>> ...
>> > > init_ltp_netspace()
>> > > {
>> > >         ...
>> > >                 pid="$(ROD tst_ns_create net,mnt)"
>> > >         ...
>> > >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"
>>
>> > > I will probably solve it by adding yet another parameter to
>> > > tst_rhost_run(),
>> > > which ignores stderr and use it for tst_ns_{create,exec}. But maybe
>> there
>> > > is
>> > > another solution (or another problem).
>>
>>
>> > Or, another way is just to set 'tst_test->timeout ==
>> TST_UNLIMITED_TIMEOUT'
>> > in those testcase/lib tools.
>>
>> Or, we could have yet another flag TST_SKIP_SLOW_DETECTION which could be
>> used
>> for these tools.
>>
>
> Sounds better. But maybe name it TST_NO_SLOW_CONFIG_CHECK ?
>

Ah, typo with missing 'K'.

s/TST_NO_SLOW_CONFIG_CHECK/TST_NO_SLOW_KCONFIG_CHECK

>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16  6:54   ` Li Wang
  2025-01-16  8:35     ` Petr Vorel
  2025-01-16  8:38     ` Petr Vorel
@ 2025-01-16 12:56     ` Petr Vorel
  2025-01-20  9:07     ` [LTP] [REGRESSION] pidns05 timeout " Petr Vorel
  3 siblings, 0 replies; 40+ messages in thread
From: Petr Vorel @ 2025-01-16 12:56 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

> On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li, Cyril, all,

> > ...
> > > +++ b/lib/tst_test.c
> > > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

> > >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> > > -     if (tst_has_slow_kconfig())
> > > -             max_runtime *= 4;
> > > -
> > >       return max_runtime * runtime_mul;
> > >  }

> > > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> > timeout)
> > >       if (timeout < 1)
> > >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> > > +     if (tst_has_slow_kconfig())
> > > +             timeout *= 4;

> > FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> > detect slow
> > kconfigs") caused a regression on *all* tests which use tst_net.sh.

> > First, it detects slow config on everything which uses struct tst_test,
> > which are unfortunately some tools at testcases/lib/:

> > $ git grep -l "struct tst_test" testcases/lib/*.c
> > testcases/lib/tst_device.c // not obvious reason, might be removed
> > testcases/lib/tst_get_free_pids.c // force messages to be printed from new
> > library
> > testcases/lib/tst_ns_create.c // .forks_child = 1, Needed by SAFE_CLONE
> > testcases/lib/tst_ns_exec.c  // .forks_child = 1, Needed by SAFE_CLONE
> > testcases/lib/tst_run_shell.c // not obvious reason, might be removed

> > Besides unimportant fact that slow config detection is an unnecessary
> > slowdown
> > on these tools, the problem is that it prints messages to stderr, which
> > causes
> > breakage.

> > Previously tst_ns_exec.c run just command passed by -c parameter:

> > $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> > /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> > 0

> > Now it prints TINFO:
> > $ ./tst_ns_exec 14536 net,mnt sh -c " cat
> > /proc/sys/net/ipv6/conf/ltp_ns_veth1/disable_ipv6"
> > tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> > tst_kconfig.c:667: TINFO: CONFIG_LATENCYTOP kernel option detected which
> > might slow the execution
> > 0

> > tst_rhost_run() in tst_net.sh runs binary on remote host:

> >         output=$($rcmd "$sh_cmd" $out 2>&1 || echo 'RTERR')

> > Redirect stderr to stdout (2>&1) is likely needed for some tests. But
> > tst_rhost_run() output is often parsed including therefore certain tools in
> > testcases/lib/ have to print only expected output:

> > init_ltp_netspace()
> > {
> >         ...
> >                 pid="$(ROD tst_ns_create net,mnt)"
> >         ...
> >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"

> > I will probably solve it by adding yet another parameter to
> > tst_rhost_run(),
> > which ignores stderr and use it for tst_ns_{create,exec}. But maybe there

Actually tst_ns_exec.c needs to be quiet to not clutter the output of the
command it runs via tst_rhost_run() for netns (via $LTP_NETNS). Therefore adding
an quiet option to tst_rhost_run() is not a solution.

> > is
> > another solution (or another problem).


> Or, another way is just to set 'tst_test->timeout == TST_UNLIMITED_TIMEOUT'
> in those testcase/lib tools.

Unfortunately this does not work. In the end I probably filter out 'TINFO:'
messages in tst_rhost_run().

It'd be safer to avoid using library for tst_ns_exec.c, but it's also used in
containers shell tests (netns_lib.sh, netns_sysfs.sh) and fs_bind tests
(fs_bind_cloneNS*.sh).

Also, tst_ns_exec.c (and tst_ns_create.c) could be for netns replaced by
'ip netns' (revert back change in 2017 3fb501e04c ("net: use LTP ns_create/ns_exec")

Maybe other tools which use tst_ns_{create,exec,ifmove}.c could be replaced by
some tools from util-linux (unshare ?). We consider having own binary as better
than external dependency, but binary needs to be simple to avoid problems when
LTP library changes.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16 12:10       ` Li Wang
  2025-01-16 12:13         ` Li Wang
@ 2025-01-16 13:13         ` Petr Vorel
  1 sibling, 0 replies; 40+ messages in thread
From: Petr Vorel @ 2025-01-16 13:13 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

> On Thu, Jan 16, 2025 at 4:38 PM Petr Vorel <pvorel@suse.cz> wrote:

> > ...
> > > > init_ltp_netspace()
> > > > {
> > > >         ...
> > > >                 pid="$(ROD tst_ns_create net,mnt)"
> > > >         ...
> > > >         export LTP_NETNS="${LTP_NETNS:-tst_ns_exec $pid net,mnt}"

> > > > I will probably solve it by adding yet another parameter to
> > > > tst_rhost_run(),
> > > > which ignores stderr and use it for tst_ns_{create,exec}. But maybe
> > there
> > > > is
> > > > another solution (or another problem).


> > > Or, another way is just to set 'tst_test->timeout ==
> > TST_UNLIMITED_TIMEOUT'
> > > in those testcase/lib tools.

> > Or, we could have yet another flag TST_SKIP_SLOW_DETECTION which could be
> > used
> > for these tools.


> Sounds better. But maybe name it TST_NO_SLOW_CONFIG_CHECK ?

> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index b204ad975..1e9504f29 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1702,8 +1702,10 @@ unsigned int tst_multiply_timeout(unsigned int
> timeout)
>         if (timeout < 1)
>                 tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> +#ifndef TST_NO_SLOW_CONFIG_CHECK
>         if (tst_has_slow_kconfig())
>                 timeout *= 4;
> +#endif

>         return timeout * timeout_mul;
>  }
> diff --git a/testcases/lib/tst_ns_exec.c b/testcases/lib/tst_ns_exec.c
> index 6a8e39339..a1b780ab7 100644
> --- a/testcases/lib/tst_ns_exec.c
> +++ b/testcases/lib/tst_ns_exec.c
> @@ -14,6 +14,7 @@
>   */

>  #define TST_NO_DEFAULT_MAIN
> +#define TST_NO_SLOW_CONFIG_CHECK

>  #include <stdio.h>
>  #include <sys/wait.h>

Looks useful, I vote to add it for all tools in testcases/lib/ which define
TST_NO_DEFAULT_MAIN.

FYI I added other hardening, which prevents other potential problems with
unwanted output from the library:
https://patchwork.ozlabs.org/project/ltp/patch/20250116130936.700441-1-pvorel@suse.cz/

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] pidns05 timeout (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-16  6:54   ` Li Wang
                       ` (2 preceding siblings ...)
  2025-01-16 12:56     ` Petr Vorel
@ 2025-01-20  9:07     ` Petr Vorel
  2025-01-20  9:11       ` Petr Vorel
  3 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-20  9:07 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

> On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li, Cyril, all,

> > ...
> > > +++ b/lib/tst_test.c
> > > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

> > >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> > > -     if (tst_has_slow_kconfig())
> > > -             max_runtime *= 4;
> > > -
> > >       return max_runtime * runtime_mul;
> > >  }

> > > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> > timeout)
> > >       if (timeout < 1)
> > >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> > > +     if (tst_has_slow_kconfig())
> > > +             timeout *= 4;

> > FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> > detect slow
> > kconfigs") caused a regression on *all* tests which use tst_net.sh.
...

FYI also at least pidns05.c sometimes fails due timeout with this change.
On some of SLES product previously pidns05.c run for 3 sec. With this change it
runs 12s and therefore timeouts.

In pidns05.c case child is run 5x. For each of this child we again detect if we
run on slow config. Maybe we should have used struct tst_test member to cache
the value.

What bothers me more that how much time we waste for whole LTP testing with
repeatedly detecting slow config for all tests (runtest/syscalls has 1457 items,
we run it more times for each product with different kernel cmdline parameters).
I don't know what was supposed to be fixed by this feature, is it really worth
of slowdown? Why not just set LTP_RUNTIME_MUL=2 on slow kernels? We could have
tool which would 'exit 1' on "slow" kernel and 'exit 0' on normal kernel to do
automatic detection, which could be run by frameworks just once.

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] pidns05 timeout (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-20  9:07     ` [LTP] [REGRESSION] pidns05 timeout " Petr Vorel
@ 2025-01-20  9:11       ` Petr Vorel
  2025-01-20 10:18         ` Li Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Petr Vorel @ 2025-01-20  9:11 UTC (permalink / raw)
  To: Li Wang, Cyril Hrubis, ltp, Avinesh Kumar, Martin Doucha,
	Jan Stancek, Andrea Cervesato

> > On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:

> > > Hi Li, Cyril, all,

> > > ...
> > > > +++ b/lib/tst_test.c
> > > > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)

> > > >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);

> > > > -     if (tst_has_slow_kconfig())
> > > > -             max_runtime *= 4;
> > > > -
> > > >       return max_runtime * runtime_mul;
> > > >  }

> > > > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned int
> > > timeout)
> > > >       if (timeout < 1)
> > > >               tst_brk(TBROK, "timeout must to be >= 1! (%d)", timeout);

> > > > +     if (tst_has_slow_kconfig())
> > > > +             timeout *= 4;

> > > FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> > > detect slow
> > > kconfigs") caused a regression on *all* tests which use tst_net.sh.
> ...

> FYI also at least pidns05.c sometimes fails due timeout with this change.
> On some of SLES product previously pidns05.c run for 3 sec. With this change it
> runs 12s and therefore timeouts.

I'm sorry for a wrong report. Looking about it twice there is "*** stack
smashing detected ***: terminated" => some other problem, which causes slow
down. IMHO it's not optimal to run the detection many times + basically now
requiring kernel config for each LTP test, but performance impact is probably
low.

Kind regards,
Petr

> In pidns05.c case child is run 5x. For each of this child we again detect if we
> run on slow config. Maybe we should have used struct tst_test member to cache
> the value.

> What bothers me more that how much time we waste for whole LTP testing with
> repeatedly detecting slow config for all tests (runtest/syscalls has 1457 items,
> we run it more times for each product with different kernel cmdline parameters).
> I don't know what was supposed to be fixed by this feature, is it really worth
> of slowdown? Why not just set LTP_RUNTIME_MUL=2 on slow kernels? We could have
> tool which would 'exit 1' on "slow" kernel and 'exit 0' on normal kernel to do
> automatic detection, which could be run by frameworks just once.

> Kind regards,
> Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] pidns05 timeout (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-20  9:11       ` Petr Vorel
@ 2025-01-20 10:18         ` Li Wang
  2025-01-20 12:29           ` Cyril Hrubis
  0 siblings, 1 reply; 40+ messages in thread
From: Li Wang @ 2025-01-20 10:18 UTC (permalink / raw)
  To: Petr Vorel; +Cc: ltp, Martin Doucha

On Mon, Jan 20, 2025 at 5:12 PM Petr Vorel <pvorel@suse.cz> wrote:

> > > On Thu, Jan 16, 2025 at 6:42 AM Petr Vorel <pvorel@suse.cz> wrote:
>
> > > > Hi Li, Cyril, all,
>
> > > > ...
> > > > > +++ b/lib/tst_test.c
> > > > > @@ -555,9 +555,6 @@ static int multiply_runtime(int max_runtime)
>
> > > > >       parse_mul(&runtime_mul, "LTP_RUNTIME_MUL", 0.0099, 100);
>
> > > > > -     if (tst_has_slow_kconfig())
> > > > > -             max_runtime *= 4;
> > > > > -
> > > > >       return max_runtime * runtime_mul;
> > > > >  }
>
> > > > > @@ -1706,6 +1703,9 @@ unsigned int tst_multiply_timeout(unsigned
> int
> > > > timeout)
> > > > >       if (timeout < 1)
> > > > >               tst_brk(TBROK, "timeout must to be >= 1! (%d)",
> timeout);
>
> > > > > +     if (tst_has_slow_kconfig())
> > > > > +             timeout *= 4;
>
> > > > FYI this change, merged as 893ca0abe7 ("lib: multiply the timeout if
> > > > detect slow
> > > > kconfigs") caused a regression on *all* tests which use tst_net.sh.
> > ...
>
> > FYI also at least pidns05.c sometimes fails due timeout with this change.
> > On some of SLES product previously pidns05.c run for 3 sec. With this
> change it
> > runs 12s and therefore timeouts.
>
> I'm sorry for a wrong report. Looking about it twice there is "*** stack
> smashing detected ***: terminated" => some other problem, which causes slow
> down. IMHO it's not optimal to run the detection many times + basically now
> requiring kernel config for each LTP test, but performance impact is
> probably
> low.
>

Yes, I understand your concern. I agree that we can avoid the
kconfig detection for most faster test cases.

After reviewing my previous pre-release LTP test and compare to
the current main branch test, it doesn't show much more time-consuming,
maybe there is different HW/kernel factors, but the performance
impact can be ignored.




>
> Kind regards,
> Petr
>
> > In pidns05.c case child is run 5x. For each of this child we again
> detect if we
> > run on slow config. Maybe we should have used struct tst_test member to
> cache
> > the value.
>
> > What bothers me more that how much time we waste for whole LTP testing
> with
> > repeatedly detecting slow config for all tests (runtest/syscalls has
> 1457 items,
> > we run it more times for each product with different kernel cmdline
> parameters).
> > I don't know what was supposed to be fixed by this feature, is it really
> worth
> > of slowdown? Why not just set LTP_RUNTIME_MUL=2 on slow kernels? We
> could have
> > tool which would 'exit 1' on "slow" kernel and 'exit 0' on normal kernel
> to do
> > automatic detection, which could be run by frameworks just once.
>
> > Kind regards,
> > Petr
>
>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] pidns05 timeout (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-20 10:18         ` Li Wang
@ 2025-01-20 12:29           ` Cyril Hrubis
  2025-01-20 13:03             ` Petr Vorel
  0 siblings, 1 reply; 40+ messages in thread
From: Cyril Hrubis @ 2025-01-20 12:29 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp, Martin Doucha

Hi!
> Yes, I understand your concern. I agree that we can avoid the
> kconfig detection for most faster test cases.
> 
> After reviewing my previous pre-release LTP test and compare to
> the current main branch test, it doesn't show much more time-consuming,
> maybe there is different HW/kernel factors, but the performance
> impact can be ignored.

I've just send a patch that caches the detection results, because in the
case of pins05 we did parse the config for each SAFE_CLONE() call.

-- 
Cyril Hrubis
chrubis@suse.cz

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [LTP] [REGRESSION] pidns05 timeout (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs)
  2025-01-20 12:29           ` Cyril Hrubis
@ 2025-01-20 13:03             ` Petr Vorel
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Vorel @ 2025-01-20 13:03 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: ltp

> Hi!
> > Yes, I understand your concern. I agree that we can avoid the
> > kconfig detection for most faster test cases.

> > After reviewing my previous pre-release LTP test and compare to
> > the current main branch test, it doesn't show much more time-consuming,
> > maybe there is different HW/kernel factors, but the performance
> > impact can be ignored.

> I've just send a patch that caches the detection results, because in the
> case of pins05 we did parse the config for each SAFE_CLONE() call.

Thanks! Elegant solution, I should have come up with static myself instead of
complaining :).

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2025-01-20 13:03 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-22  7:22 [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Li Wang
2024-12-22  7:22 ` [LTP] [PATCH 2/2] starvation: skip test on slow kernel Li Wang
2025-01-02 12:56   ` Petr Vorel
2025-01-02 14:31     ` Petr Vorel
2025-01-03  7:53       ` Li Wang
2025-01-02 12:43 ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs Petr Vorel
2025-01-03  5:00   ` Li Wang
2025-01-03  7:06     ` Petr Vorel
2025-01-03  7:33       ` Petr Vorel
2025-01-03  7:58         ` Li Wang
2025-01-03 15:48     ` [LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigsD Cyril Hrubis
2025-01-04  3:38       ` Li Wang
2025-01-06  9:35         ` Petr Vorel
2025-01-06 12:10         ` Cyril Hrubis
2025-01-06 12:52           ` Petr Vorel
2025-01-06 13:39             ` Cyril Hrubis
2025-01-06 15:36               ` Petr Vorel
2025-01-06 16:19                 ` Cyril Hrubis
2025-01-07  5:37                   ` Li Wang
2025-01-07  6:28                     ` Li Wang
2025-01-07 12:42                       ` Cyril Hrubis
2025-01-07 16:49                         ` Petr Vorel
2025-01-08  1:47                           ` Li Wang
2025-01-08  2:51                             ` Li Wang
2025-01-06 16:03           ` Martin Doucha
2025-01-06 16:21             ` Cyril Hrubis
2025-01-06 16:49               ` Martin Doucha
2025-01-15 22:41 ` [LTP] [REGRESSION] Broken tests using tst_net.sh by 893ca0abe7 (was: [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs) Petr Vorel
2025-01-16  6:54   ` Li Wang
2025-01-16  8:35     ` Petr Vorel
2025-01-16  8:38     ` Petr Vorel
2025-01-16 12:10       ` Li Wang
2025-01-16 12:13         ` Li Wang
2025-01-16 13:13         ` Petr Vorel
2025-01-16 12:56     ` Petr Vorel
2025-01-20  9:07     ` [LTP] [REGRESSION] pidns05 timeout " Petr Vorel
2025-01-20  9:11       ` Petr Vorel
2025-01-20 10:18         ` Li Wang
2025-01-20 12:29           ` Cyril Hrubis
2025-01-20 13:03             ` Petr Vorel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox