* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
@ 2017-05-05 13:18 Cyril Hrubis
2017-05-11 12:29 ` Cyril Hrubis
2017-05-11 15:54 ` Jan Stancek
0 siblings, 2 replies; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-05 13:18 UTC (permalink / raw)
To: ltp
The testcase was failing frequently on virtual machines where the
thresholds were too tight, so I've revisited the threshold computation
and came up with this model:
The kernel timers are allowed to slack for 0.1% of the sleep time, capped
at 100ms. We add 40us to this number to accommodate context switches,
etc., and multiply it by the number of iterations. This works
fine when the number of iterations is large enough (> 400), where the
outliers are averaged out, but the test was still failing rarely when
we hit an outlier and the number of iterations was relatively small. The
solution is to add an arbitrary number divided by the number of
iterations to the threshold, so that the thresholds for a large enough
number of iterations are not affected.
We should probably also tune the thresholds separately for virtual
machines and for real hardware, and maybe we should also put this piece
of code into the test library so that it can be easily reused in all
testcases that measure calls that can sleep in the kernel. I will look
into this later, once we are done with the release and git is open to
more than just important fixes.
Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
---
testcases/kernel/syscalls/pselect/pselect01.c | 51 +++++++++++++++++++++++----
1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/testcases/kernel/syscalls/pselect/pselect01.c b/testcases/kernel/syscalls/pselect/pselect01.c
index 6f3b77b..a544ae5 100644
--- a/testcases/kernel/syscalls/pselect/pselect01.c
+++ b/testcases/kernel/syscalls/pselect/pselect01.c
@@ -42,13 +42,34 @@ struct tcase {
};
static struct tcase tcases[] = {
- {{1, 0}, 1},
- {{0, 1000000}, 100},
- {{0, 2000000}, 100},
- {{0, 10000000}, 10},
- {{0, 100000000}, 1},
+ {{0, 1000000}, 800},
+ {{0, 2000000}, 500},
+ {{0, 10000000}, 100},
+ {{0, 100000000}, 1},
+ {{1, 0}, 1},
};
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+/*
+ * The threshold is computed as 0.1% of the runtime capped at 100ms (which is
+ * what kernel timers are allowed to slack off) + 40us to accommodate
+ * context switches, etc., multiplied by the number of iterations.
+ *
+ * We also allow for outliers, i.e. add some number to the threshold in case
+ * the number of iterations is small. For a large enough number of iterations
+ * outliers are averaged out.
+ *
+ * With this the test can pass reliably even on a virtual machine with the
+ * host under heavy CPU load.
+ */
+static int compute_threshold(long long requested_us, unsigned int iterations)
+{
+ unsigned int slack_per_run = MIN(100000, requested_us / 1000);
+
+ return (40 + slack_per_run) * iterations + 20000 / (iterations/200 + 1);
+}
+
static void verify_pselect(unsigned int n)
{
fd_set readfds;
@@ -62,6 +83,8 @@ static void verify_pselect(unsigned int n)
t->tv.tv_sec, t->tv.tv_nsec, t->iterations);
for (i = 0; i < t->iterations; i++) {
+ long long elapsed_us;
+
FD_ZERO(&readfds);
FD_SET(0, &readfds);
@@ -71,11 +94,25 @@ static void verify_pselect(unsigned int n)
pselect(0, &readfds, NULL, NULL, &tv, NULL);
tst_timer_stop();
- slept_us += tst_timer_elapsed_us();
+ elapsed_us = tst_timer_elapsed_us();
+
+ if (elapsed_us >= 10 * tst_timespec_to_us(t->tv)) {
+ tst_res(TINFO,
+ "Found outlier, took %lli us, expected %lli us",
+ elapsed_us, tst_timespec_to_us(t->tv));
+ }
+
+ slept_us += elapsed_us;
}
requested_us = tst_timespec_to_us(t->tv) * t->iterations;
- threshold = requested_us / 100 + 200 * t->iterations;
+ threshold = compute_threshold(requested_us, t->iterations);
+
+ if (t->iterations > 1) {
+ tst_res(TINFO, "Mean sleep time %.2f us, expected %lli us, threshold %.2f",
+ 1.00 * slept_us / t->iterations,
+ tst_timespec_to_us(t->tv), 1.00 * threshold / t->iterations);
+ }
if (slept_us < requested_us) {
tst_res(TFAIL,
--
2.10.2
^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-05 13:18 [LTP] [RFC] [PATCH] pselect01: Tune thresholds Cyril Hrubis
@ 2017-05-11 12:29 ` Cyril Hrubis
2017-05-11 15:54 ` Jan Stancek
1 sibling, 0 replies; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-11 12:29 UTC (permalink / raw)
To: ltp
Hi!
Does anybody have any comments on this patch?
This and the move_pages12 failures still block the release from my
point of view.
--
Cyril Hrubis
chrubis@suse.cz
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-05 13:18 [LTP] [RFC] [PATCH] pselect01: Tune thresholds Cyril Hrubis
2017-05-11 12:29 ` Cyril Hrubis
@ 2017-05-11 15:54 ` Jan Stancek
2017-05-12 6:51 ` Cyril Hrubis
1 sibling, 1 reply; 11+ messages in thread
From: Jan Stancek @ 2017-05-11 15:54 UTC (permalink / raw)
To: ltp
----- Original Message -----
> +static int compute_threshold(long long requested_us, unsigned int
> iterations)
> +{
> + unsigned int slack_per_run = MIN(100000, requested_us / 1000);
> +
> + return (40 + slack_per_run) * iterations + 20000 / (iterations/200 + 1);
> +}
requested_us = tst_timespec_to_us(t->tv) * t->iterations;
If slack_per_run is slack of all iterations, why is it multiplied
again with number of iterations?
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-11 15:54 ` Jan Stancek
@ 2017-05-12 6:51 ` Cyril Hrubis
2017-05-12 7:11 ` Jan Stancek
0 siblings, 1 reply; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-12 6:51 UTC (permalink / raw)
To: ltp
Hi!
> > +static int compute_threshold(long long requested_us, unsigned int
> > iterations)
> > +{
> > + unsigned int slack_per_run = MIN(100000, requested_us / 1000);
> > +
> > + return (40 + slack_per_run) * iterations + 20000 / (iterations/200 + 1);
> > +}
>
> requested_us = tst_timespec_to_us(t->tv) * t->iterations;
>
> If slack_per_run is slack of all iterations, why is it multiplied
> again with number of iterations?
That variable is poorly named. It should have been slack_per_call or
slack_per_iteration. It's slack per one syscall computed using the same
formula that is used in kernel...
--
Cyril Hrubis
chrubis@suse.cz
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 6:51 ` Cyril Hrubis
@ 2017-05-12 7:11 ` Jan Stancek
2017-05-12 7:17 ` Cyril Hrubis
0 siblings, 1 reply; 11+ messages in thread
From: Jan Stancek @ 2017-05-12 7:11 UTC (permalink / raw)
To: ltp
----- Original Message -----
> Hi!
> > > +static int compute_threshold(long long requested_us, unsigned int
> > > iterations)
> > > +{
> > > + unsigned int slack_per_run = MIN(100000, requested_us / 1000);
> > > +
> > > + return (40 + slack_per_run) * iterations + 20000 / (iterations/200 +
> > > 1);
> > > +}
> >
> > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> >
> > If slack_per_run is slack of all iterations, why is it multiplied
> > again with number of iterations?
>
> That variable is poorly named. It should have been slack_per_call or
> slack_per_iteration. It's slack per one syscall computed using the same
> formula that is used in kernel...
If it's per iteration, shouldn't this function be passed time per iteration?
At the moment, it's time of all iterations:
requested_us = tst_timespec_to_us(t->tv) * t->iterations;
threshold = compute_threshold(requested_us, t->iterations);
Should it be ... ?
threshold = compute_threshold(tst_timespec_to_us(t->tv), t->iterations);
Regards,
Jan
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 7:11 ` Jan Stancek
@ 2017-05-12 7:17 ` Cyril Hrubis
2017-05-12 7:33 ` Jan Stancek
0 siblings, 1 reply; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-12 7:17 UTC (permalink / raw)
To: ltp
Hi!
> > > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > >
> > > If slack_per_run is slack of all iterations, why is it multiplied
> > > again with number of iterations?
> >
> > That variable is poorly named. It should have been slack_per_call or
> > slack_per_iteration. It's slack per one syscall computed using the same
> > formula that is used in kernel...
>
> If it's per iteration, shouldn't this function be passed time per iteration?
> At the moment, it's time of all iterations:
> requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> threshold = compute_threshold(requested_us, t->iterations);
>
> Should it be ... ?
> threshold = compute_threshold(tst_timespec_to_us(t->tv), t->iterations);
Good catch, I will fix that, retest and send v2.
--
Cyril Hrubis
chrubis@suse.cz
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 7:17 ` Cyril Hrubis
@ 2017-05-12 7:33 ` Jan Stancek
2017-05-12 10:04 ` Helge Deller
2017-05-12 15:38 ` Cyril Hrubis
0 siblings, 2 replies; 11+ messages in thread
From: Jan Stancek @ 2017-05-12 7:33 UTC (permalink / raw)
To: ltp
----- Original Message -----
> Hi!
> > > > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > > >
> > > > If slack_per_run is slack of all iterations, why is it multiplied
> > > > again with number of iterations?
> > >
> > > That variable is poorly named. It should have been slack_per_call or
> > > slack_per_iteration. It's slack per one syscall computed using the same
> > > formula that is used in kernel...
> >
> > If it's per iteration, shouldn't this function be passed time per
> > iteration?
> > At the moment, it's time of all iterations:
> > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > threshold = compute_threshold(requested_us, t->iterations);
> >
> > Should it be ... ?
> > threshold = compute_threshold(tst_timespec_to_us(t->tv), t->iterations);
>
> Good catch, I will fix that, retest and send v2.
OK, if we agree it's supposed to be the time of a single iteration,
isn't the new function (in some cases) more strict than it used to be?
Before we had:
threshold = requested_us / 100 + 200 * t->iterations;
which was 1% slack plus a small number (200) that scales with iterations.
Now we have 0.1% slack plus a small number (40) that scales
with iterations, plus a big number (20000) that scales inversely
with the number of iterations.
Say for 1000 us and 1000 iterations we used to have slack
10000 + 200*1000 =~ 210000. And now it's
(40 + 1) * 1000 + 20000 / (1000/200+1) = 41000 + 3333 =~ 44333
Regards,
Jan
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 7:33 ` Jan Stancek
@ 2017-05-12 10:04 ` Helge Deller
2017-05-12 13:17 ` Cyril Hrubis
2017-05-12 15:38 ` Cyril Hrubis
1 sibling, 1 reply; 11+ messages in thread
From: Helge Deller @ 2017-05-12 10:04 UTC (permalink / raw)
To: ltp
> > Hi!
> > > > > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > > > >
> > > > > If slack_per_run is slack of all iterations, why is it multiplied
> > > > > again with number of iterations?
> > > >
> > > > That variable is poorly named. It should have been slack_per_call or
> > > > slack_per_iteration. It's slack per one syscall computed using the same
> > > > formula that is used in kernel...
> > >
> > > If it's per iteration, shouldn't this function be passed time per
> > > iteration?
> > > At the moment, it's time of all iterations:
> > > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > > threshold = compute_threshold(requested_us, t->iterations);
> > >
> > > Should it be ... ?
> > > threshold = compute_threshold(tst_timespec_to_us(t->tv), t->iterations);
> >
> > Good catch, I will fix that, retest and send v2.
>
> OK, if we agree, it's supposed to be time of single iteration,
> isn't new function (in some cases) more strict than it used to be?
>
> Before we had:
> threshold = requested_us / 100 + 200 * t->iterations;
> which was 1% slack plus small number (200) that scales with iterations.
>
> Now we have 0.1% slack plus small number (40) that scales
> > with iterations plus big number (20000) that scales inversely
> > with the number of iterations.
>
> Say for 1000 us and 1000 iterations we used to have slack
> 10000 + 200*1000 =~ 210000. And now it's
> (40 + 1) * 1000 + 20000 / (1000/200+1) = 41000 + 3333 =~ 44333
Additionally, please keep in mind that on some architectures the
measured time is based on the jiffies clock resolution, which depends
on the CONFIG_HZ value and results in resolutions of e.g. 0.01,
0.004, or 0.001 seconds, which is way above the resolution you get
on the x86 platform (with the TSC clock).
This testcase was failing for me on the parisc platform because of this inaccuracy.
Maybe the resolution of the CLOCK_MONOTONIC clock (which is basically used in this
test) should be checked with clock_getres() and its result included in the calculation?
Helge
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 10:04 ` Helge Deller
@ 2017-05-12 13:17 ` Cyril Hrubis
2017-05-12 13:59 ` Cyril Hrubis
0 siblings, 1 reply; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-12 13:17 UTC (permalink / raw)
To: ltp
Hi!
> Additionally, please keep in mind that on some architectures the
> time measured is based on the jiffies clock resolution which depends
> on the CONFIG_HZ value and results in a resolutions of e.g. 0.01,
> 0.004, or 0.001 seconds which is way above the resolution you get
> on the x86 platform (with the TSC clock).
>
> This testcase was failing for me on the parisc platform because of this inaccuracy.
>
> Maybe the resolution of the CLOCK_MONOTONIC clock (which is basically used in this
> test) should be checked with clock_getres() and it's result included in the calculation?
Right, I've dusted off an old pxa270 and indeed we cannot measure less
than a 20us interval there. But clock_getres() returns 10ms, which is
much more than the actual measured value...
--
Cyril Hrubis
chrubis@suse.cz
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 13:17 ` Cyril Hrubis
@ 2017-05-12 13:59 ` Cyril Hrubis
0 siblings, 0 replies; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-12 13:59 UTC (permalink / raw)
To: ltp
Hi!
> > Additionally, please keep in mind that on some architectures the
> > time measured is based on the jiffies clock resolution which depends
> > on the CONFIG_HZ value and results in a resolutions of e.g. 0.01,
> > 0.004, or 0.001 seconds which is way above the resolution you get
> > on the x86 platform (with the TSC clock).
> >
> > This testcase was failing for me on the parisc platform because of this inaccuracy.
> >
> > Maybe the resolution of the CLOCK_MONOTONIC clock (which is basically used in this
> > test) should be checked with clock_getres() and it's result included in the calculation?
>
> Right, I've dusted off old pxa270 and indeed we cannot measure less than
> 20us interval there. But clock_getres() returns 10ms which is much more
> than the actuall measured value...
And I made a mistake in the conversion; it looks like 20ms is the minimal
time pselect() can sleep there while the reported precision is 10ms, so
adding 2 * clock_monotonic_resolution fixes the test there. I will send
v2 of the testcase right away.
--
Cyril Hrubis
chrubis@suse.cz
* [LTP] [RFC] [PATCH] pselect01: Tune thresholds
2017-05-12 7:33 ` Jan Stancek
2017-05-12 10:04 ` Helge Deller
@ 2017-05-12 15:38 ` Cyril Hrubis
1 sibling, 0 replies; 11+ messages in thread
From: Cyril Hrubis @ 2017-05-12 15:38 UTC (permalink / raw)
To: ltp
Hi!
> > > If it's per iteration, shouldn't this function be passed time per
> > > iteration?
> > > At the moment, it's time of all iterations:
> > > requested_us = tst_timespec_to_us(t->tv) * t->iterations;
> > > threshold = compute_threshold(requested_us, t->iterations);
> > >
> > > Should it be ... ?
> > > threshold = compute_threshold(tst_timespec_to_us(t->tv), t->iterations);
> >
> > Good catch, I will fix that, retest and send v2.
>
> OK, if we agree, it's supposed to be time of single iteration,
> isn't new function (in some cases) more strict than it used to be?
>
> Before we had:
> threshold = requested_us / 100 + 200 * t->iterations;
> which was 1% slack plus small number (200) that scales with iterations.
The slack has to be multiplied by the number of iterations. I did
several measurements with the attached program, and on x86 the mean
seems to consist of a constant offset that is somewhere between 50us
and 250us, and a slack that does follow 0.1% of the sleep time.
Here are measurements on otherwise idle AMD Phenom Quad-Core:
min 112 max 301 median 154 mean 154.56 0.1% 0.10
min 520 max 595 median 555 mean 555.37 0.1% 0.50
min 1015 max 1126 median 1056 mean 1056.13 0.1% 1.00
min 2014 max 2128 median 2056 mean 2056.89 0.1% 2.00
min 3018 max 3219 median 3057 mean 3057.80 0.1% 3.00
min 4008 max 4072 median 4057 mean 4056.28 0.1% 4.00
min 10015 max 10275 median 10057 mean 10058.41 0.1% 10.00
min 100010 max 100354 median 100111 mean 100153.25 0.1% 100.00
min 500014 max 500702 median 500499 mean 500474.65 0.1% 500.00
min 1000652 max 1001351 median 1001000 mean 1000999.24 0.1% 1000.00
It looks like the offset is 50us and that the mean more or less follows
the slack. The offset seems to be closer to 250us for virtual machines.
On machines that use only jiffies the timer overshoot is close to twice
the monotonic clock resolution.
> Now we have 0.1% slack plus small number (40) that scales
> with iterations plus big number (20000) that scales inversely
> with the number of iterations.
>
> Say for 1000 us and 1000 iterations we used to have slack
> 10000 + 200*1000 =~ 210000. And now it's
> (40 + 1) * 1000 + 20000 / (1000/200+1) = 41000 + 3333 =~ 44333
See v2. It's more complicated now, but it looks like I've managed to
break it into several pieces, and the new formula works fine but there
has to be at least one idle CPU. When the machine is under heavy load
the sleep times are affected by scheduler slices...
Looking at it again, we could also play it safe and allow 0.2% slack,
but otherwise I think I got it right this time.
--
Cyril Hrubis
chrubis@suse.cz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: measure.c
Type: text/x-c
Size: 2371 bytes
Desc: not available
URL: <http://lists.linux.it/pipermail/ltp/attachments/20170512/16580293/attachment-0001.bin>