* [OSSTEST PATCH 0/9] Host allocation scoring improvements
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Campbell
Here, I'm trying to fix the way that osstest gets far too obsessed
about particular failing hosts.
1/9 cs-adjust-flight: Fix doc about /<pcre> to match
2/9 ts-hosts-allocate-Executive: allow uncompressed
3/9 Osstest/Executive.pm: Debug log same-host status
4/9 ts-hosts-allocate-Executive: Move $variation_age
5/9 ts-hosts-allocate-Executive: Do not prefer fast
6/9 ts-hosts-allocate-Executive: Clarify an
7/9 ts-hosts-allocate-Executive: Score for
8/9 ts-hosts-allocate-Executive: Redo
9/9 ts-hosts-allocate-Executive: Radically reduce
* [OSSTEST PATCH 1/9] cs-adjust-flight: Fix doc about /<pcre> to match implementation
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
cs-adjust-flight | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/cs-adjust-flight b/cs-adjust-flight
index 663ca6f..7ec17e3 100755
--- a/cs-adjust-flight
+++ b/cs-adjust-flight
@@ -18,7 +18,7 @@
# <foo-name>
# . means all jobs
# ^<pcre> means $foo =~ m/^<pcre>/
-# /<pcre>/ means $foo =~ m/<pcre>/
+# /<pcre> means $foo =~ m/<pcre>/
# This is part of "osstest", an automated testing framework for Xen.
# Copyright (C) 2009-2013 Citrix Inc.
--
1.7.10.4
* [OSSTEST PATCH 2/9] ts-hosts-allocate-Executive: allow uncompressed log
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
ts-hosts-allocate-Executive | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 0e9c193..73c1a45 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -29,12 +29,14 @@ tsreadconfig();
open DEBUG, ">/dev/null" or die $!;
+our $compressdebug=1;
+
while (@ARGV and $ARGV[0] =~ m/^-/) {
$_= shift @ARGV;
last if m/^--$/;
while (m/^-./) {
- if (0) {
- # no options
+ if (s/^-U/-/) {
+ $compressdebug=0;
} else {
die "$_ ?";
}
@@ -60,12 +62,16 @@ sub setup () {
$taskid= findtask();
- my $logbase = "hosts-allocate.debug.gz";
+ my $logbase = "hosts-allocate.debug".($compressdebug?".gz":"");
my $logfh = open_unique_stashfile \$logbase;
- my $logchild = open DEBUG, "|-"; defined $logchild or die $!;
- if (!$logchild) {
- open STDOUT, ">&", $logfh or die $!;
- exec "gzip" or die $!;
+ if ($compressdebug) {
+ my $logchild = open DEBUG, "|-"; defined $logchild or die $!;
+ if (!$logchild) {
+ open STDOUT, ">&", $logfh or die $!;
+ exec "gzip" or die $!;
+ }
+ } else {
+ open DEBUG, ">&", $logfh or die $!;
}
DEBUG->autoflush(1);
logm("host allocation debug log in $logbase");
--
1.7.10.4
* [OSSTEST PATCH 3/9] Osstest/Executive.pm: Debug log same-host status (in duration estimator)
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Osstest/Executive.pm | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index 2a3cc7c..3dc37d1 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -733,7 +733,8 @@ END
my ($duration) = $duration_duration_q->fetchrow_array();
$duration_duration_q->finish();
if ($duration) {
- $dbg->("REF $ref->{flight} DURATION $duration");
+ $dbg->("REF $ref->{flight} DURATION $duration ".
+ ($ref->{status} // ''));
$duration_max= $duration
if $duration > $duration_max;
}
--
1.7.10.4
* [OSSTEST PATCH 4/9] ts-hosts-allocate-Executive: Move $variation_age setting
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
We are going to want to put more stuff in here which depends on
$duration_rightaway_adjust.
No functional change in this commit.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
ts-hosts-allocate-Executive | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 73c1a45..67b8891 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -478,12 +478,6 @@ sub hid_recurse ($$) {
print DEBUG "$dbg EVAL DURATION $duration va=$variation_age\n";
- if ($jobinfo->{recipe} =~ m/build/) {
- $variation_age= 0;
- } elsif ($variation_age > 5*86400) {
- $variation_age= 5*86400;
- }
-
my @requestlist;
foreach my $hid (@hids) {
my $req= {
@@ -502,6 +496,12 @@ sub hid_recurse ($$) {
$duration_rightaway_adjust=0 if $start_time;
+ if ($jobinfo->{recipe} =~ m/build/) {
+ $variation_age= 0;
+ } elsif ($variation_age > 5*86400) {
+ $variation_age= 5*86400;
+ }
+
my $cost= $start_time
+ $duration
+ $duration_rightaway_adjust
--
1.7.10.4
* [OSSTEST PATCH 5/9] ts-hosts-allocate-Executive: Do not prefer fast hosts for tests
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Introduce $duration_for_cost and set it to the previous formula for
build jobs, or 0 for test jobs.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
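As a rough illustration of the commit message above (a sketch, not the osstest code itself): the helper name `duration_for_cost`, its flat argument list and the recipe names are hypothetical, chosen only to isolate the rule that build jobs are charged for their expected duration while test jobs are not.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the duration term that enters the cost.
# Build jobs keep the previous formula; test jobs contribute nothing,
# so a faster host gains no advantage for tests.
sub duration_for_cost {
    my ($recipe, $duration, $rightaway_adjust) = @_;
    return $recipe =~ m/build/ ? $duration + $rightaway_adjust : 0;
}

# Illustrative recipe names and durations (seconds).
printf "%s => %d\n", $_->[0], duration_for_cost(@$_)
    for (['build-amd64', 1800, 300], ['test-debian', 5400, 300]);
```

With the duration term at zero for tests, only the start time and the bonus terms distinguish candidate hosts.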
ts-hosts-allocate-Executive | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 67b8891..fc54cda 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -496,15 +496,16 @@ sub hid_recurse ($$) {
$duration_rightaway_adjust=0 if $start_time;
+ my $duration_for_cost = 0;
if ($jobinfo->{recipe} =~ m/build/) {
$variation_age= 0;
+ $duration_for_cost= $duration + $duration_rightaway_adjust;
} elsif ($variation_age > 5*86400) {
$variation_age= 5*86400;
}
my $cost= $start_time
- + $duration
- + $duration_rightaway_adjust
+ + $duration_for_cost
- $previously_failed * 366*86400
+ ($previously_failed ? + $variation_age * 10 : - $variation_age / 30)
- $share_reuse * 10000;
--
1.7.10.4
* [OSSTEST PATCH 6/9] ts-hosts-allocate-Executive: Clarify an expression with //
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
ts-hosts-allocate-Executive | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index fc54cda..9562a0a 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -464,8 +464,7 @@ sub hid_recurse ($$) {
!defined($duration) ||
defined($cand->{Duration}) && $cand->{Duration} >= $duration;
$previously_failed++ if
- defined $cand->{MostRecentStatus} &&
- $cand->{MostRecentStatus} eq 'fail';
+ ($cand->{MostRecentStatus} // '') eq 'fail';
}
my $duration_rightaway_adjust= 0;
--
1.7.10.4
* [OSSTEST PATCH 7/9] ts-hosts-allocate-Executive: Score for equivalent previous failures
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Look to see whether the last run on any hosts which are equivalent to
the ones we're looking at, failed. This means that when host X is
failing and we are considering host Y which is equivalent to X, we
give Y a selection bonus.
This means that osstest will be less obsessive about sticking to the
very same failing host.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
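The notion of "equivalent" hosts used by the query in this patch can be sketched in plain Perl. This is only an in-memory model under stated assumptions: the `@hostflags` rows, the host names and the helper `equivalent_hosts` are hypothetical stand-ins for the `hostflags` table and the nested `IN` subqueries.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hostflags table: [hostname, hostflag] rows.
my @hostflags = (
    ['host-a', 'equiv-1'],
    ['host-b', 'equiv-1'],
    ['host-c', 'equiv-2'],
    ['host-a', 'blessed-real'],
);

# Returns the sorted names of all hosts sharing at least one equiv-* flag
# with $hostname. Like the nested subqueries, this includes $hostname
# itself whenever it carries such a flag.
sub equivalent_hosts {
    my ($hostname) = @_;
    my %flags = map  { ($_->[1] => 1) }
                grep { $_->[0] eq $hostname && $_->[1] =~ m/^equiv-/ }
                @hostflags;
    my %hosts = map  { ($_->[0] => 1) }
                grep { $flags{ $_->[1] } }
                @hostflags;
    return sort keys %hosts;
}

print join(' ', equivalent_hosts('host-a')), "\n";
```

The real query then picks the most recently started matching job on any of those hosts and reports its status.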
ts-hosts-allocate-Executive | 45 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 43 insertions(+), 2 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 9562a0a..24f78d3 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -284,6 +284,29 @@ END
$findhostsq->execute("blessed-$fi->{intended}");
}
+ my $equivstatusq= $dbh_tests->prepare(<<END);
+ SELECT flight, job, val, status
+ FROM flights f
+ JOIN jobs j USING (flight)
+ JOIN runvars r USING (flight,job)
+ WHERE j.job=?
+ AND f.blessing=?
+ AND f.branch=?
+ AND r.name=?
+ AND r.val IN (
+ SELECT hostname
+ FROM hostflags
+ WHERE hostflag IN (
+ SELECT hostflag
+ FROM hostflags
+ WHERE hostname=?
+ AND hostflag LIKE 'equiv-%'
+ )
+ )
+ ORDER BY f.started DESC
+ LIMIT 1;
+END
+
my @candidates;
my $any=0;
@@ -342,6 +365,17 @@ END
find_recent_duration($dbg,$hid,$candrow);
+ if ($candrow->{restype} eq 'host') {
+ $equivstatusq->execute($job,$fi->{intended},$fi->{branch},
+ $hid->{Ident},$candrow->{resname});
+ my $esrow = $equivstatusq->fetchrow_hashref();
+ $candrow->{EquivMostRecentStatus} = $esrow->{status};
+ print DEBUG "$dbg EQUIV-MOST-RECENT ";
+ print DEBUG ("$esrow->{flight}.$esrow->{job}".
+ " $esrow->{val} $esrow->{status}") if $esrow;
+ print DEBUG ".\n";
+ }
+
foreach my $kcomb (qw(Shared-Max-Wear Shared-Max-Tasks)) {
my $kdb= $kcomb; $kdb =~ y/-A-Z/ a-z/;
my $khash= $kcomb; $khash =~ y/-//d;
@@ -362,6 +396,7 @@ END
print DEBUG "$dbg CANDIDATE.\n";
}
$findhostsq->finish();
+ $equivstatusq->finish();
if (!@candidates) {
if (defined $use) {
@@ -455,6 +490,7 @@ sub hid_recurse ($$) {
my $variation_age= 0;
my $duration= undef;
my $previously_failed = 0;
+ my $previously_failed_equiv = 0;
foreach my $hid (@hids) {
my $cand= $hid->{Selected};
my $recentstarted= $cand->{MostRecentStarted};
@@ -465,6 +501,8 @@ sub hid_recurse ($$) {
defined($cand->{Duration}) && $cand->{Duration} >= $duration;
$previously_failed++ if
($cand->{MostRecentStatus} // '') eq 'fail';
+ $previously_failed_equiv++ if
+ ($cand->{EquivMostRecentStatus} // '') eq 'fail';
}
my $duration_rightaway_adjust= 0;
@@ -505,12 +543,15 @@ sub hid_recurse ($$) {
my $cost= $start_time
+ $duration_for_cost
- - $previously_failed * 366*86400
+ - ($previously_failed ==@hids ? 366*86400 :
+ $previously_failed_equiv==@hids ? 365*86400 :
+ 0)
+ ($previously_failed ? + $variation_age * 10 : - $variation_age / 30)
- $share_reuse * 10000;
print DEBUG "$dbg FINAL start=$start_time va=$variation_age".
- " previously_failed=$previously_failed cost=$cost\n";
+ " previously_failed=$previously_failed".
+ " previously_failed_equiv=$previously_failed_equiv cost=$cost\n";
if (!defined $best || $cost < $best->{Cost}) {
print DEBUG "$dbg FINAL BEST: ".
--
1.7.10.4
* [OSSTEST PATCH 8/9] ts-hosts-allocate-Executive: Redo variation_bonus scoring
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Use a logarithmic scale. Cap the bonus at 12h rather than 5d/30 = 4h.
When we have previously failed, make sure we apply a reverse bonus,
rather than a penalty.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
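A minimal sketch of the logarithmic bonus described in the commit message, for trying a few ages by hand. The function name `variation_bonus` and the choice to pass the cap as a parameter are assumptions of this sketch; the formula itself follows the diff below.

```perl
use strict;
use warnings;

# Sketch of the logarithmic bonus. $cap is a parameter here (an assumption
# of this sketch) so that different cap values can be compared.
sub variation_bonus {
    my ($variation_age, $cap) = @_;    # both in seconds
    my $bonus = log(1 + $variation_age / 86400) * 3600 * 2;
    return $bonus > $cap ? $cap : $bonus;
}

# Bonus for a few ages, using the 12h cap stated in the commit message.
for my $days (0, 1, 5, 30) {
    printf "%2d days since last run: bonus %.0f s\n",
        $days, variation_bonus($days * 86400, 12 * 3600);
}
```

Because the scale is logarithmic, the bonus grows quickly over the first day or two and only slowly afterwards.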
ts-hosts-allocate-Executive | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 24f78d3..590fe98 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -537,19 +537,25 @@ sub hid_recurse ($$) {
if ($jobinfo->{recipe} =~ m/build/) {
$variation_age= 0;
$duration_for_cost= $duration + $duration_rightaway_adjust;
- } elsif ($variation_age > 5*86400) {
- $variation_age= 5*86400;
}
+ my $log_variation_age = log(1+$variation_age/86400);
+ my $variation_bonus = $log_variation_age * 3600*2;
+ my $max_variation_bonus = 12*86400;
+ $variation_bonus=$max_variation_bonus
+ if $variation_bonus>$max_variation_bonus;
+
my $cost= $start_time
+ $duration_for_cost
- ($previously_failed ==@hids ? 366*86400 :
$previously_failed_equiv==@hids ? 365*86400 :
0)
- + ($previously_failed ? + $variation_age * 10 : - $variation_age / 30)
+ + ($previously_failed || $previously_failed_equiv
+ ? (-$max_variation_bonus+$variation_bonus) : -$variation_bonus)
- $share_reuse * 10000;
print DEBUG "$dbg FINAL start=$start_time va=$variation_age".
+ " variation_bonus=$variation_bonus".
" previously_failed=$previously_failed".
" previously_failed_equiv=$previously_failed_equiv cost=$cost\n";
--
1.7.10.4
* [OSSTEST PATCH 9/9] ts-hosts-allocate-Executive: Radically reduce the previously_failed bonus
From: Ian Jackson @ 2014-11-11 19:41 UTC (permalink / raw)
To: xen-devel; +Cc: Ian Jackson, Ian Campbell
Make osstest less obsessive about sticking to failing hosts if they
are persistently unavailable.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
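The magnitudes introduced by this patch can be checked with a few lines of arithmetic. The variable names are illustrative only; the constants are the ones in the diff below.

```perl
use strict;
use warnings;

my $same_host_bonus  = 7   * 86400;   # every host failed last time
my $equiv_host_bonus = 6.5 * 86400;   # every equivalent host failed last time

printf "same-host bonus:  %d s\n", $same_host_bonus;
printf "equiv-host bonus: %d s\n", $equiv_host_bonus;
printf "difference:       %g h\n",
    ($same_host_bonus - $equiv_host_bonus) / 3600;
```

The difference between the two constants is the extra time the allocator is prepared to wait for exactly the same hardware rather than an equivalent host.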
ts-hosts-allocate-Executive | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 590fe98..1378f25 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -547,8 +547,12 @@ sub hid_recurse ($$) {
my $cost= $start_time
+ $duration_for_cost
- - ($previously_failed ==@hids ? 366*86400 :
- $previously_failed_equiv==@hids ? 365*86400 :
+ - ($previously_failed ==@hids ? 7*86400 :
+ $previously_failed_equiv==@hids ? 6.5*86400 :
+ # We wait 7d extra to try a failing test on the same
+ # hardware, or 6.5d on `equivalent' hardware (as defined by
+ # equiv-* flags). Compared to `equivalent' hardware, we
+ # wait 12h to try it on exactly the same.
0)
+ ($previously_failed || $previously_failed_equiv
? (-$max_variation_bonus+$variation_bonus) : -$variation_bonus)
--
1.7.10.4
* Re: [OSSTEST PATCH 0/9] Host allocation scoring improvements
From: Ian Campbell @ 2014-11-12 10:36 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> Here, I'm trying to fix the way that osstest gets far too obsessed
> about particular failing hosts.
All fine by me, not that I've really grokked the bits towards the end.
(I will try to if you want)
>
> 1/9 cs-adjust-flight: Fix doc about /<pcre> to match
> 2/9 ts-hosts-allocate-Executive: allow uncompressed
> 3/9 Osstest/Executive.pm: Debug log same-host status
> 4/9 ts-hosts-allocate-Executive: Move $variation_age
> 5/9 ts-hosts-allocate-Executive: Do not prefer fast
> 6/9 ts-hosts-allocate-Executive: Clarify an
> 7/9 ts-hosts-allocate-Executive: Score for
> 8/9 ts-hosts-allocate-Executive: Redo
> 9/9 ts-hosts-allocate-Executive: Radically reduce
>
* Re: [OSSTEST PATCH 0/9] Host allocation scoring improvements
From: Ian Jackson @ 2014-11-13 16:16 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
Ian Campbell writes ("Re: [OSSTEST PATCH 0/9] Host allocation scoring improvements"):
> On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> > Here, I'm trying to fix the way that osstest gets far too obsessed
> > about particular failing hosts.
>
> All fine by me, not that I've really grokked the bits towards the end.
> (I will try to if you want)
I find this area a bit confusing, and it's a right bastard to test, so
if you have the bandwidth for a proper review that would be great.
Thanks,
Ian.
* Re: [OSSTEST PATCH 0/9] Host allocation scoring improvements
From: Ian Campbell @ 2014-11-14 16:37 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
On Thu, 2014-11-13 at 16:16 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [OSSTEST PATCH 0/9] Host allocation scoring improvements"):
> > On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> > > Here, I'm trying to fix the way that osstest gets far too obsessed
> > > about particular failing hosts.
> >
> > All fine by me, not that I've really grokked the bits towards the end.
> > (I will try to if you want)
>
> I find this area a bit confusing, and it's a right bastard to test, so
> if you have the bandwidth for a proper review that would be great.
These:
[OSSTEST PATCH 1/9] cs-adjust-flight: Fix doc about /<pcre> to match implementation
[OSSTEST PATCH 2/9] ts-hosts-allocate-Executive: allow uncompressed log
[OSSTEST PATCH 3/9] Osstest/Executive.pm: Debug log same-host status (in duration estimator)
[OSSTEST PATCH 4/9] ts-hosts-allocate-Executive: Move $variation_age setting
[OSSTEST PATCH 5/9] ts-hosts-allocate-Executive: Do not prefer fast hosts for tests
[OSSTEST PATCH 6/9] ts-hosts-allocate-Executive: Clarify an expression with //
Are:
Acked-by: Ian Campbell <ian.campbell@citrix.com>
I'll try and grok the rest now and reply to them individually.
Ian.
* Re: [OSSTEST PATCH 7/9] ts-hosts-allocate-Executive: Score for equivalent previous failures
From: Ian Campbell @ 2014-11-14 16:59 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> Look to see whether the last run on any hosts which are equivalent to
> the ones we're looking at, failed. This means that when host X is
> failing and we are considering host Y which is equivalent to X, we
> give Y a selection bonus.
>
> This means that osstest will be less obsessive about sticking to the
> very same failing host.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> ---
> ts-hosts-allocate-Executive | 45 +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
> index 9562a0a..24f78d3 100755
> --- a/ts-hosts-allocate-Executive
> +++ b/ts-hosts-allocate-Executive
> @@ -284,6 +284,29 @@ END
> $findhostsq->execute("blessed-$fi->{intended}");
> }
>
> + my $equivstatusq= $dbh_tests->prepare(<<END);
> + SELECT flight, job, val, status
> + FROM flights f
> + JOIN jobs j USING (flight)
> + JOIN runvars r USING (flight,job)
> + WHERE j.job=?
> + AND f.blessing=?
> + AND f.branch=?
> + AND r.name=?
> + AND r.val IN (
> + SELECT hostname
> + FROM hostflags
> + WHERE hostflag IN (
> + SELECT hostflag
> + FROM hostflags
> + WHERE hostname=?
> + AND hostflag LIKE 'equiv-%'
> + )
> + )
> + ORDER BY f.started DESC
> + LIMIT 1;
> +END
> +
> my @candidates;
> my $any=0;
>
> @@ -342,6 +365,17 @@ END
>
> find_recent_duration($dbg,$hid,$candrow);
>
> + if ($candrow->{restype} eq 'host') {
> + $equivstatusq->execute($job,$fi->{intended},$fi->{branch},
> + $hid->{Ident},$candrow->{resname});
> + my $esrow = $equivstatusq->fetchrow_hashref();
For the first flight on a new branch (or perhaps a new blessing), this
will return an undef, because there is no previous flight to match,
won't it?
http://search.cpan.org/~timb/DBI-1.632/DBI.pm#fetchrow_hashref says if
you get an undef you should check $equivstatusq->err to see if that was
due to an error vs. empty result set. Not sure if you'll care given this
is all heuristics though.
> + $candrow->{EquivMostRecentStatus} = $esrow->{status};
Meaning this will fail, or perhaps just produce a warning.
> + print DEBUG "$dbg EQUIV-MOST-RECENT ";
> + print DEBUG ("$esrow->{flight}.$esrow->{job}".
> + " $esrow->{val} $esrow->{status}") if $esrow;
> + print DEBUG ".\n";
And so will these?
> + }
> +
> foreach my $kcomb (qw(Shared-Max-Wear Shared-Max-Tasks)) {
> my $kdb= $kcomb; $kdb =~ y/-A-Z/ a-z/;
> my $khash= $kcomb; $khash =~ y/-//d;
> @@ -362,6 +396,7 @@ END
> print DEBUG "$dbg CANDIDATE.\n";
> }
> $findhostsq->finish();
> + $equivstatusq->finish();
>
> if (!@candidates) {
> if (defined $use) {
> @@ -455,6 +490,7 @@ sub hid_recurse ($$) {
> my $variation_age= 0;
> my $duration= undef;
> my $previously_failed = 0;
> + my $previously_failed_equiv = 0;
> foreach my $hid (@hids) {
> my $cand= $hid->{Selected};
> my $recentstarted= $cand->{MostRecentStarted};
> @@ -465,6 +501,8 @@ sub hid_recurse ($$) {
> defined($cand->{Duration}) && $cand->{Duration} >= $duration;
> $previously_failed++ if
> ($cand->{MostRecentStatus} // '') eq 'fail';
> + $previously_failed_equiv++ if
> + ($cand->{EquivMostRecentStatus} // '') eq 'fail';
> }
> my $duration_rightaway_adjust= 0;
>
> @@ -505,12 +543,15 @@ sub hid_recurse ($$) {
>
> my $cost= $start_time
> + $duration_for_cost
> - - $previously_failed * 366*86400
> + - ($previously_failed ==@hids ? 366*86400 :
> + $previously_failed_equiv==@hids ? 365*86400 :
> + 0)
You've dropped the behaviour of multiplying 366*86400 by
$previously_failed, was that intentional?
I think you've also changed this to give a bonus only if all @hids
previously failed, rather than if at least one of them did.
> + ($previously_failed ? + $variation_age * 10 : - $variation_age / 30)
> - $share_reuse * 10000;
>
> print DEBUG "$dbg FINAL start=$start_time va=$variation_age".
> - " previously_failed=$previously_failed cost=$cost\n";
> + " previously_failed=$previously_failed".
> + " previously_failed_equiv=$previously_failed_equiv cost=$cost\n";
>
> if (!defined $best || $cost < $best->{Cost}) {
> print DEBUG "$dbg FINAL BEST: ".
* Re: [OSSTEST PATCH 8/9] ts-hosts-allocate-Executive: Redo variation_bonus scoring
From: Ian Campbell @ 2014-11-14 17:07 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> Use a logarithmic scale. Cap the bonus at 12h rather than 5d/30 = 4h.
> When we have previously failed, make sure we apply a reverse bonus,
> rather than a penalty.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> ---
> ts-hosts-allocate-Executive | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
> index 24f78d3..590fe98 100755
> --- a/ts-hosts-allocate-Executive
> +++ b/ts-hosts-allocate-Executive
> @@ -537,19 +537,25 @@ sub hid_recurse ($$) {
> if ($jobinfo->{recipe} =~ m/build/) {
> $variation_age= 0;
> $duration_for_cost= $duration + $duration_rightaway_adjust;
> - } elsif ($variation_age > 5*86400) {
> - $variation_age= 5*86400;
> }
>
> + my $log_variation_age = log(1+$variation_age/86400);
> + my $variation_bonus = $log_variation_age * 3600*2;
> + my $max_variation_bonus = 12*86400;
Isn't that 12 days, rather than the 12 hours in the commit log?
Or are the units here something other than seconds? (In which case I'm
very confused by what time() returns...)
> + $variation_bonus=$max_variation_bonus
> + if $variation_bonus>$max_variation_bonus;
> +
> my $cost= $start_time
> + $duration_for_cost
> - ($previously_failed ==@hids ? 366*86400 :
> $previously_failed_equiv==@hids ? 365*86400 :
> 0)
> - + ($previously_failed ? + $variation_age * 10 : - $variation_age / 30)
> + + ($previously_failed || $previously_failed_equiv
> + ? (-$max_variation_bonus+$variation_bonus) : -$variation_bonus)
> - $share_reuse * 10000;
>
> print DEBUG "$dbg FINAL start=$start_time va=$variation_age".
> + " variation_bonus=$variation_bonus".
> " previously_failed=$previously_failed".
> " previously_failed_equiv=$previously_failed_equiv cost=$cost\n";
>
* Re: [OSSTEST PATCH 7/9] ts-hosts-allocate-Executive: Score for equivalent previous failures
From: Ian Jackson @ 2014-11-14 17:10 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
Ian Campbell writes ("Re: [OSSTEST PATCH 7/9] ts-hosts-allocate-Executive: Score for equivalent previous failures"):
> On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> > + if ($candrow->{restype} eq 'host') {
> > + $equivstatusq->execute($job,$fi->{intended},$fi->{branch},
> > + $hid->{Ident},$candrow->{resname});
> > + my $esrow = $equivstatusq->fetchrow_hashref();
>
> For the first flight on a new branch (or perhaps a new blessing), this
> will return an undef, because there is no previous flight to match,
> won't it?
Yes.
> http://search.cpan.org/~timb/DBI-1.632/DBI.pm#fetchrow_hashref says if
> you get an undef you should check $equivstatusq->err to see if that was
> due to an error vs. empty result set. Not sure if you'll care given this
> is all heuristics though.
We turn on the automatic error trapping during db setup, so errors
helpfully turn into die.
> > + $candrow->{EquivMostRecentStatus} = $esrow->{status};
>
> Meaning this will fail, or perhaps just produce a warning.
$ perl -MData::Dumper -we 'use strict; my $y; print Dumper($y->{foo});'
$VAR1 = undef;
$
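The one-liner above can be extended into a small sketch of the whole path for the "no previous flight" case. The variable names mirror the patch, but the code is a standalone model rather than the osstest code, and the warning counter is an addition of this sketch.

```perl
use strict;
use warnings;

# Collect any warnings so that the sketch can report how many were raised.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my $esrow;      # models fetchrow_hashref returning undef (no matching row)
my %candrow;
$candrow{EquivMostRecentStatus} = $esrow->{status};

# Same test as in hid_recurse: an undefined status is not a failure.
my $failed = (($candrow{EquivMostRecentStatus} // '') eq 'fail') ? 1 : 0;

printf "status defined: %s, counted as failure: %d, warnings: %d\n",
    defined $candrow{EquivMostRecentStatus} ? 'yes' : 'no',
    $failed, scalar @warnings;
```

The `// ''` in the comparison is what keeps the later `eq` from warning about an undefined value.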
> > + print DEBUG "$dbg EQUIV-MOST-RECENT ";
> > + print DEBUG ("$esrow->{flight}.$esrow->{job}".
> > + " $esrow->{val} $esrow->{status}") if $esrow;
> > + print DEBUG ".\n";
>
> And so will these?
if $esrow;
> > @@ -505,12 +543,15 @@ sub hid_recurse ($$) {
> >
> > my $cost= $start_time
> > + $duration_for_cost
> > - - $previously_failed * 366*86400
> > + - ($previously_failed ==@hids ? 366*86400 :
> > + $previously_failed_equiv==@hids ? 365*86400 :
> > + 0)
>
> You've dropped the behaviour of multiplying 366*86400 by
> $previously_failed, was that intentional?
Yes. $previously_failed was the number of candidate hosts which had
previous failures. Making the offset proportional to the number of
hosts in the test is daft.
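To make the difference concrete, here is a small comparison of the two rules. The helper names `old_bonus` and `new_bonus` and the flat argument lists are hypothetical; the constants are those of patch 7.

```perl
use strict;
use warnings;

my $DAY = 86400;

# Old rule: the offset grows with the number of hosts that failed.
sub old_bonus {
    my ($failed) = @_;
    return $failed * 366 * $DAY;
}

# New rule (patch 7): a fixed offset, granted only when every host in the
# job failed last time, on the same or on equivalent hardware.
sub new_bonus {
    my ($failed, $failed_equiv, $nhosts) = @_;
    return $failed       == $nhosts ? 366 * $DAY
         : $failed_equiv == $nhosts ? 365 * $DAY
         :                            0;
}

# A two-host job where both hosts failed, versus a one-host job.
printf "old: %d vs %d days\n", old_bonus(2) / $DAY, old_bonus(1) / $DAY;
printf "new: %d vs %d days\n",
    new_bonus(2, 0, 2) / $DAY, new_bonus(1, 0, 1) / $DAY;
```

Under the old rule a two-host job gets twice the offset of a one-host job for the same kind of failure; under the new rule both get the same fixed offset.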
> I think you've also changed this to give a bonus only if all @hids
> previously failed, rather than if at least one of them did.
Yes.
Should I explain this more clearly in the commit message?
Ian.
* Re: [OSSTEST PATCH 8/9] ts-hosts-allocate-Executive: Redo variation_bonus scoring
From: Ian Jackson @ 2014-11-14 17:24 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel
Ian Campbell writes ("Re: [OSSTEST PATCH 8/9] ts-hosts-allocate-Executive: Redo variation_bonus scoring"):
> On Tue, 2014-11-11 at 19:41 +0000, Ian Jackson wrote:
> > + my $log_variation_age = log(1+$variation_age/86400);
> > + my $variation_bonus = $log_variation_age * 3600*2;
> > + my $max_variation_bonus = 12*86400;
>
> Isn't that 12 days, rather than the 12 hours in the commit log?
Well spotted. I made that mistake deliberately, honest.
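A quick unit check of the constant queried above. The variable names are illustrative, and since the thread does not show a corrected patch, `12*3600` is only the value implied by the commit log, not a confirmed fix.

```perl
use strict;
use warnings;

my $cap_in_patch      = 12 * 86400;   # 86400 seconds per day
my $cap_in_commit_log = 12 * 3600;    # 3600 seconds per hour

printf "12*86400 = %d s = %g days\n",
    $cap_in_patch, $cap_in_patch / 86400;
printf "12*3600  = %d s = %g hours\n",
    $cap_in_commit_log, $cap_in_commit_log / 3600;
```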
Thanks for the review!
Ian.