From: Ian Jackson <ian.jackson@eu.citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>
Subject: [OSSTEST PATCH 28/28] Executive: Delay releasing build host shares by 90s
Date: Tue, 22 Sep 2015 16:12:44 +0100
Message-ID: <1442934764-8672-8-git-send-email-ian.jackson@eu.citrix.com>
In-Reply-To: <1442934764-8672-1-git-send-email-ian.jackson@eu.citrix.com>

When a build job finishes, the same flight may well want to do a
subsequent build that depended on the first.  When this happens, we
have a race:

On the one hand, we have the flight: after sg-run-job exits,
sg-execute-flight needs to double-check the job status, and search the
flight for more jobs to run; it will spawn ts-hosts-allocate-Executive
for the new job, which needs to get its head together, parse its
arguments, become a client of the queue daemon, and ask to be put in
the queue.

On the other hand, we have the planning system: currently, as soon as
sg-run-job exits, the connection to the ownerdaemon closes.  The
ownerdaemon tells the queue daemon, and the planning queue is
restarted.  It might even happen that coincidentally the planning
queue is about to start.

If the planning system wins the race, another job will pick up the
newly-freed resource.  Often this will mean unsharing the build host,
which is very wasteful if the releasing flight hasn't finished its
builds for that architecture: it means that the next build job needs
to regroove a host for builds.

Add a bodge to try to make the race go the other way: after a build
job completes successfully, do not give up the share for a further 90
seconds.  (We have to use setsid because sg-execute-flight kills the
process group to clean up stray processes, which this sleep definitely
is.)

A better solution would be to move the wait-for-referenced-job logic
from sg-execute-flight to ts-hosts-allocate-*.  But that would be much
more complicated.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v4: New patch
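
Illustration only, not part of the patch.  The commit message above
relies on a detached `sleep' keeping the owner daemon connection open
after sg-run-job exits.  The sketch below shows that general mechanism
in Python, assuming the connection is an ordinary file descriptor that
the child inherits.  The names `daemon', `client', `eof_seen' and
`demo' are hypothetical; a socketpair stands in for the real
connection and none of this is osstest code.

```python
import socket
import subprocess


def eof_seen(sock, timeout):
    """Return True if the peer end is fully closed, False on timeout."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1) == b""
    except socket.timeout:
        return False


def demo(hold_seconds=1):
    # Hypothetical stand-ins for the two ends of the owner daemon
    # connection.
    daemon, client = socket.socketpair()

    # Rough analogue of `exec setsid sleep N ... &`: a detached sleep,
    # in its own session, that inherits the client end.
    holder = subprocess.Popen(
        ["sleep", str(hold_seconds)],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        pass_fds=(client.fileno(),),
        start_new_session=True,
    )

    # The "job" goes away: the parent drops its own copy of the
    # descriptor.  The sleep still holds one.
    client.close()

    closed_while_sleeping = eof_seen(daemon, timeout=0.2)

    # Once the sleep exits, the last copy is gone and the daemon end
    # sees end-of-file.
    holder.wait()
    closed_after_sleep = eof_seen(daemon, timeout=1.0)
    daemon.close()
    return closed_while_sleeping, closed_after_sleep
```

Calling demo(1) is expected to report that the daemon end does not see
the connection close while the sleep is running, and does see it close
once the sleep has exited.  start_new_session plays the role of setsid
here, so that killing the caller's process group leaves the sleep
alone.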
---
 sg-run-job               |    2 ++
 tcl/JobDB-Executive.tcl  |    6 ++++++
 tcl/JobDB-Standalone.tcl |    1 +
 3 files changed, 9 insertions(+)

diff --git a/sg-run-job b/sg-run-job
index c51a508..66145b8 100755
--- a/sg-run-job
+++ b/sg-run-job
@@ -71,6 +71,8 @@ proc run-job {job} {
 
     if {$ok} { setstatus pass                                             }
 
+    if {$need_build_host && $ok} { jobdb::preserve-task 90 }
+
     if {$anyfailed} {
         jobdb::logputs stdout "at least one test failed"
     }
diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index d61d2a2..f37bbaf 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -280,6 +280,12 @@ proc become-task {comment} {
     }
 }
 
+proc preserve-task {seconds} {
+    # This keeps the owner daemon connection open: our `sleep'
+    # will continue to own our resources for $seconds longer
+    exec setsid sleep $seconds > /dev/null < /dev/null 2> /dev/null &
+}
+
 proc step-log-filename {flight job stepno ts} {
     global c
     set logdir $c(Logs)/$flight/$job
diff --git a/tcl/JobDB-Standalone.tcl b/tcl/JobDB-Standalone.tcl
index a2b8dd9..d7d8422 100644
--- a/tcl/JobDB-Standalone.tcl
+++ b/tcl/JobDB-Standalone.tcl
@@ -74,6 +74,7 @@ proc step-set-status {flight job stepno st} {
 }
 
 proc become-task {argv} { }
+proc preserve-task {argv} { }
 
 proc step-log-filename {flight job stepno ts} {
     return {}
-- 
1.7.10.4
