Re: [XTF PATCH] xtf-runner: fix two synchronisation issues

From: Wei Liu <wei.liu2@citrix.com>
To: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [XTF PATCH] xtf-runner: fix two synchronisation issues
Date: Fri, 29 Jul 2016 15:55:49 +0100
Message-ID: <20160729145549.GJ22419@citrix.com>
In-Reply-To: <22427.26818.981068.78463@mariner.uk.xensource.com>

On Fri, Jul 29, 2016 at 03:31:30PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [XTF PATCH] xtf-runner: fix two synchronisation issues"):
> > On Fri, Jul 29, 2016 at 01:43:42PM +0100, Andrew Cooper wrote:
> > > The runner exiting before xl has torn down the guest is very
> > > deliberate, because some part of hvm guests is terribly slow to tear
> > > down; waiting synchronously for teardown tripled the wallclock time to
> > > run a load of tests back-to-back.
> > 
> > Then you won't know if a guest is leaked or it is being slowly destroyed
> > when a dead guest shows up in the snapshot of 'xl list'.
> > 
> > Also consider that this would make back-to-back tests fail if one
> > happens to use the same guest name as the previous test.
> > 
> > I don't think getting blocked for a few more seconds is a big issue.
> > It is important to eliminate such race conditions so that osstest can
> > work properly.
> 
> IMO the biggest reason for waiting for teardown is that that will make
> it possible to accurately identify the xtf test which was responsible
> for the failure if a test reveals a bug which causes problems for the
> whole host.
> 
> Suppose there is a test T1 which, in buggy hypervisors, creates an
> anomalous data structure, such that the hypervisor crashes when T1's
> guest is finally torn down.
> 
> If we start to run the next test T2 immediately we see success output
> from T1, we will observe the host crashing "due to T2", and T1 would
> be regarded as having succeeded.
> 
> This is why in an in-person conversation with Wei yesterday I
> recommended that osstest should after each xtf test (i) wait for
> everything to be torn down and (ii) then check that the dom0 is still
> up.  (And these two activities are regarded as part of the preceding
> test step.)
> 
> If this leads to over-consumption of machine resources because this
> serialisation is too slow then the right approach would be explicit
> parallelisation in osstest.  That would still mean that in the
> scenario above, T1 would be regarded as having failed, because T1
> wouldn't be regarded as having passed until osstest had seen that all
> of T1's cleanup had been done and the host was still up.  (T2 would
> _also_ be regarded as failed, and that might look like a heisenbug,
> but that would be tolerable.)
> 
> Wei: I need to check what happens with multiple failing test steps in
> the same job.  Specifically, I need to check which one the bisector
> is likely to try to attack.
> 

Yes. I think my current code can meet both your and Andrew's
requirements.

1. The runner waits for all tests to finish, which amortises the cleanup
   time. This is what Andrew needs.
2. In osstest, we run one test case at a time. So "all tests" is only
   one test. This is what you need.
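
For illustration only, here is a minimal sketch of the ordering Ian
describes: run one test, wait for its guest to disappear, then check the
host, and count all of that as part of the same test step. This is not
the actual xtf-runner or osstest code. The names wait_for_teardown and
run_step, and the list_domains / host_alive callables, are hypothetical;
in practice list_domains might parse the output of 'xl list', which is
assumed here rather than shown.

```python
import time


def wait_for_teardown(name, list_domains, timeout=30.0, interval=0.5,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until `name` no longer appears in list_domains().

    Returns True if the guest disappeared before `timeout` seconds,
    False otherwise. `list_domains` is a hypothetical callable that
    returns the names of currently existing domains.
    """
    deadline = clock() + timeout
    while True:
        if name not in set(list_domains()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def run_step(name, run_test, list_domains, host_alive,
             timeout=30.0, interval=0.5):
    """Run one test; teardown and the host check belong to the same step.

    `run_test` returns the test's own result string. The step only
    reports that result once the guest is gone and the host is still up.
    """
    result = run_test()
    if not wait_for_teardown(name, list_domains, timeout, interval):
        return "FAILURE: guest %s not torn down" % name
    if not host_alive():
        return "FAILURE: host down after %s" % name
    return result
```

With this shape, a T1 that reports success but breaks the host during
teardown is reported as a failure of T1's step, not of T2's.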

Wei.

> Ian.
