[xen-4.6-testing test] 65112: regressions

* [xen-4.6-testing test] 65112: regressions - FAIL
@ 2015-11-26 17:27 osstest service owner
  2015-11-27  8:18 ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: osstest service owner @ 2015-11-26 17:27 UTC (permalink / raw)
  To: xen-devel, osstest-admin

flight 65112 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65112/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386                    5 xen-build        fail in 65062 REGR. vs. 63449
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds     11 guest-start        fail in 65062 pass in 65112
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate fail pass in 65062
 test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail pass in 65088
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate fail pass in 65088

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-qemuu-nested 3 host-install(3) broken in 65062 baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65062 blocked in 63449
 test-amd64-amd64-qemuu-nested 16 debian-hvm-install/l1/l2 fail in 65088 baseline untested

Tests which did not succeed, but are not blocking:
 build-i386-rumpuserxen        1 build-check(1)            blocked in 65062 n/a
 build-i386-libvirt            1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1)    blocked in 65062 n/a
 test-amd64-i386-migrupgrade   1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-xl            1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-libvirt       1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)      blocked in 65062 n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked in 65062 n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)        blocked in 65062 n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)      blocked in 65062 n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked in 65062 n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked in 65062 n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)     blocked in 65062 n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1)    blocked in 65062 n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)     blocked in 65062 n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)         blocked in 65062 n/a
 test-amd64-i386-pair          1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)          blocked in 65062 n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)     blocked in 65062 n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked in 65062 n/a
 test-amd64-i386-xl-raw        1 build-check(1)            blocked in 65062 n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 1 build-check(1) blocked in 65062 n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)       blocked in 65062 n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)       blocked in 65062 n/a
 test-armhf-armhf-libvirt-raw  9 debian-di-install            fail   never pass
 test-armhf-armhf-xl-vhd       9 debian-di-install            fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start                  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop             fail never pass
 test-armhf-armhf-xl-rtds     13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     16 guest-start/debian.repeat    fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore            fail   never pass
 test-armhf-armhf-xl-xsm      13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check    fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check        fail  never pass
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install            fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     14 guest-saverestore            fail   never pass
 test-armhf-armhf-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start                  fail  never pass
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check        fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop             fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop              fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop              fail never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen                  78833c04250416f1870c458309d3ac0e5cf915fd
baseline version:
 xen                  40d7a7454835c2f7c639c78f6c09e7b6f0e4a4e2

Last test of basis    63449  2015-11-01 10:09:20 Z   25 days
Failing since         64055  2015-11-10 11:39:11 Z   16 days   12 attempts
Testing same since    64935  2015-11-20 02:51:37 Z    6 days    6 attempts

------------------------------------------------------------
People who touched revisions under test:
  Andrew Cooper <andrew.cooper3@citrix.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Kevin Tian <kevin.tian@intel.com>

jobs:
 build-amd64-xsm                                              pass    
 build-armhf-xsm                                              pass    
 build-i386-xsm                                               pass    
 build-amd64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-armhf-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-prev                                             pass    
 build-i386-prev                                              pass    
 build-amd64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 build-amd64-rumpuserxen                                      pass    
 build-i386-rumpuserxen                                       pass    
 test-amd64-amd64-xl                                          pass    
 test-armhf-armhf-xl                                          pass    
 test-amd64-i386-xl                                           pass    
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm                pass    
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm                 pass    
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm           pass    
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm            pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm                pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm                 pass    
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm        fail    
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm         fail    
 test-amd64-amd64-libvirt-xsm                                 pass    
 test-armhf-armhf-libvirt-xsm                                 fail    
 test-amd64-i386-libvirt-xsm                                  pass    
 test-amd64-amd64-xl-xsm                                      pass    
 test-armhf-armhf-xl-xsm                                      pass    
 test-amd64-i386-xl-xsm                                       pass    
 test-amd64-amd64-qemuu-nested-amd                            fail    
 test-amd64-amd64-xl-pvh-amd                                  fail    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-amd64-xl-qemut-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemut-debianhvm-amd64                     pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass    
 test-amd64-i386-freebsd10-amd64                              pass    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    
 test-amd64-amd64-rumpuserxen-amd64                           pass    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-i386-xl-qemuu-win7-amd64                          fail    
 test-armhf-armhf-xl-arndale                                  pass    
 test-amd64-amd64-xl-credit2                                  pass    
 test-armhf-armhf-xl-credit2                                  pass    
 test-armhf-armhf-xl-cubietruck                               pass    
 test-amd64-i386-freebsd10-i386                               pass    
 test-amd64-i386-rumpuserxen-i386                             fail    
 test-amd64-amd64-qemuu-nested-intel                          pass    
 test-amd64-amd64-xl-pvh-intel                                fail    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-amd64-libvirt                                     pass    
 test-armhf-armhf-libvirt                                     fail    
 test-amd64-i386-libvirt                                      pass    
 test-amd64-amd64-migrupgrade                                 pass    
 test-amd64-i386-migrupgrade                                  pass    
 test-amd64-amd64-xl-multivcpu                                pass    
 test-armhf-armhf-xl-multivcpu                                pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         pass    
 test-amd64-amd64-libvirt-pair                                pass    
 test-amd64-i386-libvirt-pair                                 pass    
 test-amd64-amd64-amd64-pvgrub                                pass    
 test-amd64-amd64-i386-pvgrub                                 pass    
 test-amd64-amd64-pygrub                                      pass    
 test-armhf-armhf-libvirt-qcow2                               fail    
 test-amd64-amd64-xl-qcow2                                    pass    
 test-armhf-armhf-libvirt-raw                                 fail    
 test-amd64-i386-xl-raw                                       pass    
 test-amd64-amd64-xl-rtds                                     pass    
 test-armhf-armhf-xl-rtds                                     fail    
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1                     pass    
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1                     pass    
 test-amd64-amd64-libvirt-vhd                                 pass    
 test-armhf-armhf-xl-vhd                                      fail    
 test-amd64-amd64-xl-qemut-winxpsp3                           pass    
 test-amd64-i386-xl-qemut-winxpsp3                            pass    
 test-amd64-amd64-xl-qemuu-winxpsp3                           pass    
 test-amd64-i386-xl-qemuu-winxpsp3                            pass    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

------------------------------------------------------------
commit 78833c04250416f1870c458309d3ac0e5cf915fd
Author: Ian Campbell <ian.campbell@citrix.com>
Date:   Thu Sep 10 14:31:34 2015 +0100

    Config: Switch to unified qemu trees.
    
    Upstream qemu is now in qemu-xen.git and the trad fork is in
    qemu-xen-traditional.git.
    
    QEMU_UPSTREAM_REVISION is currently a tag and
    QEMU_TRADITIONAL_REVISION is a specific revision, so no changes are
    required to those.
    
    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
    
    Conflicts:
    	Config.mk
    Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

commit e3b0c81ba143939282d99d7cdc041f95bae9c917
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Nov 10 12:16:51 2015 +0100

    x86/HVM: always intercept #AC and #DB
    
    Both being benign exceptions, and both being possible to get triggered
    by exception delivery, this is required to prevent a guest from locking
    up a CPU (resulting from no other VM exits occurring once getting into
    such a loop).
    
    The specific scenarios:
    
    1) #AC may be raised during exception delivery if the handler is set to
    be a ring-3 one by a 32-bit guest, and the stack is misaligned.
    
    This is CVE-2015-5307 / XSA-156.
    
    Reported-by: Benjamin Serebrin <serebrin@google.com>
    
    2) #DB may be raised during exception delivery when a breakpoint got
    placed on a data structure involved in delivering the exception. This
    can result in an endless loop when a 64-bit guest uses a non-zero IST
    for the vector 1 IDT entry, but even without use of IST the time it
    takes until a contributory fault would get raised (results depending
    on the handler) may be quite long.
    
    This is CVE-2015-8104 / XSA-156.
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
    master commit: bd2239d9fa975a1ee5bcd27c218ae042cd0a57bc
    master date: 2015-11-10 12:03:08 +0100

commit a01d1c7ce27c21e31944ae34fd45a4581c202701
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date:   Tue Nov 10 12:16:11 2015 +0100

    x86/vmx: improvements to vmentry failure handling
    
    Combine the almost identical vm_launch_fail() and vm_resume_fail() into a
    single vmx_vmentry_failure().
    
    Re-save all GPRs so that domain_crash() prints the real register values,
    rather than the stack frame of the vmx_vmentry_failure() call.
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Acked-by: Kevin Tian <kevin.tian@intel.com>
    master commit: bbcf0b218f64b1e3e2b66b0fbb623f51d9014e81
    master date: 2015-11-03 18:14:02 +0100

commit 97549e503a2edc8476f9597400159bbe7262fc41
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date:   Tue Nov 10 12:15:29 2015 +0100

    x86/PoD: Make p2m_pod_empty_cache() restartable
    
    This avoids a long running operation when destroying a domain with a
    large PoD cache.
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Reviewed-by: George Dunlap <george.dunlap@citrix.com>
    master commit: 59a5061723ba47c0028cf48487e5de551c42a378
    master date: 2015-11-02 15:33:38 +0100
(qemu changes not included)

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-26 17:27 [xen-4.6-testing test] 65112: regressions - FAIL osstest service owner
@ 2015-11-27  8:18 ` Jan Beulich
  2015-11-27  9:53   ` Ian Campbell
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2015-11-27  8:18 UTC (permalink / raw)
  To: osstest-admin; +Cc: xen-devel

>>> On 26.11.15 at 18:27, <osstest-admin@xenproject.org> wrote:
> flight 65112 xen-4.6-testing real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/65112/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking, including tests which could not be run:
>  build-i386                    5 xen-build        fail in 65062 REGR. vs. 63449
>  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449

Neither of these failed in this flight, and there's nothing else blocking
the push. Why did this not result in a push then? Or in other words
why do the failures in earlier flights get considered a reason not to
push?

Thanks, Jan

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27  8:18 ` Jan Beulich
@ 2015-11-27  9:53   ` Ian Campbell
  2015-11-27 12:02     ` Ian Jackson
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2015-11-27  9:53 UTC (permalink / raw)
  To: Jan Beulich, osstest-admin; +Cc: xen-devel

On Fri, 2015-11-27 at 01:18 -0700, Jan Beulich wrote:
> >>> On 26.11.15 at 18:27, <osstest-admin@xenproject.org> wrote:
> > flight 65112 xen-4.6-testing real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/65112/ 
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking, including tests which
> > could not be run:
> >  build-i386                    5 xen-build        fail in 65062 REGR. vs. 63449
> >  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449
> 
> Neither of these failed in this flight, and there's nothing else blocking
> the push. Why did this not result in a push then? Or in other words
> why do the failures in earlier flights get considered a reason not to
> push?

@Ian, README.email covers lots of these kinds of patterns, but not this
specific one.


Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27  9:53   ` Ian Campbell
@ 2015-11-27 12:02     ` Ian Jackson
  2015-11-27 12:28       ` Ian Campbell
  2015-11-27 12:52       ` Jan Beulich
  0 siblings, 2 replies; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 12:02 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, osstest-admin

Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On Fri, 2015-11-27 at 01:18 -0700, Jan Beulich wrote:
> > Neither of these failed in this flight, and there's nothing else blocking
> > the push. Why did this not result in a push then? Or in other words
> > why do the failures in earlier flights get considered a reason not to
> > push?
> 
> @Ian, README.email covers lots of these kinds of patterns, but not this
> specific one.

See below for proposed docs patch to explain the general meaning of
`fail in X REGR. vs. Y'.


> > >  build-i386 5 xen-build fail in 65062 REGR. vs. 63449

This is completely explained below, I think.

> > >  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449

As explained below, in 65112 this step did not run because the earlier
step `guest-localmigrate' failed:
  http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html

The fact that we have both `guest-localmigrate' and
`guest-localmigrate/x10' isn't ideal because it hides from the
heisenbug compensator that these are actually the same underlying
test.  Maybe it is time now to rename `guest-localmigrate/x10' to
`guest-localmigrate' and abolish the latter.


From 987dd088192f9f94c59beeddc073cecaad76a24e Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Fri, 27 Nov 2015 11:36:05 +0000
Subject: [OSSTEST PATCH] README.email: Document `fail in 58948 REGR. vs.
 63449'

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 README.email |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.email b/README.email
index 992a574..40df71a 100644
--- a/README.email
+++ b/README.email
@@ -71,6 +71,24 @@ history.  Here are some examples:
       detect regressions of this test.  Perhaps the test has been
       recently introduced.
 
+   fail in 58948 REGR. vs. 63449
+
+      The results processor used 58948 (another flight testing the
+      just-tested version) to convince itself that some other test
+      failure is intermittent.  Look for other references to 58948 in
+      the report to see which those other test failures are.
+
+      However, in 58948, there were further failures.  In particular,
+      the step being reported here failed, and that failure could not
+      in turn be justified.
+
+      If this further failure is in a test job, this is usually
+      because the reported step did not run at all in the most recent
+      flight, usually because it was blocked by an earlier failure.
+      (Intermittent build job failures are never considered
+      justifiable because they prevent other tests from running and
+      can so conceal bugs.)
+
    fail in 58948 pass in 58965
    fail in 58948 like 37628
 
-- 
1.7.10.4
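
The README.email text above advises looking for other references to the
flight named in a `fail in X REGR. vs. Y' entry. A minimal Python sketch of
that lookup follows. It is not osstest code: the function names, the regular
expressions, and the assumed line layout (job, step number, step name,
free-text summary) are illustrative assumptions based only on the report
lines quoted in this thread.

```python
import re

# Hypothetical helpers for reading a report by hand; not taken from osstest.
# Assumed layout of a result line: job, step number, step name, summary text.
LINE_RE = re.compile(
    r"^\s*(?P<job>\S+)\s+(?P<stepno>\d+)\s+(?P<step>\S+)\s+(?P<summary>.+?)\s*$"
)
REGR_RE = re.compile(r"^fail in (?P<other>\d+) REGR\. vs\. (?P<baseline>\d+)$")


def parse_result_line(line):
    """Split one report line into fields, or return None if it does not fit."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["stepno"] = int(fields["stepno"])
    # Collapse the column padding inside the summary.
    fields["summary"] = " ".join(fields["summary"].split())
    regr = REGR_RE.match(fields["summary"])
    fields["other_flight"] = int(regr.group("other")) if regr else None
    fields["baseline"] = int(regr.group("baseline")) if regr else None
    return fields


def other_references(lines, flight):
    """Return parsed lines that mention `flight` but are not REGR. entries."""
    wanted = re.compile(r"\bin %d\b" % flight)
    hits = []
    for line in lines:
        fields = parse_result_line(line)
        if fields is None or fields["other_flight"] is not None:
            continue
        if wanted.search(fields["summary"]):
            hits.append(fields)
    return hits


# Lines copied from the flight 65112 report.
REPORT = [
    " test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449",
    " test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail pass in 65088",
    " test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate fail pass in 65088",
    " test-armhf-armhf-xl-rtds     11 guest-start        fail in 65062 pass in 65112",
]

regr = parse_result_line(REPORT[0])
for hit in other_references(REPORT, regr["other_flight"]):
    print(hit["job"], hit["step"], "->", hit["summary"])
```

Run against the lines above, this lists the entries that cite flight 65088
as evidence of an intermittent failure, which is the cross-reference the
README.email text suggests making by eye.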

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:02     ` Ian Jackson
@ 2015-11-27 12:28       ` Ian Campbell
  2015-11-27 12:35         ` Jan Beulich
  2015-11-27 13:24         ` Ian Jackson
  2015-11-27 12:52       ` Jan Beulich
  1 sibling, 2 replies; 16+ messages in thread
From: Ian Campbell @ 2015-11-27 12:28 UTC (permalink / raw)
  To: Ian Jackson, Jan Beulich; +Cc: xen-devel, osstest-admin

On Fri, 2015-11-27 at 12:02 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> > On Fri, 2015-11-27 at 01:18 -0700, Jan Beulich wrote:
> > > Neither of these failed in this flight, and there's nothing else blocking
> > > the push. Why did this not result in a push then? Or in other words
> > > why do the failures in earlier flights get considered a reason not to
> > > push?
> > 
> > @Ian, README.email covers lots of these kinds of patterns, but not this
> > specific one.
> 
> See below for proposed docs patch to explain the general meaning of
> `fail in X REGR. vs. Y'.
> 
> 
> > > >  build-i386 5 xen-build fail in 65062 REGR. vs. 63449
> 
> This is completely explained below, I think.
> 
> > > >  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449
> 
> As explained below, in 65112 this step did not run because the earlier
> step `guest-localmigrate' failed:
>   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html

Would it be possible to arrange for "blocked" to appear somewhere in the
results for the job? e.g. "blocked fail in XXX REGR. vs. YYY". README.email
says "The results normally start with the result in this flight" and I
think this would be in keeping with that.

Otherwise I think people naturally tend to just read the "and are blocking"
section and forget to consider that non-blocking stuff further down may
have (tolerably) failed but then blocked something else, which is in turn
blocking the push.

> The fact that we have both `guest-localmigrate' and
> `guest-localmigrate/x10' isn't ideal because it hides from the
> heisenbug compensator that these are actually the same underlying
> test.  Maybe it is time now to rename `guest-localmigrate/x10' to
> `guest-localmigrate' and abolish the latter.

I think this would be a good idea.

> From 987dd088192f9f94c59beeddc073cecaad76a24e Mon Sep 17 00:00:00 2001
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> Date: Fri, 27 Nov 2015 11:36:05 +0000
> Subject: [OSSTEST PATCH] README.email: Document `fail in 58948 REGR. vs.
>  63449'
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> ---
>  README.email |   18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/README.email b/README.email
> index 992a574..40df71a 100644
> --- a/README.email
> +++ b/README.email
> @@ -71,6 +71,24 @@ history.  Here are some examples:
>        detect regressions of this test.  Perhaps the test has been
>        recently introduced.
>  
> +   fail in 58948 REGR. vs. 63449
> +
> +      The results processor used 58948 (another flight testing the
> +      just-tested version) to convince itself that some other test
> +      failure is intermittent.  Look for other references to 58948 in
> +      the report to see which those other test failures are.
> +
> +      However, in 58948, there were further failures.  In particular,
> +      the step being reported here failed, and that failure could not
> +      in turn be justified.
> +
> +      If this further failure is in a test job, this is usually
> +      because the reported step did not run at all in the most recent
> +      flight, usually because it was blocked by an earlier failure.
> +      (Intermittent build job failures are never considered
> +      justifiable because they prevent other tests from running and
> +      can so conceal bugs.)
> +
>     fail in 58948 pass in 58965
>     fail in 58948 like 37628
>  

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:28       ` Ian Campbell
@ 2015-11-27 12:35         ` Jan Beulich
  2015-11-27 13:25           ` Ian Jackson
  2015-11-27 13:24         ` Ian Jackson
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2015-11-27 12:35 UTC (permalink / raw)
  To: Ian Campbell, Ian Jackson; +Cc: xen-devel, osstest-admin

>>> On 27.11.15 at 13:28, <ian.campbell@citrix.com> wrote:
> On Fri, 2015-11-27 at 12:02 +0000, Ian Jackson wrote:
>> As explained below, in 65112 this step did not run because the earlier
>> step `guest-localmigrate' failed:
>>   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html
> 
> Would it be possible to arrange for "blocked" to appear somewhere in the
> results for the job? e.g. "blocked fail in XXX REGR. vs. YYY". README.email
> says "The results normally start with the result in this flight" and I
> think this would be in keeping with that.
> 
> Otherwise I think people naturally tend to just read the "and are blocking"
> section and forget to consider that non-blocking stuff further down may
> have (tolerably) failed but then blocked something else, which is in turn
> blocking the push.

Indeed - that's precisely how I've looked at things this morning.

>> --- a/README.email
>> +++ b/README.email
>> @@ -71,6 +71,24 @@ history.  Here are some examples:
>>        detect regressions of this test.  Perhaps the test has been
>>        recently introduced.
>>  
>> +   fail in 58948 REGR. vs. 63449

It seems confusing to me that the numbers are reversed from how
I think they would normally appear (regressions normally being
against older flights).

Jan

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:02     ` Ian Jackson
  2015-11-27 12:28       ` Ian Campbell
@ 2015-11-27 12:52       ` Jan Beulich
  2015-11-27 13:44         ` Ian Jackson
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2015-11-27 12:52 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, osstest-admin

>>> On 27.11.15 at 13:02, <Ian.Jackson@eu.citrix.com> wrote:
>> > >  build-i386 5 xen-build fail in 65062 REGR. vs. 63449
> 
> This is completely explained below, I think.

I can't see the connection to any other (failed) test here (also not
in flight 65136's results, which have just come in). There were many
blocked tests in 65062, but the two build-i386-* look to be
independent tests, and test-* ones shouldn't block build-* ones aiui.

> > > >  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail in 65088 REGR. vs. 63449
> 
> As explained below, in 65112 this step did not run because the earlier
> step `guest-localmigrate' failed:
>   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html 
> 
> The fact that we have both `guest-localmigrate' and
> `guest-localmigrate/x10' isn't ideal because it hides from the
> heisenbug compensator that these are actually the same underlying
> test.  Maybe it is time now to rename `guest-localmigrate/x10' to
> `guest-localmigrate' and abolish the latter.

Independent of that, does it make sense for a dependent test to
not be considered failing intermittently when the test it depends on
is?

Jan

* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:28       ` Ian Campbell
  2015-11-27 12:35         ` Jan Beulich
@ 2015-11-27 13:24         ` Ian Jackson
  2015-11-27 14:03           ` Ian Campbell
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 13:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, osstest-admin, Jan Beulich

Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On Fri, 2015-11-27 at 12:02 +0000, Ian Jackson wrote:
> > As explained below, in 65112 this step did not run because the earlier
> > step `guest-localmigrate' failed:
> >   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html
> 
> Would it be possible to arrange for "blocked" to appear somewhere in the
> results for the job? e.g. "blocked fail in XXX REGR. vs. YYY". README.email
> says "The results normally start with the result in this flight" and I
> think this would be in keeping with that.

But it might not be true that it was blocked.  Maybe the version of
osstest used didn't have that step at all, for example.

The best you could say would be something like
  "not run; fail in XXX REGR. vs. YYY"
but that poses more questions than it answers.

> Otherwise I think people naturally tend to just read the "and are blocking"
> section and forget to consider that non-blocking stuff further down may
> have (tolerably) failed but then blocking something else which is then
> blocking the push.

Perhaps sg-report-flight could, if there are any blockages of the form
`fail in XXX REGR. vs YYY', add a note below the blockage section,
saying something like `XXX examined since needed to justify other
failures, see below'.

I'm a bit reluctant to suggest this because it is, essentially,
boilerplate - it would always say the same thing about any `fail in
XXX' - and filling reports like this with boilerplate isn't always a
good idea.

> > The fact that we have both `guest-localmigrate' and
> > `guest-localmigrate/x10' isn't ideal because it hides from the
> > heisenbug compensator that these are actually the same underlying
> > test.  Maybe it is time now to rename `guest-localmigrate/x10' to
> > `guest-localmigrate' and abolish the latter.
> 
> I think this would be a good idea.

I'll send a patch.

> > From 987dd088192f9f94c59beeddc073cecaad76a24e Mon Sep 17 00:00:00 2001
> > From: Ian Jackson <ian.jackson@eu.citrix.com>
> > Date: Fri, 27 Nov 2015 11:36:05 +0000
> > Subject: [OSSTEST PATCH] README.email: Document `fail in 58948 REGR. vs.
> >  63449'
> > 
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Thanks,
Ian.


* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:35         ` Jan Beulich
@ 2015-11-27 13:25           ` Ian Jackson
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 13:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Ian Campbell, osstest-admin

Jan Beulich writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On 27.11.15 at 13:28, <ian.campbell@citrix.com> wrote:
> >> --- a/README.email
> >> +++ b/README.email
> >> @@ -71,6 +71,24 @@ history.  Here are some examples:
> >>        detect regressions of this test.  Perhaps the test has been
> >>        recently introduced.
> >>  
> >> +   fail in 58948 REGR. vs. 63449
> 
> It seems confusing to me that the numbers are reversed from how
> I think they would normally appear (regressions normally being
> against older flights).

You mean the example would be less confusing with a different number?
I'll change it.

Thanks,
Ian.


* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 12:52       ` Jan Beulich
@ 2015-11-27 13:44         ` Ian Jackson
  2015-11-27 14:04           ` Ian Campbell
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 13:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, osstest-admin

Jan Beulich writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On 27.11.15 at 13:02, <Ian.Jackson@eu.citrix.com> wrote:
> > The fact that we have both `guest-localmigrate' and
> > `guest-localmigrate/x10' isn't ideal because it hides from the
> > heisenbug compensator that these are actually the same underlying
> > test.  Maybe it is time now to rename `guest-localmigrate/x10' to
> > `guest-localmigrate' and abolish the latter.
> 
> Independent of that, does it make sense for a dependent test to
> not be considered failing intermittently when the test it depends on
> is?

Suppose two test steps A and B, which normally run in that order.
Suppose failure of A prevents the execution of B (this is the usual
case where step A precedes step B; normally later steps in a job
depend on the success of earlier steps, because after an earlier
failure the testbed state is not necessarily well-defined).

Now suppose A has an intermittent bug, but B is totally broken.

With our current policy on intermittent bugs[1], we would allow a push
despite the bug in A.  But we should not allow a push despite B: the
100% reproducible failure of B should prevent all pushes.

But the bug in B only shows up when A happens to pass.  So the
heisenbug compensator has to insist on seeing an actual pass of B
(which, in this hypothetical situation, will not occur).

Eg, consider these flights:

  100  is now master  A pass, B pass      pushed
  200  staging        A pass, B fail      `B REGR. vs 100'
  201  staging        A fail, B not run   `B fail in 200 REGR. vs 100'

In flight 201, the failure of A is indeed justifiable as a heisenbug
because it can be seen to succeed in flight 200.  It is the problem
with B which is actually blocking the push - it is merely that the
failure occurred in flight 200.

If, contrary to my suppositions above, the failure of B is actually a
heisenbug, then hopefully eventually both A and then B will happen to
pass in the same run.  Even if that particular flight has other
problems, a future evaluation of a test of the same version can use
that flight's passes of A and B to justify, respectively, whatever
failures of A and/or B it comes across.

Ian.

[1] In principle we could have a different policy: to try to reject
intermittent bugs.  But it would require a lot of test resources
because all tests would have to be repeated a lot, and naturally
intermittent bugs would slip through anyway.
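
The flight 100/200/201 example above can be sketched in a few lines of
Python.  This is only an illustration of the reasoning, not the actual
osstest code: the function names, the data layout and the simplified
rule (a pass in any earlier flight of the same version excuses a
failure) are all assumptions made for the sketch.  The verdict strings
loosely mimic the report wording, and `not run, pass in N' is invented
here for completeness.

```python
def evaluate(current, earlier, baseline):
    """Return {step: verdict} for the current flight (hypothetical sketch).

    Each flight is a pair (flight_id, {step: "pass" | "fail"}).
    A step that was not run is simply absent from the dict.
    `earlier` lists earlier flights testing the same version.
    """
    cur_id, cur = current
    base_id, base = baseline
    verdicts = {}
    for step, base_result in base.items():
        if base_result != "pass":
            continue  # this sketch only tracks regressions from a baseline pass
        result = cur.get(step)  # None means the step was not run
        if result == "pass":
            verdicts[step] = "pass"
            continue
        passed_in = [fid for fid, res in earlier if res.get(step) == "pass"]
        failed_in = [fid for fid, res in earlier if res.get(step) == "fail"]
        if passed_in:
            # An actual pass on this version exists: treat as a heisenbug.
            prefix = "fail" if result == "fail" else "not run,"
            verdicts[step] = f"{prefix} pass in {passed_in[-1]}"
        elif result == "fail":
            verdicts[step] = f"REGR. vs. {base_id}"
        elif failed_in:
            # Not run here, never seen to pass, but seen to fail earlier.
            verdicts[step] = f"fail in {failed_in[-1]} REGR. vs. {base_id}"
        else:
            verdicts[step] = "not run"
    return verdicts


def blocks_push(verdicts):
    """A push is blocked if any step is an unexcused regression."""
    return any("REGR." in v for v in verdicts.values())


flight_100 = (100, {"A": "pass", "B": "pass"})
flight_200 = (200, {"A": "pass", "B": "fail"})
flight_201 = (201, {"A": "fail"})  # B absent: it was not run

print(evaluate(flight_200, [], flight_100))
print(evaluate(flight_201, [flight_200], flight_100))
```

With these inputs, the failure of A in flight 201 is excused by its pass
in flight 200, while B is reported against flight 200 and still blocks.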


* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 13:24         ` Ian Jackson
@ 2015-11-27 14:03           ` Ian Campbell
  2015-11-27 14:07             ` Ian Campbell
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2015-11-27 14:03 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, osstest-admin, Jan Beulich

On Fri, 2015-11-27 at 13:24 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> > On Fri, 2015-11-27 at 12:02 +0000, Ian Jackson wrote:
> > > As explained below, in 65112 this step did not run because the earlier
> > > step `guest-localmigrate' failed:
> > >   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html
> > 
> > Would it be possible to arrange for "blocked" to appear somewhere in the
> > results for the job? e.g. "blocked fail in XXX REGR. vs. YYY". README.email
> > says "The results normally start with the result in this flight" and I
> > think this would be in keeping with that.
> 
> But it might not be true that it was blocked.

Can't sg-run-job tell if it was blocked vs something else though?

I was suggesting adding "blocked" only if it really was blocked. I'm not sure
what I was suggesting for the other reasons a step might not run, because I
hadn't really considered them, but those would be unusual, I think?

>   Maybe the version of
> osstest used didn't have that step at all, for example.

In which case would it still be considering the step for failures at all?

i.e. if:

flight 100 had test-foo == pass
flight 200 had test-foo == fail (blocking)
flight 201 had test-foo == blocked; fail in 200 vs 100
flight 202 had no test-foo present at all

Would the decision for flight 202 really be to consider the test-foo
results in 100, 200 and 201, and therefore block?

> The best you could say would be something like
>   "not run; fail in XXX REGR. vs. YYY"
> but that poses more questions than it answers.

Right.

> 
> > Otherwise I think people naturally tend to just read the "and are blocking"
> > section and forget to consider that non-blocking stuff further down may
> > have (tolerably) failed but then blocking something else which is then
> > blocking the push.
> 
> Perhaps sg-report-flight could, if there are any blockages of the form
> `fail in XXX REGR. vs YYY', add a note below the blockage section,
> saying something like `XXX examined since needed to justify other
> failures, see below'.
> 
> I'm a bit reluctant to suggest this because it is, essentially,
> boilerplate - it would always say the same thing about any `fail in
> XXX' - and filling reports like this with boilerplate isn't always a
> good idea.

In general I agree, but in this case it might be worth it to counteract a
(perfectly understandable IMHO) natural tendency to only look at the
section labelled blocking. It's basically "don't forget that this
non-blocking stuff might actually be relevant to the blockage".

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 13:44         ` Ian Jackson
@ 2015-11-27 14:04           ` Ian Campbell
  2015-11-27 14:59             ` [xen-4.6-testing test] 65112: regressions - FAIL [and 1 more messages] Ian Jackson
  2015-11-27 15:38             ` [OSSTEST PATCH] README.email: Add `Worked example of relevant regression in previous flight' Ian Jackson
  0 siblings, 2 replies; 16+ messages in thread
From: Ian Campbell @ 2015-11-27 14:04 UTC (permalink / raw)
  To: Ian Jackson, Jan Beulich; +Cc: xen-devel, osstest-admin

On Fri, 2015-11-27 at 13:44 +0000, Ian Jackson wrote:
> Eg, consider these flights:
> 
>   100  is now master  A pass, B pass      pushed
>   200  staging        A pass, B fail      `B REGR. vs 100'
>   201  staging        A fail, B not run   `B fail in 200 REGR. vs 100'
> 
> In flight 201, the failure of A is indeed justifiable as a heisenbug
> because it can be seen to succeed in flight 200.  It is the problem
> with B which is actually blocking the push - it is merely that the
> failure occurred in flight 200.

This example really helped clarify things for me, thanks.

I don't know if this is the sort of thing which could fit into a doc
somewhere (maybe README.email could have some of these kinds of worked
examples?)



* Re: [xen-4.6-testing test] 65112: regressions - FAIL
  2015-11-27 14:03           ` Ian Campbell
@ 2015-11-27 14:07             ` Ian Campbell
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2015-11-27 14:07 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, osstest-admin, Jan Beulich

On Fri, 2015-11-27 at 14:03 +0000, Ian Campbell wrote:
> On Fri, 2015-11-27 at 13:24 +0000, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> > > On Fri, 2015-11-27 at 12:02 +0000, Ian Jackson wrote:
> > > > As explained below, in 65112 this step did not run because the earlier
> > > > step `guest-localmigrate' failed:
> > > >   http://logs.test-lab.xenproject.org/osstest/logs/65112/test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm/info.html
> > > 
> > > Would it be possible to arrange for "blocked" to appear somewhere in the
> > > results for the job? e.g. "blocked fail in XXX REGR. vs. YYY". README.email
> > > says "The results normally start with the result in this flight" and I
> > > think this would be in keeping with that.
> > 
> > But it might not be true that it was blocked.
> 
> Can't sg-run-job tell if it was blocked vs something else though?

I meant sg-report-flight, of course.

> 
> I was suggesting to only add blocked if it was blocked, I'm not sure what I
> was suggesting to do for other reason not to run, because I hadn't really
> considered it, but those would be unusual I think?
> 
> >   Maybe the version of
> > osstest used didn't have that step at all, for example.
> 
> In which case would it still be considering the step for failures at all?
> 
> i.e. if:
> 
> flight 100 had test-foo == pass
> flight 200 had test-foo == fail (blocking)
> flight 201 had test-foo == blocked; fail in 200 vs 100
> flight 202 had no test-foo present at all
> 
> Would the decision for flight 202 really be to consider the test-foo
> results in 100, 200 and 201, and therefore block?
> 
> > The best you could say would be something like
> >   "not run; fail in XXX REGR. vs. YYY"
> > but that poses more questions than it answers.
> 
> Right.
> 
> > 
> > > Otherwise I think people naturally tend to just read the "and are blocking"
> > > section and forget to consider that non-blocking stuff further down may
> > > have (tolerably) failed but then blocking something else which is then
> > > blocking the push.
> > 
> > Perhaps sg-report-flight could, if there are any blockages of the form
> > `fail in XXX REGR. vs YYY', add a note below the blockage section,
> > saying something like `XXX examined since needed to justify other
> > failures, see below'.
> > 
> > I'm a bit reluctant to suggest this because it is, essentially,
> > boilerplate - it would always say the same thing about any `fail in
> > XXX' - and filling reports like this with boilerplate isn't always a
> > good idea.
> 
> In general I agree, in this case it might be worth it to counteract a
> (perfectly understandable IMHO) natural tendency to only look at the
> section labelled blocking, it's basically "don't forget that this non-
> blocking stuff might actually be relevant to the blockage".
> 
> Ian.


* Re: [xen-4.6-testing test] 65112: regressions - FAIL [and 1 more messages]
  2015-11-27 14:04           ` Ian Campbell
@ 2015-11-27 14:59             ` Ian Jackson
  2015-11-27 15:38             ` [OSSTEST PATCH] README.email: Add `Worked example of relevant regression in previous flight' Ian Jackson
  1 sibling, 0 replies; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 14:59 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, osstest-admin, Jan Beulich

Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On Fri, 2015-11-27 at 13:24 +0000, Ian Jackson wrote:
> > But it might not be true that it was blocked.
> 
> Can't sg-run-job tell if it was blocked vs something else though?

(You meant sg-execute-flight.)  No, it can't.  Steps that aren't run
simply don't appear for that job, in the db steps table.

> >   Maybe the version of
> > osstest used didn't have that step at all, for example.
> 
> In which case would it still be considering the step for failures at all?
> 
> i.e. if:
> 
> flight 100 had test-foo == pass
> flight 200 had test-foo == fail (blocking)
> flight 201 had test-foo == blocked; fail in 200 vs 100
> flight 202 had no test-foo present at all
> 
> Would the decision for flight 202 really be to consider the test-foo
> results in 100, 200 and 201, and therefore block?

Only if the evaluation of flight 202 needs to use the results in 200
or 201 to justify a failure of test-bar in 202.  Then it would spot
the earlier problems with test-foo and want a justification for them.
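
As a rough sketch of that rule (hypothetical Python, not the real
sg-report-flight logic; the function name, data layout and the
simplifications are assumptions): an earlier flight is only drawn in
when it supplies a pass that excuses a failure in the current flight,
and once drawn in, its own failures need a justification too.  Here a
drawn-in failure is excused only by a pass in the current flight, which
is cruder than what a real evaluation would do.

```python
def unjustified_failures(current, earlier):
    """Return [(step, flight_id)] for earlier failures that get dragged in.

    Each flight is a pair (flight_id, {job: "pass" | "fail"}).
    A job that is not present in a flight is simply absent from the dict.
    """
    cur_id, cur = current

    # Earlier flights are examined only if one of their passes is needed
    # to excuse a failure in the current flight.
    used = {}
    for job, result in cur.items():
        if result != "fail":
            continue
        for fid, res in earlier:
            if res.get(job) == "pass":
                used[fid] = res
                break

    # Every flight drawn in this way must have its own failures justified.
    problems = []
    for fid, res in used.items():
        for job, result in res.items():
            if result == "fail" and cur.get(job) != "pass":
                problems.append((job, fid))
    return problems


flight_200 = (200, {"test-foo": "fail", "test-bar": "pass"})

# test-bar passes in 202: flight 200 is never consulted.
print(unjustified_failures((202, {"test-bar": "pass"}), [flight_200]))

# test-bar fails in 202: flight 200 is needed, so its test-foo failure matters.
print(unjustified_failures((202, {"test-bar": "fail"}), [flight_200]))
```

In the first call test-foo is absent from flight 202 and nothing drags
flight 200 in, so the old test-foo failure is not considered at all.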


> > Perhaps sg-report-flight could, if there are any blockages of the form
> > `fail in XXX REGR. vs YYY', add a note below the blockage section,
> > saying something like `XXX examined since needed to justify other
> > failures, see below'.
> > 
> > I'm a bit reluctant to suggest this because it is, essentially,
> > boilerplate - it would always say the same thing about any `fail in
> > XXX' - and filling reports like this with boilerplate isn't always a
> > good idea.
> 
> In general I agree, in this case it might be worth it to counteract a
> (perfectly understandable IMHO) natural tendency to only look at the
> section labelled blocking, it's basically "don't forget that this non-
> blocking stuff might actually be relevant to the blockage".

I'll see about doing this.


Ian Campbell writes ("Re: [Xen-devel] [xen-4.6-testing test] 65112: regressions - FAIL"):
> On Fri, 2015-11-27 at 13:44 +0000, Ian Jackson wrote:
> > In flight 201, the failure of A is indeed justifiable as a heisenbug
> > because it can be seen to succeed in flight 200.  It is the problem
> > with B which is actually blocking the push - it is merely that the
> > failure occurred in flight 200.
> 
> This example really helped clarify things for me, thanks.
> 
> I don't know if this is the sort of thing which could fit into a doc
> somewhere (maybe README.email could have some of these kinds of worked
> examples?)

We could put some of this at the bottom of README.email, sure.

Ian.


* [OSSTEST PATCH] README.email: Add `Worked example of relevant regression in previous flight'
  2015-11-27 14:04           ` Ian Campbell
  2015-11-27 14:59             ` [xen-4.6-testing test] 65112: regressions - FAIL [and 1 more messages] Ian Jackson
@ 2015-11-27 15:38             ` Ian Jackson
  2015-11-27 15:48               ` Ian Campbell
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2015-11-27 15:38 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell, Jan Beulich

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
---
 README.email |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/README.email b/README.email
index 5de63dd..e14a816 100644
--- a/README.email
+++ b/README.email
@@ -89,6 +89,9 @@ history.  Here are some examples:
       justifiable because they prevent other tests from running and
       can so conceal bugs.)
 
+      See `Worked example of relevant regression in previous flight',
+      below.
+
    fail in 58948 pass in 58965
    fail in 58948 like 37628
 
@@ -159,3 +162,51 @@ X-Osstest-Versions-That:
      tree     revision
 
 `This' is the version being tested, and `That' is the baseline.
+
+
+
+Worked example of relevant regression in previous flight
+--------------------------------------------------------
+
+Suppose two test steps A and B, which normally run in that order:
+  job test-foo
+       A   ./ts-do-some-thing
+       B   ./ts-do-another-thing
+
+Suppose failure of A prevents the execution of B.  (This is the usual
+case where step A precedes step B; normally later steps in a job
+depend on the success of earlier steps, because after an earlier
+failure the testbed state is not necessarily well-defined.)
+
+Now suppose A has an intermittent bug, but B is totally broken.
+
+With our current policy on intermittent bugs[1], we would allow a push
+despite the bug in A.  But we should not allow a push despite B: the
+100% reproducible failure of B should prevent all pushes.
+
+But the bug in B only shows up when A happens to pass.  So the
+heisenbug compensator has to insist on seeing an actual pass of B
+(which in this hypothetical situation, will not occur).
+
+Eg, consider these flights:
+
+  100  is now master  A pass, B pass      pushed
+  200  staging        A pass, B fail      `B REGR. vs 100'
+  201  staging        A fail, B not run   `B fail in 200 REGR. vs 100'
+
+In flight 201, the failure of A is indeed justifiable as a heisenbug
+because it can be seen to succeed in flight 200.  It is the problem
+with B which is actually blocking the push - but that failure is only
+visible in flight 200.
+
+If, contrary to the suppositions above, the failure of B is actually a
+heisenbug, then hopefully eventually both A and then B will happen to
+pass in the same run.  Even if that particular flight has other
+problems, a future evaluation of a test of the same version can use
+that flight's passes of A and B to justify, respectively, whatever
+failures of A and/or B that it comes across.
+
+[1] In principle we could have a different policy: to try to reject
+intermittent bugs.  But it would require a lot of test resources
+because all tests would have to be repeated a lot, and naturally
+intermittent bugs would slip through anyway.
-- 
1.7.10.4


* Re: [OSSTEST PATCH] README.email: Add `Worked example of relevant regression in previous flight'
  2015-11-27 15:38             ` [OSSTEST PATCH] README.email: Add `Worked example of relevant regression in previous flight' Ian Jackson
@ 2015-11-27 15:48               ` Ian Campbell
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2015-11-27 15:48 UTC (permalink / raw)
  To: Ian Jackson, xen-devel; +Cc: Jan Beulich

On Fri, 2015-11-27 at 15:38 +0000, Ian Jackson wrote:
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Jan Beulich <JBeulich@suse.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> ---
>  README.email |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/README.email b/README.email
> index 5de63dd..e14a816 100644
> --- a/README.email
> +++ b/README.email
> @@ -89,6 +89,9 @@ history.  Here are some examples:
>        justifiable because they prevent other tests from running and
>        can so conceal bugs.)
>  
> +      See `Worked example of relevant regression in previous flight',
> +      below.
> +
>     fail in 58948 pass in 58965
>     fail in 58948 like 37628
>  
> @@ -159,3 +162,51 @@ X-Osstest-Versions-That:
>       tree     revision
>  
>  `This' is the version being tested, and `That' is the baseline.
> +
> +
> +
> +Worked example of relevant regression in previous flight
> +--------------------------------------------------------
> +
> +Suppose two test steps A and B, which normally run in that order:
> +  job test-foo
> +       A   ./ts-do-some-thing
> +       B   ./ts-do-another-thing
> +
> +Suppose failure of A prevents the execution of B.  (This is the usual
> +case where step A precedes step B; normally later steps in a job
> +depend on the success of earlier steps, because after an earlier
> +failure the testbed state is not necessarily well-defined.)
> +
> +Now suppose A has an intermittent bug, but B is totally broken.
> +
> +With our current policy on intermittent bugs[1], we would allow a push
> +despite the bug in A.  But we should not allow a push despite B: the
> +100% reproducible failure of B should prevent all pushes.
> +
> +But the bug in B only shows up when A happens to pass.  So the
> +heisenbug compensator has to insist on seeing an actual pass of B
> +(which in this hypothetical situation, will not occur).
> +
> +Eg, consider these flights:
> +
> +  100  is now master  A pass, B pass      pushed
> +  200  staging        A pass, B fail      `B REGR. vs 100'
> +  201  staging        A fail, B not run   `B fail in 200 REGR. vs 100'
> +
> +In flight 201, the failure of A is indeed justifiable as a heisenbug
> +because it can be seen to succeed in flight 200.  It is the problem
> +with B which is actually blocking the push - but that failure is only
> +visible in flight 200.
> +
> +If, contrary to the suppositions above, the failure of B is actually a
> +heisenbug, then hopefully eventually both A and then B will happen to
> +pass in the same run.  Even if that particular flight has other
> +problems, a future evaluation of a test of the same version can use
> +that flight's passes of A and B to justify, respectively, whatever
> +failures of A and/or B that it comes across.
> +
> +[1] In principle we could have a different policy: to try to reject
> +intermittent bugs.  But it would require a lot of test resources
> +because all tests would have to be repeated a lot, and naturally
> +intermittent bugs would slip through anyway.


