Re: [Qemu-devel] [PATCH 1/2] Acceptance tests: exclude "flaky" tests

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Cleber Rosa <crosa@redhat.com>
To: "Philippe Mathieu-Daudé" <philmd@redhat.com>
Cc: "Fam Zheng" <fam@euphon.net>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Aleksandar Rikalo" <arikalo@wavecomp.com>,
	qemu-devel@nongnu.org,
	"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Aurelien Jarno" <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH 1/2] Acceptance tests: exclude "flaky" tests
Date: Fri, 21 Jun 2019 10:38:16 -0400	[thread overview]
Message-ID: <20190621143816.GA24282@localhost.localdomain> (raw)
In-Reply-To: <f18a5df8-201e-b8a1-1a3e-3e2254ce8b1e@redhat.com>

On Fri, Jun 21, 2019 at 09:03:33AM +0200, Philippe Mathieu-Daudé wrote:
> On 6/21/19 8:09 AM, Cleber Rosa wrote:
> > It's a fact that some tests may not be 100% reliable in all
> > environments.  While it's a tough call to remove a useful test that
> > from the tree because it may fail every 1/100th time (or so), having
> > human attention drawn to known issues is very bad for humans and for
> > the projects they manage.
> > 
> > As a compromise solution, this marks tests that are known to have
> > issues, or that exercises known issues in QEMU or other components,
> > and excludes them from the entry point.  As a consequence, tests
> > marked as "flaky" will not be executed as part of "make
> > check-acceptance".
> > 
> > Because such tests should be forgiven but never be forgotten, it's
> > possible to list them with (assuming "make check-venv" or "make
> > check-acceptance" has already initiatilized the venv):
> > 
> >   $ ./tests/venv/bin/avocado list -t flaky tests/acceptance
> > 
> > The current list of tests marked as flaky are a result of running
> > the entire set of acceptance tests around 20 times.  The results
> > were then processed with a helper script[1].  That either confirmed
> > known issues (in the case of aarch64 and arm)[2] or revealed new
> > ones (mips).
> > 
> > This also bumps the Avocado version to one that includes a fix to the
> > parsing of multiple and mix "key:val" and simple tag values.
> > 
> > [1] https://raw.githubusercontent.com/avocado-framework/avocado/master/contrib/scripts/summarize-job-failures.py
> > [2] https://bugs.launchpad.net/qemu/+bug/1829779
> > 
> > Signed-off-by: Cleber Rosa <crosa@redhat.com>
> > ---
> >  docs/devel/testing.rst                   | 17 +++++++++++++++++
> >  tests/Makefile.include                   |  6 +++++-
> >  tests/acceptance/boot_linux_console.py   |  2 ++
> >  tests/acceptance/linux_ssh_mips_malta.py |  2 ++
> >  tests/requirements.txt                   |  2 +-
> >  5 files changed, 27 insertions(+), 2 deletions(-)
> > 
> > diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
> > index da2d0fc964..ff4d8e2e1c 100644
> > --- a/docs/devel/testing.rst
> > +++ b/docs/devel/testing.rst
> > @@ -574,6 +574,23 @@ may be invoked by running:
> >  
> >    tests/venv/bin/avocado run $OPTION1 $OPTION2 tests/acceptance/
> >  
> > +Tagging tests
> > +-------------
> > +
> > +flaky
> > +~~~~~
> > +
> > +If a test is known to fail intermittently, even if only every one
> > +hundredth time, it's highly advisable to mark it as a flaky test.
> > +This will prevent these individual tests from failing much larger
> > +jobs, will avoid human interaction and time wasted to verify a known
> > +issue, and worse of all, can lead to the discredit of automated
> > +testing.
> > +
> > +To mark a test as flaky, add to its docstring.::
> > +
> > +  :avocado: tags=flaky
> 
> I certainly disagree with this patch, failing tests have to be fixed.
> Why not tag all the codebase flaky and sing "happy coding"?
>

That's a great idea! :)

Now, seriously, I also resisted this for quite a long time.  The
reality, though, is that intermittent failures will continue to
appear, and letting tests (and jobs, and CI pipelines, and whatnot)
fail is a very bad idea.  We all agree that real fixes are better than
this, but many times they don't come quickly.

> Anyway if this get accepted, 'flaky' tags must have the intermittent
> failure well described, and a Launchpad/Bugzilla tracking ticket referenced.
>

And here you have a key point that I absolutely agree with.  The
"flaky" approach can either poison a lot of tests, and be seen as
quick way out of a difficult issue revealed by a test.  Or, it can
serve as an effective tool to keep track of these very important
issues.

If we add:

   # https://bugs.launchpad.net/qemu/+bug/1829779
   :avocado: flaky

Topped with some human, I believe this can be very effective.  This goes
without saying, but comments here are very much welcome.

- Cleber.

> > +
> >  Manual Installation
> >  -------------------
> >  
> > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > index db750dd6d0..4c97da2878 100644
> > --- a/tests/Makefile.include
> > +++ b/tests/Makefile.include
> > @@ -1125,7 +1125,11 @@ TESTS_RESULTS_DIR=$(BUILD_DIR)/tests/results
> >  # Any number of command separated loggers are accepted.  For more
> >  # information please refer to "avocado --help".
> >  AVOCADO_SHOW=app
> > -AVOCADO_TAGS=$(patsubst %-softmmu,-t arch:%, $(filter %-softmmu,$(TARGET_DIRS)))
> > +
> > +# Additional tags that are added to each occurence of "--filter-by-tags"
> > +AVOCADO_EXTRA_TAGS := ,-flaky
> > +
> > +AVOCADO_TAGS=$(patsubst %-softmmu,--filter-by-tags=arch:%$(AVOCADO_EXTRA_TAGS), $(filter %-softmmu,$(TARGET_DIRS)))
> >  
> >  ifneq ($(findstring v2,"v$(PYTHON_VERSION)"),v2)
> >  $(TESTS_VENV_DIR): $(TESTS_VENV_REQ)
> > diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> > index 32159503e9..6bd5c1ab53 100644
> > --- a/tests/acceptance/boot_linux_console.py
> > +++ b/tests/acceptance/boot_linux_console.py
> > @@ -249,6 +249,7 @@ class BootLinuxConsole(Test):
> >          """
> >          :avocado: tags=arch:aarch64
> >          :avocado: tags=machine:virt
> > +        :avocado: tags=flaky
> >          """
> >          kernel_url = ('https://download.fedoraproject.org/pub/fedora/linux/'
> >                        'releases/29/Everything/aarch64/os/images/pxeboot/vmlinuz')
> > @@ -270,6 +271,7 @@ class BootLinuxConsole(Test):
> >          """
> >          :avocado: tags=arch:arm
> >          :avocado: tags=machine:virt
> > +        :avocado: tags=flaky
> >          """
> >          kernel_url = ('https://download.fedoraproject.org/pub/fedora/linux/'
> >                        'releases/29/Everything/armhfp/os/images/pxeboot/vmlinuz')
> > diff --git a/tests/acceptance/linux_ssh_mips_malta.py b/tests/acceptance/linux_ssh_mips_malta.py
> > index aafb0c39f6..ae70b658e0 100644
> > --- a/tests/acceptance/linux_ssh_mips_malta.py
> > +++ b/tests/acceptance/linux_ssh_mips_malta.py
> > @@ -208,6 +208,7 @@ class LinuxSSH(Test):
> >          :avocado: tags=machine:malta
> >          :avocado: tags=endian:big
> >          :avocado: tags=device:pcnet32
> > +        :avocado: tags=flaky
> >          """
> >          kernel_url = ('https://people.debian.org/~aurel32/qemu/mips/'
> >                        'vmlinux-3.2.0-4-5kc-malta')
> > @@ -222,6 +223,7 @@ class LinuxSSH(Test):
> >          :avocado: tags=machine:malta
> >          :avocado: tags=endian:little
> >          :avocado: tags=device:pcnet32
> > +        :avocado: tags=flaky
> >          """
> >          kernel_url = ('https://people.debian.org/~aurel32/qemu/mipsel/'
> >                        'vmlinux-3.2.0-4-5kc-malta')
> > diff --git a/tests/requirements.txt b/tests/requirements.txt
> > index 3ae0e29ad7..58d63d171f 100644
> > --- a/tests/requirements.txt
> > +++ b/tests/requirements.txt
> > @@ -1,5 +1,5 @@
> >  # Add Python module requirements, one per line, to be installed
> >  # in the tests/venv Python virtual environment. For more info,
> >  # refer to: https://pip.pypa.io/en/stable/user_guide/#id1
> > -avocado-framework==68.0
> > +avocado-framework==69.1
> >  paramiko
> > 
>

next prev parent reply	other threads:[~2019-06-21 14:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-21  6:09 [Qemu-devel] [PATCH 0/2] Acceptance tests: exclude "flaky" tests and introduce SPICE test Cleber Rosa
2019-06-21  6:09 ` [Qemu-devel] [PATCH 1/2] Acceptance tests: exclude "flaky" tests Cleber Rosa
2019-06-21  7:03   ` Philippe Mathieu-Daudé
2019-06-21 14:38     ` Cleber Rosa [this message]
2019-06-28 20:43       ` Wainer dos Santos Moschetta
2019-06-30 17:51         ` Cleber Rosa
2019-07-05 19:01           ` Wainer dos Santos Moschetta
2019-06-21  6:09 ` [Qemu-devel] [PATCH 2/2] Acceptance tests: add SPICE protocol check Cleber Rosa
2019-06-28 20:54   ` Wainer dos Santos Moschetta
2019-06-30 18:01     ` Cleber Rosa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190621143816.GA24282@localhost.localdomain \
    --to=crosa@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=arikalo@wavecomp.com \
    --cc=aurelien@aurel32.net \
    --cc=ehabkost@redhat.com \
    --cc=fam@euphon.net \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wainersm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.