From: Thomas Huth <thuth@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Andrew Jones" <drjones@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
qemu-devel <qemu-devel@nongnu.org>,
"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
"Cleber Rosa" <crosa@redhat.com>,
"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: Serious doubts about Gitlab CI
Date: Tue, 30 Mar 2021 13:55:48 +0200
Message-ID: <04e5e251-7a09-dcf6-82ad-31bf696bc248@redhat.com>
In-Reply-To: <YGMJSoIGa5VoVDB1@redhat.com>
On 30/03/2021 13.19, Daniel P. Berrangé wrote:
> On Mon, Mar 29, 2021 at 03:10:36PM +0100, Stefan Hajnoczi wrote:
>> Hi,
>> I wanted to follow up with a summary of the CI jobs:
>>
>> 1. Containers & Containers Layer2 - ~3 minutes/job x 39 jobs
>> 2. Builds - ~50 minutes/job x 61 jobs
>> 3. Tests - ~12 minutes/job x 20 jobs
>> 4. Deploy - 52 minutes x 1 job
I hope that 52 was just a typo ... ?
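(Rough back-of-the-envelope math from those numbers, by the way: 39 * 3 +
61 * 50 + 20 * 12 is roughly 3400 compute minutes per pipeline, so the build
stage alone accounts for about 90% of the total ... that's clearly where
trimming pays off most.)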
> I think one challenge we have with our incremental approach is that
> we're not really taking into account the relative importance of the
> different build scenarios, and often don't look at the big picture
> of what the new job adds in terms of quality, compared to existing
> jobs.
>
> e.g. consider we have:
>
> build-system-alpine:
> build-system-ubuntu:
> build-system-debian:
> build-system-fedora:
> build-system-centos:
> build-system-opensuse:
I guess we could go through that list of jobs and remove the duplicated
target CPUs, e.g. it should be enough to test x86_64-softmmu only once.
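Something along these lines could work (just a sketch; I'm assuming the build
jobs keep using the TARGETS variable of the shared template, and the concrete
target split below is made up for illustration):

  build-system-fedora:
    variables:
      # each distro job covers a disjoint set of targets instead of
      # everybody rebuilding x86_64-softmmu over and over
      TARGETS: x86_64-softmmu aarch64-softmmu s390x-softmmu

  build-system-debian:
    variables:
      TARGETS: riscv64-softmmu ppc64-softmmu mips64el-softmmu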
> build-trace-multi-user:
> build-trace-ftrace-system:
> build-trace-ust-system:
>
> I'd question whether we really need any of those 'build-trace'
> jobs. Instead, we could have build-system-ubuntu pass
> --enable-trace-backends=log,simple,syslog, build-system-debian
> pass --enable-trace-backends=ust and build-system-fedora
> pass --enable-trace-backends=ftrace, etc.
I recently had the very same idea already:
https://gitlab.com/qemu-project/qemu/-/commit/65aff82076a9bbfdf7
:-)
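For the build-system-* jobs this basically boils down to something like the
following (sketch only; assuming the template still passes CONFIGURE_ARGS
through to ./configure):

  build-system-ubuntu:
    variables:
      CONFIGURE_ARGS: --enable-trace-backends=log,simple,syslog

  build-system-debian:
    variables:
      CONFIGURE_ARGS: --enable-trace-backends=ust

... and then the dedicated build-trace-* jobs could simply be dropped.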
> Another example is that we test builds on centos7 with
> three different combos of crypto backend settings. This was
> to exercise bugs we've seen in old crypto packages in RHEL-7
> but in reality, it is probably overkill, because downstream
> RHEL-7 only cares about one specific combination.
Care to send a patch? Or shall we just wait one more month and then remove
these jobs (since we won't support RHEL-7 after QEMU 6.0 anymore)?
> We don't really have a clearly defined plan to identify what
> the most important things are in our testing coverage, so we
> tend to accept anything without questioning its value add.
> This really feeds back into the idea I've brought up many
> times in the past, that we need to better define what we aim
> to support in QEMU and its quality level, which will influence
> which scenarios we care about testing.
But code that we have in the repository should get at least some basic test
coverage, otherwise it bitrots soon ... so maybe it's rather the other old
problem that we keep struggling with: we should deprecate more code and
remove it if nobody cares about it...
>> Traditionally ccache (https://ccache.dev/) was used to detect
>> recompilation of the same compiler input files. This is trickier to do
>> in GitLab CI since it would be necessary to share and update a cache,
>> potentially between untrusted users. Unfortunately this shifts the
>> bottleneck from CPU to network in a CI-as-a-Service environment since
>> the cached build output needs to be accessed by the linker on the CI
>> runner but is stored remotely.
>
> Our docker containers install ccache already and I could have sworn
> that we use that in gitlab, but now I'm not so sure. We're only
> saving the "build/" directory as an artifact between jobs, and I'm
> not sure that directory holds the ccache cache.
AFAIK we never really enabled ccache in the GitLab CI, only in Travis.
>> This is as far as I've gotten with thinking about CI efficiency. Do you
>> think these optimizations are worth investigating or should we keep it
>> simple and just disable many builds by default?
>
> ccache is a no-brainer and assuming it isn't already working with
> our gitlab jobs, we must fix that asap.
I've found some nice instructions here:
https://gould.cx/ted/blog/2017/06/10/ccache-for-Gitlab-CI/
... and just kicked off a build with these modifications; let's see how it
goes...
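In case somebody else wants to play with it: the gist of those instructions is
just a per-job cache directory plus a few ccache environment variables, roughly
like this (unverified sketch; the template name and the compiler-wrapper path
are my assumptions and differ per distro):

  .native_build_job_template:        # or whatever our shared build template is called
    cache:
      key: "$CI_JOB_NAME"
      paths:
        - ccache/
    variables:
      CCACHE_DIR: $CI_PROJECT_DIR/ccache
      CCACHE_BASEDIR: $CI_PROJECT_DIR
      CCACHE_MAXSIZE: 500M
    before_script:
      - export PATH="/usr/lib/ccache:$PATH"   # use the ccache compiler wrappers
      - ccache --zero-stats
    after_script:
      - ccache --show-stats                   # see whether we actually get cache hits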
> Aside from optimizing CI, we should consider whether there's more we
> can do to optimize the build process itself. We've done a lot of work,
> but there's still plenty of stuff we build multiple times, once for
> each target. Perhaps there's scope for cutting this down in some manner?
Right, I think we should also work more towards consolidating the QEMU
binaries, to avoid having to build so many target binaries again and
again. E.g.:
- Do we still need to support 32-bit hosts? If not, we could
  finally get rid of qemu-system-i386, qemu-system-ppc,
  qemu-system-arm, etc. and just provide the 64-bit variants
- Could we maybe somehow unify the targets that have both big-
  and little-endian versions? Then we could merge e.g.
  qemu-system-microblaze and qemu-system-microblazeel etc.
- Or could we maybe even build a unified qemu-system binary that
  contains all target CPUs? That would also allow e.g. machines
  with an x86 main CPU and an ARM-based board management
  controller...
> I'm unclear how many jobs in CI are building submodules, but if there's
> more scope for using the pre-built distro packages, that's going to
> be beneficial for build time.
Since the build system has been converted to meson, I think the configure
script prefers to use the submodules instead of the distro packages. I've
tried to remedy this a little bit here:
https://gitlab.com/qemu-project/qemu/-/commit/db0108d5d846e9a8
... but new jobs of course will use the submodules again if the author is
not careful.
I think we should tweak that behavior again to prefer the system versions of
capstone, slirp and dtc if they are available. Paolo, what do you think?
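One way to make sure that keeps working (just a sketch, the job and variable
names are made up and would need to be checked against our current templates)
would be a build job that wipes the bundled copies up front, so it can only
succeed against the distro packages:

  build-system-fedora-syslibs:       # hypothetical job name
    extends: .native_build_job_template
    variables:
      IMAGE: fedora                  # assumes the -devel packages are in that image
    script:
      - rm -rf capstone slirp dtc    # drop the bundled submodule copies
      - mkdir build && cd build
      - ../configure --target-list=x86_64-softmmu
      - make -j"$(nproc)"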
Also, I wonder whether we could maybe even get rid of the capstone and slirp
submodules in QEMU now ... these libraries should be available in most distros
by now, and otherwise people could still install them manually instead.
Thomas