* [RFC PATCH] scripts/ci: add gitlab-failure-analysis script
@ 2025-09-08 21:18 Alex Bennée
2025-09-09 4:37 ` Thomas Huth
0 siblings, 1 reply; 5+ messages in thread
From: Alex Bennée @ 2025-09-08 21:18 UTC
To: qemu-devel
Cc: berrange, Alex Bennée, Philippe Mathieu-Daudé,
Thomas Huth
This is a script designed to collect data from multiple pipelines and
analyse the failure modes they have.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
scripts/ci/gitlab-failure-analysis | 65 ++++++++++++++++++++++++++++++
1 file changed, 65 insertions(+)
create mode 100755 scripts/ci/gitlab-failure-analysis
diff --git a/scripts/ci/gitlab-failure-analysis b/scripts/ci/gitlab-failure-analysis
new file mode 100755
index 00000000000..195db63a0c0
--- /dev/null
+++ b/scripts/ci/gitlab-failure-analysis
@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+#
+# A script to analyse failures in the gitlab pipelines. It requires an
+# API key from gitlab with the following permissions:
+# - api
+# - read_repository
+# - read_user
+#
+
+import argparse
+import gitlab
+import os
+
+#
+# Arguments
+#
+parser = argparse.ArgumentParser(description="Analyse failed GitLab CI runs.")
+
+parser.add_argument("--gitlab",
+                    default="https://gitlab.com",
+                    help="GitLab instance URL (default: https://gitlab.com).")
+parser.add_argument("--id", default=11167699,
+                    type=int,
+                    help="GitLab project id (default: 11167699 for qemu-project/qemu)")
+parser.add_argument("--token",
+                    default=os.getenv("GITLAB_TOKEN"),
+                    help="Your personal access token with 'api' scope.")
+parser.add_argument("--branch",
+                    default="staging",
+                    help="The name of the branch (default: 'staging')")
+parser.add_argument("--count", type=int,
+                    default=3,
+                    help="The number of failed runs to fetch.")
+
+
+if __name__ == "__main__":
+    args = parser.parse_args()
+
+    gl = gitlab.Gitlab(url=args.gitlab, private_token=args.token)
+    project = gl.projects.get(args.id)
+
+    # Use an iterator to fetch at most --count failed pipelines
+    pipe_iter = project.pipelines.list(iterator=True,
+                                       status="failed",
+                                       ref=args.branch)
+    pipe_failed = [p for _, p in zip(range(args.count), pipe_iter)]
+
+    # Check each failed pipeline
+    for p in pipe_failed:
+
+        jobs = p.jobs.list(get_all=True)
+        failed_jobs = [j for j in jobs if j.status == "failed"]
+        skipped_jobs = [j for j in jobs if j.status == "skipped"]
+        manual_jobs = [j for j in jobs if j.status == "manual"]
+
+        test_report = p.test_report.get()
+
+        print(f"Failed pipeline {p.id}, total jobs {len(jobs)}, "
+              f"skipped {len(skipped_jobs)}, "
+              f"failed {len(failed_jobs)}, "
+              f"{test_report.total_count} tests, "
+              f"{test_report.failed_count} failed tests")
+
+        for j in failed_jobs:
+            print(f" Failed {j.id}, {j.name}, {j.web_url}")
--
2.47.3
* Re: [RFC PATCH] scripts/ci: add gitlab-failure-analysis script
2025-09-08 21:18 [RFC PATCH] scripts/ci: add gitlab-failure-analysis script Alex Bennée
@ 2025-09-09 4:37 ` Thomas Huth
2025-09-09 8:38 ` Alex Bennée
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Huth @ 2025-09-09 4:37 UTC
To: Alex Bennée, qemu-devel; +Cc: berrange, Philippe Mathieu-Daudé
On 08/09/2025 23.18, Alex Bennée wrote:
> This is a script designed to collect data from multiple pipelines and
> analyse the failure modes they have.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> scripts/ci/gitlab-failure-analysis | 65 ++++++++++++++++++++++++++++++
> 1 file changed, 65 insertions(+)
> create mode 100755 scripts/ci/gitlab-failure-analysis
You already get a nice overview by visiting a page like
https://gitlab.com/qemu-project/qemu/-/pipelines/2019002986 ... what's the
advantage of this script?
Thomas
* Re: [RFC PATCH] scripts/ci: add gitlab-failure-analysis script
2025-09-09 4:37 ` Thomas Huth
@ 2025-09-09 8:38 ` Alex Bennée
2025-09-09 9:00 ` Peter Maydell
0 siblings, 1 reply; 5+ messages in thread
From: Alex Bennée @ 2025-09-09 8:38 UTC
To: Thomas Huth; +Cc: qemu-devel, berrange, Philippe Mathieu-Daudé
Thomas Huth <thuth@redhat.com> writes:
> On 08/09/2025 23.18, Alex Bennée wrote:
>> This is a script designed to collect data from multiple pipelines and
>> analyse the failure modes they have.
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>> scripts/ci/gitlab-failure-analysis | 65 ++++++++++++++++++++++++++++++
>> 1 file changed, 65 insertions(+)
>> create mode 100755 scripts/ci/gitlab-failure-analysis
>
> You already get a nice overview by visiting a page like
> https://gitlab.com/qemu-project/qemu/-/pipelines/2019002986 ... what's
> the advantage of this script?
Not having to click every link when I want to see what the pattern of
failures is and what might be a candidate for making flaky.
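As a rough illustration of spotting such a pattern, here is a minimal sketch that counts how often each job name shows up among the failed jobs of several pipelines. It is not part of the posted patch: the function name rank_failing_jobs and the (pipeline_id, job_name) tuple shape are assumptions made for the example, and the input would have to be collected from the failed_jobs lists the script already builds.

```python
from collections import Counter


def rank_failing_jobs(failed_jobs):
    """Rank job names by how many times they failed.

    failed_jobs is an iterable of (pipeline_id, job_name) tuples; this
    shape is an assumption for the sketch, not something the patch emits.
    Returns (job_name, count) pairs, most frequent first, ties by name.
    """
    counts = Counter(name for _pipeline_id, name in failed_jobs)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Jobs that sit at the top of the ranking across unrelated pipelines would be the first ones to inspect by hand.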
>
> Thomas
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
* Re: [RFC PATCH] scripts/ci: add gitlab-failure-analysis script
2025-09-09 8:38 ` Alex Bennée
@ 2025-09-09 9:00 ` Peter Maydell
2025-09-09 9:12 ` Daniel P. Berrangé
0 siblings, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2025-09-09 9:00 UTC
To: Alex Bennée
Cc: Thomas Huth, qemu-devel, berrange, Philippe Mathieu-Daudé
On Tue, 9 Sept 2025 at 09:39, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Thomas Huth <thuth@redhat.com> writes:
>
> > On 08/09/2025 23.18, Alex Bennée wrote:
> >> This is a script designed to collect data from multiple pipelines and
> >> analyse the failure modes they have.
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> ---
> >> scripts/ci/gitlab-failure-analysis | 65 ++++++++++++++++++++++++++++++
> >> 1 file changed, 65 insertions(+)
> >> create mode 100755 scripts/ci/gitlab-failure-analysis
> >
> > You already get a nice overview by visiting a page like
> > https://gitlab.com/qemu-project/qemu/-/pipelines/2019002986 ... what's
> > the advantage of this script?
>
> Not having to click every link when I want to see what the pattern of
> failures is and what might be a candidate for making flaky.
What I would like for finding flaky tests is to find every
case where:
* a job failed on commit hash X
* we also have the same job succeeding on the same commit X
Those are the flaky tests, where we hit retry and it just
passed the second time, and it rules out the cases where
we had a genuine "job failed because the code being tested
for merge had a problem".
When we find those jobs that only failed because of a flaky
test then we can analyse their logs to identify what the
exact failures were.
Can we find those with this script ? (You can't do it with
the gitlab web UI, whose search and filter capabilities
are extremely limited.)
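The two conditions above could be checked roughly as follows once job records are available. This is only a sketch under stated assumptions: the function name find_flaky and the (sha, job_name, status) record shape are invented for illustration, and the status strings "failed" and "success" are the ones GitLab reports for jobs. I believe the GitLab jobs API only returns earlier, retried attempts when its include_retried parameter is set, so that is worth verifying before relying on the result.

```python
from collections import defaultdict


def find_flaky(records):
    """Return (sha, job_name) pairs that both failed and succeeded.

    records is an iterable of (sha, job_name, status) tuples; the shape
    is an assumption for this sketch. A pair is reported only when the
    same job has at least one "failed" and one "success" result on the
    same commit, which excludes jobs that failed consistently.
    """
    seen = defaultdict(set)
    for sha, name, status in records:
        seen[(sha, name)].add(status)
    return sorted(key for key, statuses in seen.items()
                  if {"failed", "success"} <= statuses)
```

A job that failed on a commit and never passed on it is left out, which matches the "genuine failure" case described above.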
thanks
-- PMM
* Re: [RFC PATCH] scripts/ci: add gitlab-failure-analysis script
2025-09-09 9:00 ` Peter Maydell
@ 2025-09-09 9:12 ` Daniel P. Berrangé
0 siblings, 0 replies; 5+ messages in thread
From: Daniel P. Berrangé @ 2025-09-09 9:12 UTC
To: Peter Maydell
Cc: Alex Bennée, Thomas Huth, qemu-devel,
Philippe Mathieu-Daudé
On Tue, Sep 09, 2025 at 10:00:05AM +0100, Peter Maydell wrote:
> On Tue, 9 Sept 2025 at 09:39, Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> > Thomas Huth <thuth@redhat.com> writes:
> >
> > > On 08/09/2025 23.18, Alex Bennée wrote:
> > >> This is a script designed to collect data from multiple pipelines and
> > >> analyse the failure modes they have.
> > >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > >> ---
> > >> scripts/ci/gitlab-failure-analysis | 65 ++++++++++++++++++++++++++++++
> > >> 1 file changed, 65 insertions(+)
> > >> create mode 100755 scripts/ci/gitlab-failure-analysis
> > >
> > > You already get a nice overview by visiting a page like
> > > https://gitlab.com/qemu-project/qemu/-/pipelines/2019002986 ... what's
> > > the advantage of this script?
> >
> > Not having to click every link when I want to see what the pattern of
> > failures is and what might be a candidate for making flaky.
>
> What I would like for finding flaky tests is to find every
> case where:
> * a job failed on commit hash X
> * we also have the same job succeeding on the same commit X
>
> Those are the flaky tests, where we hit retry and it just
> passed the second time, and it rules out the cases where
> we had a genuine "job failed because the code being tested
> for merge had a problem".
>
> When we find those jobs that only failed because of a flaky
> test then we can analyse their logs to identify what the
> exact failures were.
>
> Can we find those with this script ? (You can't do it with
> the gitlab web UI, whose search and filter capabilities
> are extremely limited.)
Downloading data from the GitLab API is painfully slow, so it is not
something you want to do regularly/repeatedly.
If we can have the script download the data and save it locally,
we could then do something like populate an SQLite DB with pipeline
results, which we can efficiently query to extract failure patterns.
I guess this script at least starts us moving in that direction by
giving us the framework to fetch data, and build on that...
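To make the idea concrete, here is a minimal sketch of such a local cache using Python's standard sqlite3 module. Nothing here exists in the posted script: the jobs table, its columns, and the helper names open_db, store_jobs and flaky_jobs are all assumptions chosen for illustration, and the rows would have to be filled from whatever the script fetches.

```python
import sqlite3

# Hypothetical schema: one row per job attempt.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id      INTEGER PRIMARY KEY,
    pipeline_id INTEGER NOT NULL,
    sha         TEXT NOT NULL,
    name        TEXT NOT NULL,
    status      TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the cache database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_jobs(conn, rows):
    """Insert (job_id, pipeline_id, sha, name, status) rows.

    Re-storing a known job_id overwrites the old row, so a repeated
    download does not create duplicates.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO jobs VALUES (?, ?, ?, ?, ?)", rows)


def flaky_jobs(conn):
    """Return (sha, name) pairs with both a failed and a successful run."""
    query = """
        SELECT sha, name FROM jobs
        GROUP BY sha, name
        HAVING SUM(status = 'failed') > 0
           AND SUM(status = 'success') > 0
        ORDER BY sha, name
    """
    return conn.execute(query).fetchall()
```

Passing a file path to open_db instead of the in-memory default would keep the results between runs, so the slow download only has to happen for pipelines that are not cached yet.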
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|