* [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Thomas Huth @ 2021-11-16 16:33 UTC
  To: qemu-devel, Philippe Mathieu-Daudé, Alex Bennée
  Cc: Willian Rampazzo, Daniel P. Berrangé, Wainer dos Santos Moschetta

The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
be scheduled, so while the build test itself finishes within 60 minutes,
the total run time of the jobs can be longer due to this waiting time.
Thus let's increase the timeout on the gitlab side a little bit, so
that these jobs are not marked as failing just because of the delay.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 .gitlab-ci.d/cirrus.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
index e7b25e7427..22d42585e4 100644
--- a/.gitlab-ci.d/cirrus.yml
+++ b/.gitlab-ci.d/cirrus.yml
@@ -14,6 +14,7 @@
   stage: build
   image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
   needs: []
+  timeout: 80m
   allow_failure: true
   script:
     - source .gitlab-ci.d/cirrus/$NAME.vars
-- 
2.27.0

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Daniel P. Berrangé @ 2021-11-16 16:49 UTC
  To: Thomas Huth
  Cc: Willian Rampazzo, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta, Philippe Mathieu-Daudé

On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
> be scheduled, so while the build test itself finishes within 60 minutes,
> the total run time of the jobs can be longer due to this waiting time.
> Thus let's increase the timeout on the gitlab side a little bit, so
> that these jobs are not marked as failing just because of the delay.

On a successful pipeline I see

  freebsd-11 - 28 minutes
  freebsd-12 - 57 minutes
  macos      - 30 minutes

We know cirrus allows 2 concurrent jobs, so from that I infer
that the freebsd-12 job was queued for ~30 minutes waiting for
either the freebsd-11 or macos job to finish, and then it
ran in 30 minutes, giving the ~60 minute total.

That's too close to the 60 minute gitlab default job timeout
for comfort - it can easily slip over 60 minutes by just a
small amount.

80 minutes will certainly help in the case where we
randomly take a little longer than 30 minutes to build,
and have 1 of the 3 jobs queued.

When we're running jobs on both master + staging, we can
have 2 jobs running and 4 more queued - 2 of those queued
might just finish in time, but 2 will definitely fail.
My patch will cut these extra jobs on master, so in the common
case we only ever get 1 queued, which should work well in
combo with your patch here. That should be good enough
for the qemu-project namespace, unless someone is triggering
pipelines for stable branch staging at the same time as
the master branch staging.

If we do want to worry about more than 2 queued jobs
again for that reason, we might consider putting
it up to 100 minutes. That would give us enough slack to
have 4 queued jobs behind two running jobs and have
them all succeed.

> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  .gitlab-ci.d/cirrus.yml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
> index e7b25e7427..22d42585e4 100644
> --- a/.gitlab-ci.d/cirrus.yml
> +++ b/.gitlab-ci.d/cirrus.yml
> @@ -14,6 +14,7 @@
>    stage: build
>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>    needs: []
> +  timeout: 80m
>    allow_failure: true
>    script:
>      - source .gitlab-ci.d/cirrus/$NAME.vars

Whether 80 or 100 minutes, consider it

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
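
[Editorial sketch, not part of the original message.] The queueing estimate in the message above can be checked with a few lines of Python. This is only a rough model under stated assumptions: every job is submitted at the same moment, every build takes exactly the same time (30 minutes by default, as inferred above), and Cirrus runs 2 jobs concurrently. The function names `completion_times` and `timed_out` are hypothetical and do not come from QEMU, GitLab or Cirrus.

```python
import heapq


def completion_times(n_jobs, build_minutes=30, concurrency=2):
    """Minutes from submission until each job finishes (queue wait + build)."""
    # Min-heap holding the time at which each concurrent slot becomes free.
    slots = [0] * concurrency
    heapq.heapify(slots)
    finished = []
    for _ in range(n_jobs):
        start = heapq.heappop(slots)   # earliest free slot
        end = start + build_minutes    # idealised: fixed build duration
        heapq.heappush(slots, end)
        finished.append(end)
    return finished


def timed_out(n_jobs, timeout_minutes, **kwargs):
    """Completion times that exceed the GitLab-side timeout."""
    return [t for t in completion_times(n_jobs, **kwargs) if t > timeout_minutes]


if __name__ == "__main__":
    for timeout in (60, 80, 100):
        print(timeout, "min timeout, 6 jobs, failing:", timed_out(6, timeout))
```

In this model three jobs finish at 30, 30 and 60 minutes, so the third sits exactly on the 60 minute default and fails as soon as a build runs slightly long. Six jobs finish at 30, 30, 60, 60, 90 and 90 minutes, so the last two still fail with an 80 minute timeout and pass with 100 minutes.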

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Philippe Mathieu-Daudé @ 2021-11-16 17:09 UTC
  To: Daniel P. Berrangé, Thomas Huth, Peter Maydell, Richard Henderson
  Cc: Willian Rampazzo, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta

On 11/16/21 17:49, Daniel P. Berrangé wrote:
> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>> be scheduled, so while the build test itself finishes within 60 minutes,
>> the total run time of the jobs can be longer due to this waiting time.
>> Thus let's increase the timeout on the gitlab side a little bit, so
>> that these jobs are not marked as failing just because of the delay.
>
> On a successful pipeline I see
>
>   freebsd-11 - 28 minutes
>   freebsd-12 - 57 minutes
>   macos - 30 minutes
>
> We know cirrus allows 2 concurrent jobs, so from that I infer
> that the freebsd-12 job was queued for ~30 minutes waiting for
> either the freebsd-11 or macos job to finish, and then it
> ran in 30 minutes, giving the ~60 minute total.
>
> That's too close to the 60 minute gitlab default job timeout
> for comfort - it can easily slip over 60 minutes by just a
> small amount.
>
> 80 minutes will certainly help in the case where we
> randomly take a little longer than 30 minutes to build,
> and have 1 of the 3 jobs queued.
>
> When we're running jobs on both master + staging, we can
> have 2 jobs running and 4 more queued - 2 of those queued
> might just finish in time, but 2 will definitely fail.
> My patch will cut these extra jobs on master, so in common
> case we only ever get 1 queued, which should work well in
> combo with your patch here. That should be good enough
> for the qemu-project namespace, unless someone is triggering
> pipelines for stable branch staging at the same time as
> the master branch staging.
>
> If we do want to worry about more than 2 queued jobs
> again for that reason, we might consider putting
> it upto 100 minutes. That would give us enough slack to
> have 4 queued jobs behind two running jobs and have
> them all succeed
>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  .gitlab-ci.d/cirrus.yml | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>> index e7b25e7427..22d42585e4 100644
>> --- a/.gitlab-ci.d/cirrus.yml
>> +++ b/.gitlab-ci.d/cirrus.yml
>> @@ -14,6 +14,7 @@
>>    stage: build
>>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>    needs: []
>> +  timeout: 80m
>>    allow_failure: true
>>    script:
>>      - source .gitlab-ci.d/cirrus/$NAME.vars
>
> Whether 80 or 100 minute, consider it
>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

This pipeline took 1h51m09s:
https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
But Richard restarted unstable jobs, which probably added time
to the total.

IIRC from a maintainer perspective 1h15 is the upper limit.
80m fits, 100m is over.

Up to the project maintainers (personally I don't have any
objection, in particular if this reduces the failure rate).

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Thomas Huth @ 2021-11-16 17:22 UTC
  To: Philippe Mathieu-Daudé, Daniel P. Berrangé, Peter Maydell, Richard Henderson
  Cc: Willian Rampazzo, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta

On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
> On 11/16/21 17:49, Daniel P. Berrangé wrote:
>> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>>> be scheduled, so while the build test itself finishes within 60 minutes,
>>> the total run time of the jobs can be longer due to this waiting time.
>>> Thus let's increase the timeout on the gitlab side a little bit, so
>>> that these jobs are not marked as failing just because of the delay.
...>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>>> index e7b25e7427..22d42585e4 100644
>>> --- a/.gitlab-ci.d/cirrus.yml
>>> +++ b/.gitlab-ci.d/cirrus.yml
>>> @@ -14,6 +14,7 @@
>>>    stage: build
>>>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>>    needs: []
>>> +  timeout: 80m
>>>    allow_failure: true
>>>    script:
>>>      - source .gitlab-ci.d/cirrus/$NAME.vars
>>
>> Whether 80 or 100 minute, consider it
>>
>> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>
> This pipeline took 1h51m09s:
> https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
> But Richard restarted unstable jobs, which probably added time
> to the total.
>
> IIRC from a maintainer perspective 1h15 is the upper limit.
> 80m fits, 100m is over.

I think I agree ... I normally don't want to wait much longer than
one hour, so 100 minutes feels too long already. We already have some
70m timeouts in other jobs, and one 80 minute timeout in
.gitlab-ci.d/crossbuild-template.yml, so I'd say 80 minutes is really
the upper bound that we should use.

> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Thanks to all for your reviews!

 Thomas

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Richard Henderson @ 2021-11-16 17:36 UTC
  To: Thomas Huth, Philippe Mathieu-Daudé, Daniel P. Berrangé, Peter Maydell
  Cc: Willian Rampazzo, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta

On 11/16/21 6:22 PM, Thomas Huth wrote:
> On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
>> On 11/16/21 17:49, Daniel P. Berrangé wrote:
>>> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>>>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>>>> be scheduled, so while the build test itself finishes within 60 minutes,
>>>> the total run time of the jobs can be longer due to this waiting time.
>>>> Thus let's increase the timeout on the gitlab side a little bit, so
>>>> that these jobs are not marked as failing just because of the delay.
> ...>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>>>> index e7b25e7427..22d42585e4 100644
>>>> --- a/.gitlab-ci.d/cirrus.yml
>>>> +++ b/.gitlab-ci.d/cirrus.yml
>>>> @@ -14,6 +14,7 @@
>>>>    stage: build
>>>>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>>>    needs: []
>>>> +  timeout: 80m
>>>>    allow_failure: true
>>>>    script:
>>>>      - source .gitlab-ci.d/cirrus/$NAME.vars
>>>
>>> Whether 80 or 100 minute, consider it
>>>
>>> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>>
>> This pipeline took 1h51m09s:
>> https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
>> But Richard restarted unstable jobs, which probably added time
>> to the total.
>>
>> IIRC from a maintainer perspective 1h15 is the upper limit.
>> 80m fits, 100m is over.
>
> I think I agree ... I normally don't want to wait more than a little bit more than one
> hour, so 100 minutes feels too long already. We already have some 70m timeouts in other
> jobs, and one 80 minute timeout in .gitlab-ci.d/crossbuild-template.yml, so I'd say 80
> minutes are really the upper boundary that we should use.

We are also talking apples and oranges:
Gitlab timeouts are on the amount of time the job runs.
Cirrus timeouts appear to be on the amount of time the job is queued.

If cirrus would just not start accounting until the thing runs we'd be fine.

r~

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Daniel P. Berrangé @ 2021-11-16 18:20 UTC
  To: Richard Henderson
  Cc: Peter Maydell, Thomas Huth, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta, Willian Rampazzo, Philippe Mathieu-Daudé

On Tue, Nov 16, 2021 at 06:36:50PM +0100, Richard Henderson wrote:
> On 11/16/21 6:22 PM, Thomas Huth wrote:
> > On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
> > > On 11/16/21 17:49, Daniel P. Berrangé wrote:
> > > > On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
> > > > > The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
> > > > > be scheduled, so while the build test itself finishes within 60 minutes,
> > > > > the total run time of the jobs can be longer due to this waiting time.
> > > > > Thus let's increase the timeout on the gitlab side a little bit, so
> > > > > that these jobs are not marked as failing just because of the delay.
> > ...>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
> > > > > index e7b25e7427..22d42585e4 100644
> > > > > --- a/.gitlab-ci.d/cirrus.yml
> > > > > +++ b/.gitlab-ci.d/cirrus.yml
> > > > > @@ -14,6 +14,7 @@
> > > > >    stage: build
> > > > >    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
> > > > >    needs: []
> > > > > +  timeout: 80m
> > > > >    allow_failure: true
> > > > >    script:
> > > > >      - source .gitlab-ci.d/cirrus/$NAME.vars
> > > >
> > > > Whether 80 or 100 minute, consider it
> > > >
> > > > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> > >
> > > This pipeline took 1h51m09s:
> > > https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
> > > But Richard restarted unstable jobs, which probably added time
> > > to the total.
> > >
> > > IIRC from a maintainer perspective 1h15 is the upper limit.
> > > 80m fits, 100m is over.
> >
> > I think I agree ... I normally don't want to wait more than a little bit
> > more than one hour, so 100 minutes feels too long already. We already
> > have some 70m timeouts in other jobs, and one 80 minute timeout in
> > .gitlab-ci.d/crossbuild-template.yml, so I'd say 80 minutes are really
> > the upper boundary that we should use.
>
> We are also talking apples and oranges:
> Gitlab timeouts are on the amount of time the job runs.
> Cirrus timeouts appear to be on the amount of time the job is queued.
>
> If cirrus would just not start accounting until the thing runs we'd be fine.

Unfortunately it isn't that easy. Our cirrus CI jobs are launched using
the cirrus-run tool, from a gitlab job. The timeouts we're usually
hitting are from the gitlab job which is sitting around waiting for
the cirrus job it launched to finish, so it can report back stats.

Cirrus CI does itself have a job timeout, but I'm not aware of us
hitting that typically, unless I'm misinterpreting something.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
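
[Editorial sketch, not part of the original message.] The point made in the message above, that the GitLab job's clock keeps running while cirrus-run waits for the Cirrus job, can be illustrated with a small Python model. Two things here are assumptions rather than facts from the thread: that the Cirrus-side limit only counts time spent executing, and the 60 minute value used for it. The function name `first_limit_hit` is hypothetical.

```python
def first_limit_hit(queue_minutes, run_minutes,
                    gitlab_timeout=80, cirrus_timeout=60):
    """Return "gitlab", "cirrus" or None, whichever limit trips first.

    All times are minutes measured from the start of the GitLab job.
    """
    done_at = queue_minutes + run_minutes
    # Assumption: the Cirrus limit starts counting only once the job runs.
    cirrus_at = queue_minutes + cirrus_timeout
    # The GitLab job is alive for the whole wait, so queue time counts
    # against its timeout.
    if gitlab_timeout < min(done_at, cirrus_at):
        return "gitlab"
    if cirrus_at < done_at:
        return "cirrus"
    return None


if __name__ == "__main__":
    # 35 minutes queued plus a 30 minute build, with the old 60 minute default
    print(first_limit_hit(35, 30, gitlab_timeout=60))
    # Same job with the 80 minute timeout from the patch
    print(first_limit_hit(35, 30))
```

Under these assumptions a job that waits 35 minutes and then builds for 30 trips the GitLab limit at the 60 minute default but completes within 80 minutes, while the Cirrus-side limit never comes into play for a 30 minute build.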

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Thomas Huth @ 2021-11-17 7:03 UTC
  To: Daniel P. Berrangé, Richard Henderson
  Cc: Peter Maydell, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta, Willian Rampazzo, Philippe Mathieu-Daudé

On 16/11/2021 19.20, Daniel P. Berrangé wrote:
> On Tue, Nov 16, 2021 at 06:36:50PM +0100, Richard Henderson wrote:
>> On 11/16/21 6:22 PM, Thomas Huth wrote:
>>> On 16/11/2021 18.09, Philippe Mathieu-Daudé wrote:
>>>> On 11/16/21 17:49, Daniel P. Berrangé wrote:
>>>>> On Tue, Nov 16, 2021 at 05:33:09PM +0100, Thomas Huth wrote:
>>>>>> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
>>>>>> be scheduled, so while the build test itself finishes within 60 minutes,
>>>>>> the total run time of the jobs can be longer due to this waiting time.
>>>>>> Thus let's increase the timeout on the gitlab side a little bit, so
>>>>>> that these jobs are not marked as failing just because of the delay.
>>> ...>>> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
>>>>>> index e7b25e7427..22d42585e4 100644
>>>>>> --- a/.gitlab-ci.d/cirrus.yml
>>>>>> +++ b/.gitlab-ci.d/cirrus.yml
>>>>>> @@ -14,6 +14,7 @@
>>>>>>    stage: build
>>>>>>    image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master
>>>>>>    needs: []
>>>>>> +  timeout: 80m
>>>>>>    allow_failure: true
>>>>>>    script:
>>>>>>      - source .gitlab-ci.d/cirrus/$NAME.vars
>>>>>
>>>>> Whether 80 or 100 minute, consider it
>>>>>
>>>>> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>>>>
>>>> This pipeline took 1h51m09s:
>>>> https://gitlab.com/qemu-project/qemu/-/pipelines/409666733/builds
>>>> But Richard restarted unstable jobs, which probably added time
>>>> to the total.
>>>>
>>>> IIRC from a maintainer perspective 1h15 is the upper limit.
>>>> 80m fits, 100m is over.
>>>
>>> I think I agree ... I normally don't want to wait more than a little bit
>>> more than one hour, so 100 minutes feels too long already. We already
>>> have some 70m timeouts in other jobs, and one 80 minute timeout in
>>> .gitlab-ci.d/crossbuild-template.yml, so I'd say 80 minutes are really
>>> the upper boundary that we should use.
>>
>> We are also talking apples and oranges:
>> Gitlab timeouts are on the amount of time the job runs.
>> Cirrus timeouts appear to be on the amount of time the job is queued.
>>
>> If cirrus would just not start accounting until the thing runs we'd be fine.
>
> Unfortunately it isn't that easy. Our cirrus CI jobs are launched using
> the cirrus-run tool, from a gitlab job. The timeouts we're usually
> hitting are from the gitlab job which is sitting around waiting for
> the cirrus job it launched to finish, so it can report back stats.
>
> Cirrus CI does itself have a job timeout, but I'm not aware of us
> hitting that typically, unless i'm misinterpreting something.

Right, the problem is the timeout on the gitlab-CI side, not the timeout on
the Cirrus-CI side. I've never seen us hitting the timeout on the Cirrus
side either.

 Thomas

* Re: [PATCH] gitlab-ci/cirrus: Increase timeout to 80 minutes
From: Willian Rampazzo @ 2021-11-16 17:17 UTC
  To: Thomas Huth
  Cc: Daniel P. Berrangé, Alex Bennée, qemu-devel, Wainer dos Santos Moschetta, Philippe Mathieu-Daudé

On Tue, Nov 16, 2021 at 1:33 PM Thomas Huth <thuth@redhat.com> wrote:
>
> The jobs on Cirrus-CI sometimes get delayed quite a bit, waiting to
> be scheduled, so while the build test itself finishes within 60 minutes,
> the total run time of the jobs can be longer due to this waiting time.
> Thus let's increase the timeout on the gitlab side a little bit, so
> that these jobs are not marked as failing just because of the delay.
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  .gitlab-ci.d/cirrus.yml | 1 +
>  1 file changed, 1 insertion(+)
>

Reviewed-by: Willian Rampazzo <willianr@redhat.com>