From: Patrick Steinhardt <ps@pks.im>
To: Emily Shaffer <nasamuffin@google.com>
Cc: git@vger.kernel.org
Subject: Re: Continuous Benchmarking
Date: Fri, 21 Feb 2025 09:48:16 +0100
Message-ID: <Z7g90CMEiy-skRKK@pks.im>
In-Reply-To: <CAJoAoZmJAM--FVmhxs_0sL1A8yrLwNBFULPDYFgV=AtFhn67+g@mail.gmail.com>
On Wed, Feb 05, 2025 at 03:14:21PM -0800, Emily Shaffer wrote:
> On Mon, Feb 3, 2025 at 1:55 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > Hi,
> >
> > due to a couple of performance regressions that we have hit over the
> > last couple of Git releases at GitLab, we have started an effort to
> > implement continuous benchmarking for the Git project. The intent is to
> > have regular (daily) benchmarking runs against Git's `master` and `next`
> > branches so that we can spot performance regressions before they make
> > it into the next release.
> >
> > I have started with a relatively simple setup:
> >
> > - I have started collecting benchmarks that I myself run regularly [1].
> > These benchmarks are built on hyperfine and are thus not part of the
> > Git repository itself.
> >
> > - GitLab CI runs on a nightly basis, executing a subset of these
> > benchmarks [2].
> >
> > - Results are uploaded with a hyperfine adaptor to Bencher and are
> > summarized in dashboards.
> >
> > This at least gives us some visibility into severe performance
> > outliers, whether these are improvements or regressions. Some
> > statistics are applied to this data to automatically generate alerts
> > when things change significantly.
> >
> > The setup is of course not perfect. It's built on top of CI jobs, which
> > by their very nature do not perform consistently. The scripts are
> > hosted outside of Git. And I'm the only one running this.
>
> For the CI "noisy neighbors" problem at least, it could be an option
> to try to host in GCE (or some other compute that isn't shared). I
> asked around a little inside Google and it seems like it's possible;
> I'll keep pushing on it and see just how hard it would be. I'd even be
> happy to trade on-push runs with noisy neighbors for nightly runs with
> no neighbors, which makes it not really a CI thing - guess I will find
> out if that's easier or harder for us to implement. :)
That would be awesome.
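In case it helps with setting that up: the nightly job is conceptually
quite simple. It boils down to running hyperfine with JSON export and
feeding the result into Bencher, roughly like this (a simplified
sketch; the project slug and the benchmarked command are placeholders,
and the exact Bencher flags may differ from what we actually use):

    # Run one benchmark and export machine-readable results.
    hyperfine --warmup 3 \
        --export-json results.json \
        'git -C linux.git rev-list --count --all'

    # Upload the results via Bencher's hyperfine adapter.
    bencher run \
        --project git-benchmarks \
        --adapter shell_hyperfine \
        --file results.json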
> > So I wonder whether there is a wider interest in the Git community to
> > have this infrastructure part of the Git project itself. This may
> > include steps like the following:
> >
> > - Extending the performance tests we have in "t/perf" to cover more
> > benchmarks.
>
> Folks may be aware that our biggest (in terms of scale) internal
> customer at Google is the Android project. They are the ones who
> complain to me and my team the most about performance; they are also
> open to setting up a nightly performance regression test. Would it be
> appealing
> to get reports from such a test upstream? I think it's more compelling
> to our customer team if we run it against the closed-source Android
> repo, which means the Git project doesn't get to see as much about the
> shape and content of the repos the performance tests are running
> against, but we might be able to publish info about the shape without
> the contents. Would that be useful? What would help to know (# of
> commits, size of largest object, distribution of object size, # of
> branches, size of worktree...?) If not having the specifics of the
> repo-under-test is a dealbreaker we could explore running performance
> tests in public with Android Open Source Project as the
> repo-under-test instead, but it's much more manageable than full
> Android.
The biggest question is whether such regression reports would be
actionable by the Git community. I have often found performance issues
to be very specific to the repository at hand, and reconstructing the
exact situation tends to be extremely tedious or completely infeasible.
Way too often, customers come knocking at my door with a performance
issue but don't want to provide the underlying data. More often than
not I end up unable to reproduce the issue, so I have to push back on
such reports.
Ideally, any report should be accompanied by a trivial reproducer that
any developer can execute on their local machine.
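For the sort of reproducer I have in mind, something as simple as a
hyperfine invocation comparing the installed Git against a local build
would already go a long way (a hypothetical example; the repository
and the command are stand-ins for whatever exhibits the regression):

    # Compare system git against a freshly built one from this tree.
    # "bin-wrappers/git" is the wrapper that "make" creates in git.git.
    hyperfine --warmup 3 \
        -L git "git,$PWD/bin-wrappers/git" \
        '{git} -C linux.git rev-list --count --all'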
> Maybe in the long term it would be even better to have some toy
> repo-under-test, like "sample repo with massive object store", "sample
> repo with massive history", etc. to help us pinpoint which ways we're
> scaling well and which ways we aren't. But having a ready-made
> repo-under-test, and a team who's got a very large stake in Git
> performing well with it (so they can invest their time in setting up
> tests), might be a good enough place to start.
That would be great. I guess this wouldn't be a single repository, but a
set of repositories that have different kinds of characteristics.
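To illustrate what I mean, a repository with a huge linear history can
be synthesized quickly via git-fast-import(1) instead of driving
git-commit(1) in a loop (an untested sketch; names and numbers are
arbitrary):

    # Create one million empty commits on a single branch; each commit
    # command without "from" continues from the branch's current tip.
    git init --bare huge-history.git
    for i in $(seq 1 1000000)
    do
        printf 'commit refs/heads/master\n'
        printf 'committer A U Thor <author@example.com> %d +0000\n' $((1000000000 + i))
        printf 'data <<MSG\ncommit %d\nMSG\n\n' "$i"
    done | git -C huge-history.git fast-import

Variations of the same idea could then produce the other shapes: many
refs, a huge worktree, a bloated object store, and so on.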
> > - Writing an adaptor that is able to upload the data generated from
> > our perf scripts to Bencher.
> >
> > - Setting up proper infrastructure to do the benchmarking. We may for
> > now continue to use GitLab CI, but as said, its runners are quite
> > noisy overall. Dedicated servers would help here.
> >
> > - Sending alerts to the Git mailing list.
>
> Yeah, I'd love to see reports coming to the Git mailing list, or at least
> bad news reports (maybe we don't need "everything ran great!" every
> night, but would appreciate "last night the performance suite ran 50%
> slower than last-6-months average"). That seems the easiest to
> integrate with the way the project runs now, and I think we are used
> to list noise :)
Oh, totally, I certainly don't think there's any benefit in reporting
anything when there is no information. Right now there are still semi-
frequent outliers where an alert is generated only because of a flake,
not a real performance regression. But my hope is that this will
improve once we fix the noisy-neighbour problem.
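Until then, one could also imagine filtering on our side before
alerting, e.g. by comparing the newest median against the median of
recent runs instead of a single baseline (a hypothetical sketch on top
of hyperfine's JSON output; the 10% threshold and the file layout are
made up):

    # Median of the latest run.
    new=$(jq '.results[0].median' latest.json)
    # Median of medians across the retained history of runs.
    base=$(jq -s '[.[].results[0].median] | sort | .[length/2|floor]' history/*.json)
    # Alert only if the latest run is more than 10% slower than that.
    awk -v n="$new" -v b="$base" 'BEGIN { exit !(n > 1.10 * b) }' &&
        echo "possible regression: ${new}s vs. baseline ${base}s"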
> > I'm happy to hear your thoughts on this. Any ideas are welcome,
> > including "we're not interested at all". In that case, we'd simply
> > continue to maintain the setup ourselves at GitLab.
>
> In general, though, yes! I am very interested! Google has had trouble
> with performance regressions over the last 3 months or so, and I'd
> love to see the community notice them earlier. I think in general we
> have a sense
> that performance matters, during code review, but aren't always sure
> where it matters most, and a regular performance test that anybody can
> see the results of would help a lot.
Thanks for your input!
Patrick