From: Tassilo Horn <tsdh@gnu.org>
To: git@vger.kernel.org
Subject: [BUG?] Major performance issue with some commands on our repo's master branch
Date: Sat, 04 Jun 2022 09:39:35 +0200
Message-ID: <87h750q1b9.fsf@gnu.org>
Hi all,
[spoiler alert: I've figured out the config option causing the problem
while writing this long mail, so you might jump straight to the SOLUTION
section at the bottom of this mail.]
At my day job, I work on a git repo (sadly non-public, proprietary) with
these stats:
- master has about 150000 commits; the last release branch, which I've
  also benchmarked below, has 144000 commits
- the history dates back to 2001
- .git/ is about 1.8 GB
So it's quite big but not unusually big when compared to linux or other
free software projects.
The typical git commands I use (status, fetch, pull, commit, push,
rebase, ...) are all quick. However, I use the git porcelain Magit [1]
which invokes several plumbing commands in order to present to the user
an always up-to-date extended status buffer of the currently checked out
branch. Some of those plumbing commands are extremely slow for no
obvious reason. The most outstanding command I could pinpoint is this:
--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "master^{commit}" --
6192a0cfdc6 Merge remote-tracking branch 'origin/SHD_ECORO_3_9_7'
________________________________________________________
Executed in   13.21 secs    fish           external
   usr time   12.99 secs  462.00 micros   12.99 secs
   sys time    0.17 secs  119.00 micros    0.17 secs
--8<---------------cut here---------------end--------------->8---
The interesting thing is that I have this problem only with the master
branch. When I run it for the last release branch, I get these times:
--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "SHD_ECORO_3_9_7^{commit}" --
994334fc9fb ECOJ-33833 HTML-Formbrief: Bestellungs-Anhänge im KV-Kontext
________________________________________________________
Executed in   22.68 millis    fish           external
   usr time    7.71 millis  761.00 micros    6.95 millis
   sys time   10.47 millis  194.00 micros   10.28 millis
--8<---------------cut here---------------end--------------->8---
So you see, it's a difference of almost a factor of 600 (13.21 secs
vs. 22.68 millis)! How can that be?
The split between master and the SHD_ECORO_3_X_X series of branches has
happened almost 2 years ago and master is way ahead of those.
--8<---------------cut here---------------start------------->8---
❯ git log --oneline master...origin/SHD_ECORO_3_9_7 | wc -l
5013
--8<---------------cut here---------------end--------------->8---
But there are around 9 merges from the last release branch into master
daily.
--8<---------------cut here---------------start------------->8---
❯ git log --merges --oneline --since 6months | wc -l
1611
--8<---------------cut here---------------end--------------->8---
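As a rough cross-check of the "around 9 merges daily" figure, the count above can be divided by the length of the period. This is only a sketch: the helper name merges_per_day is made up for illustration, and treating "6 months" as 182 days is an assumption.

```shell
# Hypothetical helper: average merges per day.
# $1 = number of merge commits, $2 = number of days in the period
merges_per_day() {
    LC_ALL=C awk -v m="$1" -v d="$2" 'BEGIN { printf "%.1f merges per day\n", m / d }'
}

# 1611 is the count reported by `git log --merges ... | wc -l` above;
# 182 days is an assumed length for "6 months".
merges_per_day 1611 182
```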
From memory, the issue didn't pop up all of a sudden but has gotten
worse slowly over time. I have the impression that the slowdown has
accelerated over the last few months, which might be the result of our
workflow. Before, I was the merge guy doing two "merge waves" from
the last supported release branch upwards into master once or twice a
day (usually release-branch -> next-release-branch -> master). About
3 months ago, we switched to a workflow where every developer merges
upwards herself just after committing/pushing to some branch lower
than master, simply because the branches have diverged so much that
you'd need to be an expert in everything in order to be able to resolve
conflicts sensibly.
I should mention that I haven't seen this issue with any other repo I
have. But that's also the biggest one I use. The Emacs repository I
also work on is comparable in the number of commits but has far fewer
merges.
Finally, here's the git bugreport sysinfo section for that machine and
repository.
--8<---------------cut here---------------start------------->8---
[System Info]
git version:
git version 2.36.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.18.1-zen1-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Mon, 30 May 2022 17:53:16 +0000 x86_64
compiler info: gnuc: 11.2
libc info: glibc: 2.35
$SHELL (typically, interactive shell): /usr/bin/fish
[Enabled Hooks]
--8<---------------cut here---------------end--------------->8---
SOLUTION
========
While writing this long mail, I've figured out that the performance
penalty is caused by my setting of diff.renameLimit = 10000. If I
comment out that option in my ~/.gitconfig, the above command finishes
in 150 millis instead of 13 seconds:
--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "master^{commit}" --
6192a0cfdc6 Merge remote-tracking branch 'origin/SHD_ECORO_3_9_7'
________________________________________________________
Executed in  147.99 millis    fish           external
   usr time  114.52 millis  713.00 micros  113.81 millis
   sys time   34.78 millis  193.00 micros   34.59 millis
--8<---------------cut here---------------end--------------->8---
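The same comparison can probably be made without editing ~/.gitconfig, since git's -c option overrides config-file values for a single invocation. The following is a minimal sketch in POSIX shell (not fish). The function names show_with_limit and rename_limit_origin are made up for illustration, and any limit passed in is just a value to experiment with, not a recommendation.

```shell
# Hypothetical helper: run the slow command with diff.renameLimit
# overridden for this one invocation only (-c takes precedence over
# the values in the config files).
# $1 = rename limit to test, $2 = branch or other ref name
show_with_limit() {
    git -c diff.renameLimit="$1" show --no-patch --format="%h %s" "$2^{commit}" --
}

# Hypothetical helper: list every config file that sets
# diff.renameLimit, together with the value it sets.
# Exits non-zero if the option is not set anywhere.
rename_limit_origin() {
    git config --show-origin --get-all diff.renameLimit
}
```

In bash, for example, `time show_with_limit 10000 master` and `time show_with_limit 1000 master` can then be compared directly, and `rename_limit_origin` shows which file the configured value comes from.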
But there's still the question of why diff.renameLimit has an influence
here when --no-patch is provided, so no diff should be generated.
Bye,
Tassilo
[1] https://magit.vc/