From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: git@vger.kernel.org
Subject: Pathological performance with git remote rename and many tracking refs
Date: Wed, 13 Apr 2022 22:32:57 +0000
Message-ID: <YldPmUskbU+bOU2n@camp.crustytoothpaste.net>
In my day-to-day work, I have the occasion to use GitHub Codespaces on a
repository with about 20,000 refs on the server. The environment is set
up to pre-clone the repository, but I use a different default remote
name than "origin" ("def", to be particular), and thus, one of the things
I do when I set up that environment is to run "git remote rename origin
def".
This process takes 35 minutes, which is extremely pathological. I
believe what's happening is that all of the refs are packed, and
renaming each ref causes a loose ref to be created and the old ref to be
deleted (necessitating a rewrite of the packed-refs file). This is
essentially O(N^2) in the number of refs.
We recently added a --progress option, but I think this performance is
bad enough that that's not going to suffice here, and we should try to
do better.
I found that using "git for-each-ref" and "git update-ref --stdin" in a
pipeline to create and delete the refs as a single transaction takes a
little over 2 seconds. This is greater than a 99.9% improvement and is
much more along the lines of what I'd expect.
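
For illustration, a pipeline of this kind might look like the sketch below.
It is not the exact command used for the timing above; the function names
gen_rename_commands and rename_tracking_refs are hypothetical, and the
sketch assumes symbolic refs such as refs/remotes/origin/HEAD can simply be
skipped. It only moves the remote-tracking refs.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Git.

# Read "refname objectname symref" lines on stdin and print the matching
# "git update-ref --stdin" commands that move each ref from remote $1 to $2.
gen_rename_commands () {
	old=$1 new=$2
	while read -r ref oid symref
	do
		# Assumption: skip symbolic refs (e.g. refs/remotes/origin/HEAD);
		# they would need separate handling with "git symbolic-ref".
		test -n "$symref" && continue
		suffix=${ref#"refs/remotes/$old/"}
		printf 'create refs/remotes/%s/%s %s\n' "$new" "$suffix" "$oid"
		printf 'delete %s %s\n' "$ref" "$oid"
	done
}

# List all tracking refs for remote $1 and apply every create and delete
# through a single "git update-ref --stdin" invocation (one transaction).
rename_tracking_refs () {
	git for-each-ref --format='%(refname) %(objectname) %(symref)' \
		"refs/remotes/$1/" |
	gen_rename_commands "$1" "$2" |
	git update-ref --stdin
}
```

It would be invoked as "rename_tracking_refs origin def" from inside the
repository. Unlike "git remote rename", this sketch does not touch the
remote's configuration (the remote.<name>.* section, fetch refspecs, or
branch.<name>.remote) and does not carry over reflogs.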
I thought about porting this code to use a ref transaction, but I
realized that we don't rename reflogs in that situation, which might be
a problem for some people. In my case, since it's a freshly cloned repo
and the reflogs aren't interesting, I don't care.
I think a possible way forward may be to either teach ref transactions
about ref renames, or simply to add a --no-reflogs option, which omits
the reflogs in case the user doesn't care. I'm interested to hear ideas
from others, though, about the best way forward.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA