From: Phil Hord <phil.hord@gmail.com>
To: git@vger.kernel.org
Cc: Phil Hord <phil.hord@gmail.com>
Subject: [RFC PATCH 0/2] fetch --prune performance problem
Date: Wed, 18 Jun 2025 14:08:38 -0700 [thread overview]
Message-ID: <20250618211024.2332525-1-phil.hord@gmail.com> (raw)
From: Phil Hord <phil.hord@gmail.com>
`git fetch --prune` runs in O(N^2) time normally. This happens because the code
iterates over each ref to be pruned to display its status. In a repo with
174,000 refs, where I was pruning 15,000 refs, the current code made 2.6 billion
calls to strcmp and consumed 470 seconds of CPU. After this change, the same
operation completes in under 1 second.
The loop looks like this:
for p in prune_refs { for ref in all_refs { if p == ref { ... }}}
That loop runs only to check for and report newly dangling refs. A workaround to
avoid this slowness is to run with `-q` to bypass this check.
There is similar check/report functionality in `git remote prune`, but it uses a
more efficient method to check for dangling refs. prune_refs is first sorted, so
it can be searched in O(logN), so this loop is O(N*logN).
for ref in all_refs { if ref in prune_refs { ... }}
My patch fixes this for fetch, but it affects the command's output order.
Currently the results look like this:
- [deleted] (none) -> origin/bar
(origin/bar has become dangling)
- [deleted] (none) -> origin/baz
- [deleted] (none) -> origin/foo
(origin/foo has become dangling)
- [deleted] (none) -> origin/frotz
After my change, the order will change so the danglers are reported at the end.
- [deleted] (none) -> origin/bar
- [deleted] (none) -> origin/baz
- [deleted] (none) -> origin/foo
- [deleted] (none) -> origin/frotz
(origin/bar has become dangling)
(origin/foo has become dangling)
The latter format is close to how `git remote prune` works, but the formatting
is a bit different. I can coerce my change into something that preserves the
original order, but it will be quite a bit messier.
Q: Does anyone care enough about the command output ordering that they think
it's worth the extra code complexity?
Phil Hord (2):
fetch-prune: optimize dangling-ref reporting
refs: remove old refs_warn_dangling_symref
builtin/fetch.c | 16 ++++++++--------
refs.c | 17 +----------------
2 files changed, 9 insertions(+), 24 deletions(-)
--
2.50.0.1.gf2ab606906.dirty
next reply other threads:[~2025-06-18 21:10 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-18 21:08 Phil Hord [this message]
2025-06-18 21:08 ` [RFC PATCH 1/2] fetch-prune: optimize dangling-ref reporting Phil Hord
2025-06-18 21:50 ` Junio C Hamano
2025-06-18 23:18 ` Jacob Keller
2025-06-19 4:00 ` Jeff King
2025-06-19 11:01 ` Lidong Yan
2025-06-19 14:41 ` Lidong Yan
2025-06-18 21:08 ` [RFC PATCH 2/2] refs: remove old refs_warn_dangling_symref Phil Hord
2025-06-18 23:15 ` [RFC PATCH 0/2] fetch --prune performance problem Jacob Keller
2025-06-19 3:37 ` Jeff King
2025-06-19 17:18 ` Junio C Hamano
[not found] ` <CABURp0p4d0JPg=-cW1OZdFQJ+vNT_0PDd9Rv3oz6toFGqGv5=g@mail.gmail.com>
2025-06-23 23:32 ` Jacob Keller
2025-06-23 23:41 ` Junio C Hamano
[not found] ` <CABURp0q-1FGmD+PJeSQ=xvyDN6ZYn1O7Fh8i1OojfD2WQCqgcw@mail.gmail.com>
2025-06-23 23:46 ` Jacob Keller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250618211024.2332525-1-phil.hord@gmail.com \
--to=phil.hord@gmail.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.