From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Elijah Newren <newren@gmail.com>,
Derrick Stolee <stolee@gmail.com>
Subject: [PATCH 0/8] pack-bitmap-write: speed up bitmap generation
Date: Tue, 19 May 2026 12:12:33 -0400 [thread overview]
Message-ID: <cover.1779207127.git.me@ttaylorr.com> (raw)
Note to the maintainer:
* This series is based on 'tb/pseudo-merge-bugfixes', with
'ps/clang-w-glibc-2.43-and-_Generic' merged in. I suggest queueing it
as 'tb/bitmap-build-performance'.
The latter merge is only to avoid the current Clang/glibc 2.43 CI
breakage, and is unrelated to the bitmap changes themselves.
This series improves the performance of reachability bitmap generation,
focusing on very large repositories and the penalty to generate
pseudo-merge reachability bitmaps.
The first few patches address hot paths in the ordinary bitmap build:
- pass object positions into `fill_bitmap_tree()` so callers can avoid
redundant lookups,
- check subtree bits before recursing, which avoids many no-op
`fill_bitmap_tree()` calls,
- reuse already-stored selected bitmaps when `fill_bitmap_commit()`
reaches a selected ancestor, and
- add a small direct-mapped cache from object IDs to bitmap positions
to avoid repeated pack/MIDX lookups while filling bitmaps.
On the large repository that I have been using to benchmark these
changes (~4.8M commits and ~57M total objects), the no-pseudo-merge
bitmap generation case drops **from ~612.5 seconds to ~294.1 seconds**.
The next patch sorts selected bitmaps before choosing XOR offsets. This
does not change bitmap selection/coverage, but in the same repository it
shrinks the generated bitmap file **from ~635.5 MiB to ~176.4 MiB** by
putting related ancestor/descendant bitmaps close enough together for
the XOR search window to find them.
The final two patches focus on pseudo-merge bitmaps. The existing code
feeds pseudo-merges into the same maximal-commit selection machinery as
ordinary selected commits. That machinery works well for real history,
but not pseudo-merges.
Instead, this series builds ordinary selected bitmaps first, then builds
pseudo-merge bitmaps afterwards. The later pseudo-merge fill can still
reuse stored selected ancestor bitmaps, and can also reuse an existing
on-disk pseudo-merge bitmap when the parent set matches.
With the coarse pseudo-merge configuration used for testing:
[bitmapPseudoMerge "all"]
pattern=refs/
threshold=now
stableSize=10000000
maxMerges=8
, the optimized no-pseudo-merge case takes ~294.1 seconds, while the
**pseudo-merge case takes ~328.4 seconds**. Before the final change, the
same pseudo-merge configuration took ~575.0 seconds.
On our testing repository, it is faster at the end of this series to
generate bitmaps with pseudo-merges (~328 seconds as above) than it is
to generate bitmaps without pseudo-merges at the start of this series
(~612 seconds).
Thanks in advance for your review!
Taylor Blau (8):
pack-bitmap: pass object position to `fill_bitmap_tree()`
pack-bitmap: check subtree bits before recursing
pack-bitmap: reuse stored selected bitmaps
pack-bitmap: consolidate `find_object_pos()` success path
pack-bitmap: cache object positions during fill
pack-bitmap: sort bitmaps before XORing
pack-bitmap: remember pseudo-merge parents
pack-bitmap: build pseudo-merge bitmaps after regular bitmaps
pack-bitmap-write.c | 431 +++++++++++++++++++++++++++++++++++++-------
pack-bitmap.h | 7 +
2 files changed, 377 insertions(+), 61 deletions(-)
base-commit: c3d7ca7d982efc3a848fd85f34e867cfc0a99479
--
2.54.0.rc1.84.g30ce254312c
next reply other threads:[~2026-05-19 16:12 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 16:12 Taylor Blau [this message]
2026-05-19 16:12 ` [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-19 16:12 ` [PATCH 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-19 16:12 ` [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-19 16:12 ` [PATCH 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-19 16:12 ` [PATCH 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-19 16:12 ` [PATCH 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-19 16:12 ` [PATCH 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-19 16:12 ` [PATCH 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1779207127.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox