Git development
 help / color / mirror / Atom feed
* [PATCH 0/8] pack-bitmap-write: speed up bitmap generation
@ 2026-05-19 16:12 Taylor Blau
  2026-05-19 16:12 ` [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Taylor Blau @ 2026-05-19 16:12 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren, Derrick Stolee

Note to the maintainer:

 * This series is based on 'tb/pseudo-merge-bugfixes', with
   'ps/clang-w-glibc-2.43-and-_Generic' merged in. I suggest queueing it
   as 'tb/bitmap-build-performance'.

   The latter merge is only to avoid the current Clang/glibc 2.43 CI
   breakage, and is unrelated to the bitmap changes themselves.

This series improves the performance of reachability bitmap generation,
focusing on very large repositories and the penalty to generate
pseudo-merge reachability bitmaps.

The first few patches address hot paths in the ordinary bitmap build:

 - pass object positions into `fill_bitmap_tree()` so callers can avoid
   redundant lookups,

 - check subtree bits before recursing, which avoids many no-op
   `fill_bitmap_tree()` calls,

 - reuse already-stored selected bitmaps when `fill_bitmap_commit()`
   reaches a selected ancestor, and

 - add a small direct-mapped cache from object IDs to bitmap positions
   to avoid repeated pack/MIDX lookups while filling bitmaps.

On the large repository that I have been using to benchmark these
changes (~4.8M commits and ~57M total objects), the no-pseudo-merge
bitmap generation case drops **from ~612.5 seconds to ~294.1 seconds**.

The next patch sorts selected bitmaps before choosing XOR offsets. This
does not change bitmap selection/coverage, but in the same repository it
shrinks the generated bitmap file **from ~635.5 MiB to ~176.4 MiB** by
putting related ancestor/descendant bitmaps close enough together for
the XOR search window to find them.

The final two patches focus on pseudo-merge bitmaps. The existing code
feeds pseudo-merges into the same maximal-commit selection machinery as
ordinary selected commits. That machinery works well for real history,
but not pseudo-merges.

Instead, this series builds ordinary selected bitmaps first, then builds
pseudo-merge bitmaps afterwards. The later pseudo-merge fill can still
reuse stored selected ancestor bitmaps, and can also reuse an existing
on-disk pseudo-merge bitmap when the parent set matches.

With the coarse pseudo-merge configuration used for testing:

    [bitmapPseudoMerge "all"]
        pattern=refs/
        threshold=now
        stableSize=10000000
        maxMerges=8

, the optimized no-pseudo-merge case takes ~294.1 seconds, while the
**pseudo-merge case takes ~328.4 seconds**. Before the final change, the
same pseudo-merge configuration took ~575.0 seconds.

On our testing repository, it is faster at the end of this series to
generate bitmaps with pseudo-merges (~328 seconds as above) than it is
to generate bitmaps without pseudo-merges at the start of this series
(~612 seconds).

Thanks in advance for your review!

Taylor Blau (8):
  pack-bitmap: pass object position to `fill_bitmap_tree()`
  pack-bitmap: check subtree bits before recursing
  pack-bitmap: reuse stored selected bitmaps
  pack-bitmap: consolidate `find_object_pos()` success path
  pack-bitmap: cache object positions during fill
  pack-bitmap: sort bitmaps before XORing
  pack-bitmap: remember pseudo-merge parents
  pack-bitmap: build pseudo-merge bitmaps after regular bitmaps

 pack-bitmap-write.c | 431 +++++++++++++++++++++++++++++++++++++-------
 pack-bitmap.h       |   7 +
 2 files changed, 377 insertions(+), 61 deletions(-)


base-commit: c3d7ca7d982efc3a848fd85f34e867cfc0a99479
-- 
2.54.0.rc1.84.g30ce254312c

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-05-19 16:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-19 16:12 [PATCH 0/8] pack-bitmap-write: speed up bitmap generation Taylor Blau
2026-05-19 16:12 ` [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-19 16:12 ` [PATCH 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-19 16:12 ` [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-19 16:12 ` [PATCH 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-19 16:12 ` [PATCH 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-19 16:12 ` [PATCH 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-19 16:12 ` [PATCH 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-19 16:12 ` [PATCH 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox