Git development
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps
Date: Wed, 27 May 2026 10:40:31 -0400	[thread overview]
Message-ID: <ahcCX+xAKFOL8HcW@nand.local> (raw)
In-Reply-To: <20260527092412.GD981444@coredump.intra.peff.net>

On Wed, May 27, 2026 at 05:24:12AM -0400, Jeff King wrote:
> On Tue, May 19, 2026 at 12:12:41PM -0400, Taylor Blau wrote:
>
> > Building bitmaps from scratch on the same test repository from the
> > previous commits yields a significant speed-up:
> >
> >     +------------------+-------------+-------------+---------------------+
> >     |                  | HEAD^       | HEAD        | Delta               |
> >     +------------------+-------------+-------------+---------------------+
> >     | elapsed          |   562.8 s   |   324.8 s   |   -237.9 s (-42.3%) |
> >     | cycles           | 2,621.3 B   | 1,508.6 B   | -1,112.7 B (-42.4%) |
> >     | instructions     | 2,348.9 B   | 1,436.6 B   |   -912.3 B (-38.8%) |
> >     | CPI              |     1.116   |     1.050   |   -0.066    (-5.9%) |
> >     +------------------+-------------+-------------+---------------------+
>
> Oh my, that's a rather nice speedup. I can reproduce here on linux.git
> (~47% improvement).
>
> > When `fill_bitmap_commit()` reaches an ancestor that was selected for
> > its own bitmap and processed earlier, its object closure is already
> > stored in `writer->bitmaps` as an EWAH bitmap. As a result, walking
> > through that commit's tree and parents again is redundant.
> >
> > Teach `fill_bitmap_commit()` to notice that case. For non-root commits in
> > the walk, look for a stored selected bitmap and OR it into the bitmap
> > being built. If one exists, skip the commit, its tree, and its parents.
>
> I feel like this _shouldn't_ be necessary, because the idea of the
> current writing code is to go from the roots up, following inverted
> parent pointers, and passing the bitmap up as we go. So whenever we
> visit a commit we should in theory have all of the ancestor's bits set
> in that bitmap. But I remember that the simple-and-stupid approach ended
> up being too memory hungry, so we pick some focal points in the graph
> and then fill them independently.

It's sharing within the non-first parent history that is killing us
here. I think what you said is true in a completely linear repository
with no merges. But since we only pass commit masks from commits to
their first parents, we don't reuse any already-generated bitmaps for
common points in history not shared between commits' first parents.

> I wondered about "c != commit" here. "c" is the commit we're traversing,
> and "commit" is the one for which we're trying to build the bitmap. So
> we would not expect to ever have an entry in writer->bitmaps for "c"
> yet, but the conditional is just short-circuiting the hash lookup.

Exactly.

> The rest of the patch looks obviously correct. The trace2 bits aren't
> strictly necessary, of course, but some metrics might help with further
> tuning.

Yeah, these were for my own curiosity as much as anything. I had written
them as a temporary measure in order to write the "[...] there are 1,261
commits selected for bitmap coverage, and 1,382 maximal commits induced
[...]" portion of the commit message above.

Once I had written it, I found the result useful enough to keep around.

Thanks,
Taylor

  reply	other threads:[~2026-05-27 14:40 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-19 16:12 [PATCH 0/8] pack-bitmap-write: speed up bitmap generation Taylor Blau
2026-05-19 16:12 ` [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-27  8:57   ` Jeff King
2026-05-27 14:36     ` Taylor Blau
2026-05-19 16:12 ` [PATCH 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-27  9:03   ` Jeff King
2026-05-27 14:36     ` Taylor Blau
2026-05-19 16:12 ` [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-27  9:24   ` Jeff King
2026-05-27 14:40     ` Taylor Blau [this message]
2026-05-19 16:12 ` [PATCH 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-20 14:42   ` SZEDER Gábor
2026-05-20 17:12     ` Taylor Blau
2026-05-27  9:27   ` Jeff King
2026-05-19 16:12 ` [PATCH 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-27  9:45   ` Jeff King
2026-05-27 14:46     ` Taylor Blau
2026-05-19 16:12 ` [PATCH 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-27 10:04   ` Jeff King
2026-05-27 16:56     ` Taylor Blau
2026-05-19 16:12 ` [PATCH 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-19 16:12 ` [PATCH 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau
2026-05-27 10:25   ` Jeff King
2026-05-27 19:24     ` Taylor Blau
2026-05-27 10:27 ` [PATCH 0/8] pack-bitmap-write: speed up bitmap generation Jeff King
2026-05-27 19:55 ` [PATCH v2 " Taylor Blau
2026-05-27 19:55   ` [PATCH v2 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-27 19:55   ` [PATCH v2 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-27 19:55   ` [PATCH v2 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-27 19:55   ` [PATCH v2 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-27 19:56   ` [PATCH v2 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-27 19:56   ` [PATCH v2 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-27 19:56   ` [PATCH v2 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-27 19:56   ` [PATCH v2 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahcCX+xAKFOL8HcW@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox