From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data
Date: Thu, 29 Aug 2024 15:00:21 -0400
Message-ID: <ZtDFRYQRLQoe+CHS@nand.local>
In-Reply-To: <20240817103155.GA551779@coredump.intra.peff.net>

On Sat, Aug 17, 2024 at 06:31:55AM -0400, Jeff King wrote:
> On Thu, Aug 15, 2024 at 01:31:00PM -0400, Taylor Blau wrote:
>
> > In order to determine its object order, the pack-bitmap machinery keeps
> > a 'struct packing_data' corresponding to the pack or pseudo-pack (when
> > writing a MIDX bitmap) being written.
> >
> > The to_pack field is provided to the bitmap machinery by callers of
> > bitmap_writer_build() and assigned to the bitmap_writer struct at that
> > point.
> >
> > But a subsequent commit will want to have access to that data earlier on
> > during commit selection. Prepare for that by adding a 'to_pack' argument
> > to 'bitmap_writer_init()', and initializing the field during that
> > function.
> >
> > Subsequent commits will clean up other functions which take
> > now-redundant arguments (like nr_objects, which is equivalent to
> > pdata->objects_nr, or pdata itself).
>
> This (and the next few follow-on commits) seem like a good change to me.
> It simplifies many of the function calls, and I think it expresses the
> domain logic in the API: there is a single set of objects being mapped
> to bits, and many parts of the process will rely on it.

Thanks. Yeah, it was a little surprising to me that it wasn't already
this way, especially having worked in this area for so long. I suspect
it grew this way organically over time (though I haven't actually gone
spelunking through the history to confirm).

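
To make the shape of the change described in the quoted commit message
concrete, here is a minimal sketch, not the actual Git code. Every
`sketch_*` name is hypothetical, and the struct layouts are simplified
stand-ins; the only thing taken from the commit message is the idea that
the writer receives its packing data at init time so that later steps no
longer need separate pdata or object-count arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for 'struct packing_data'; only the object
 * count is modeled here. */
struct sketch_packing_data {
	uint32_t nr_objects;
};

/* Hypothetical stand-in for the bitmap writer state. */
struct sketch_bitmap_writer {
	struct sketch_packing_data *to_pack; /* assigned once, at init time */
	size_t selected_nr;
};

/* The caller hands over the object set when initializing the writer,
 * so it is available to every later step, including commit selection. */
static void sketch_writer_init(struct sketch_bitmap_writer *writer,
			       struct sketch_packing_data *pdata)
{
	assert(writer && pdata);
	memset(writer, 0, sizeof(*writer));
	writer->to_pack = pdata;
}

/* Later steps read the count through the writer instead of taking a
 * redundant pdata or nr_objects argument. */
static uint32_t sketch_writer_nr_objects(const struct sketch_bitmap_writer *writer)
{
	return writer->to_pack ? writer->to_pack->nr_objects : 0;
}

/* Small usage example: initialize a writer and query it. */
static uint32_t sketch_demo(void)
{
	struct sketch_packing_data pdata = { .nr_objects = 3 };
	struct sketch_bitmap_writer writer;

	sketch_writer_init(&writer, &pdata);
	return sketch_writer_nr_objects(&writer);
}
```

Under these assumptions, a function that previously took both the writer
and an object count would take only the writer, which is the kind of
cleanup the follow-on patches in the series are described as doing.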
> Even the midx code, which is not generating a pack, uses a "fake"
> packing_data as the way to express that (because inherently the bit
> ordering is all coming from the pack-index nature). If we likewise ever
> wrote code to generate bitmaps from an existing pack, it would probably
> use packing_data, too. :)

I agree for the most part, though there is a lot of weight in
packing_data that would be nice to not have to carry around. I know
within GitHub's infrastructure we sometimes OOM kill invocations of "git
multi-pack-index write --bitmap" because of the memory overhead (much of
which is dominated by the actual traversal and bitmap generation, but a
good deal of which comes from just the per-object overhead).

I've thought about alternative structures that might be a little more
memory efficient, but it's never gotten to the top of my list.

Thanks,
Taylor