public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Lukas Wanko <lwanko@gitlab.com>
Subject: Re: [PATCH] builtin/pack-objects: don't fetch objects when merging packs
Date: Thu, 12 Feb 2026 18:46:14 -0500	[thread overview]
Message-ID: <aY5mRiYr7icW4lE2@nand.local> (raw)
In-Reply-To: <20260211-pks-pack-objects-stdin-skip-backfill-fetch-v1-1-870cad56d8ae@pks.im>

On Wed, Feb 11, 2026 at 01:44:59PM +0100, Patrick Steinhardt wrote:
> The "--stdin-packs" option can be used to merge objects from multiple
> packfiles given via stdin into a new packfile. One big upside of this
> option is that we don't have to perform a complete rev walk to enumerate
> objects. Instead, we can simply enumerate all objects that are part of
> the specified packfiles, which can be significantly faster in very large
> repositories.
>
> There is one downside though: when we don't perform a rev walk we also
> don't have a good way to learn about the respective object's names. As a
> consequence, we cannot use the name hashes as a heuristic to get better
> delta selection.
>
> We try to offset this downside though by performing a localized rev
> walk: we queue all objects that we're about to repack as interesting,
> and all objects from excluded packfiles as uninteresting. We then
> perform a best-effort rev walk that allows us to fill in object names.

Nicely explained. I think it's reasonable to ask "well why are we doing
a revwalk, you just told me that --stdin-packs does not require a
(potentially expensive) traversal?". I think that the exposition you
provided here answers that question nicely.

> There is one gotcha here though: when "--exclude-promisor-objects" has
> not been given we will perform backfill fetches for any promised objects
> that are missing. This used to not be an issue though as this option was
> mutually exclusive with "--stdin-packs". But that has changed recently,
> and starting with dcc9c7ef47 (builtin/repack: handle promisor packs with
> geometric repacking, 2026-01-05) we will now repack promisor packs
> during geometric compaction. The consequence is that a geometric repack
> may now perform a bunch of backfill fetches.

Great find!

> We of course cannot passe "--exclude-promisor-objects" to fix this

s/passe/pass, though I think Junio noted this below.

> issue -- after all, the whole intent is to repack objects part of a
> promisor pack. But arguably we don't have to: the rev walk is intended
> as best effort, and we already configure it to ignore missing links to
> other objects. So we can adapt the walk to unconditionally disable
> fetching any missing objects.

Yep, I think this is the right trade-off.

> ---
>  builtin/pack-objects.c        | 10 ++++++++++
>  t/t5331-pack-objects-stdin.sh | 18 ++++++++++++++++++
>  2 files changed, 28 insertions(+)

The implementation and test look exactly as expected. Thanks for jumping
on this and fixing it!

Thanks,
Taylor

      parent reply	other threads:[~2026-02-12 23:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-11 12:44 [PATCH] builtin/pack-objects: don't fetch objects when merging packs Patrick Steinhardt
2026-02-11 17:21 ` Junio C Hamano
2026-02-12  6:37   ` Patrick Steinhardt
2026-02-12 23:46 ` Taylor Blau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aY5mRiYr7icW4lE2@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=lwanko@gitlab.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox