From: Taylor Blau <me@ttaylorr.com>
To: Elijah Newren <newren@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 7/8] pack-objects: introduce '--stdin-packs=follow'
Date: Tue, 15 Apr 2025 16:45:42 -0400
Message-ID: <Z/7FdgYID9I1qR7K@nand.local>
In-Reply-To: <CABPp-BFBJP15g=4M90161=KCDei-hEFdnGs7_oY8ERtqgn9s-g@mail.gmail.com>

On Mon, Apr 14, 2025 at 08:11:08PM -0700, Elijah Newren wrote:
> > diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
> > index 7f69ae4855..c894582799 100644
> > --- a/Documentation/git-pack-objects.adoc
> > +++ b/Documentation/git-pack-objects.adoc
> > @@ -87,13 +87,19 @@ base-name::
> >         reference was included in the resulting packfile.  This
> >         can be useful to send new tags to native Git clients.
> >
> > ---stdin-packs::
> > +--stdin-packs[=<mode>]::
> >         Read the basenames of packfiles (e.g., `pack-1234abcd.pack`)
> >         from the standard input, instead of object names or revision
> >         arguments. The resulting pack contains all objects listed in the
> >         included packs (those not beginning with `^`), excluding any
> >         objects listed in the excluded packs (beginning with `^`).
> >  +
> > +When `mode` is "follow", pack objects which are reachable from objects
> > +in the included packs, but appear in packs that are not listed.
> > +Reachable objects which appear in excluded packs are not packed. Useful
> > +for resurrecting once-cruft objects to generate packs which are closed
> > +under reachability up to the excluded packs.
>
> Maybe:
>
> When `mode` is "follow", objects from packs not listed on stdin
> receive special treatment.  Objects within unlisted packs will be
> included if those objects (1) are reachable from the included packs,
> and (2) are not also found in any of the excluded packs.  This mode is
> useful for resurrecting once-cruft objects to generate packs which are
> closed under reachability up to the boundary set by the excluded
> packs.

I like it. I went with your version with some minor rewording and tweaks
on top.

> > +               /*
> > +                * Our 'to_pack' list was constructed by iterating all
> > +                * objects packed in included packs, and so doesn't
> > +                * have a non-zero hash field that you would typically
> > +                * pick up during a reachability traversal.
> > +                *
> > +                * Make a best-effort attempt to fill in the ->hash
> > +                * and ->no_try_delta here using a now in order to
> > +                * perhaps improve the delta selection process.
> > +                */
>
> I know you just moved this paragraph from below...but it doesn't parse
> for me.  "using a now in order to perhaps"?  What does that mean?

Yeah, this is just bogus, and was so before this patch. The rewording is
minor enough (just dropping "using a now") that I think we can just
squash it in with the movement in this patch.

> > +               oe->hash = pack_name_hash_fn(name);
> > +               oe->no_try_delta = name && no_try_delta(name);
> > +
> > +               stdin_packs_hints_nr++;
> > +       }
> > +}
> > +
> > +static void show_commit_pack_hint(struct commit *commit, void *data)
> > +{
> > +       enum stdin_packs_mode mode = *(enum stdin_packs_mode *)data;
> > +       if (mode == STDIN_PACKS_MODE_FOLLOW) {
> > +               show_object_pack_hint((struct object *)commit, "", data);
> >                 return;
> > +       }
> > +       /* nothing to do; commits don't have a namehash */
> >
> > -       /*
> > -        * Our 'to_pack' list was constructed by iterating all objects packed in
> > -        * included packs, and so doesn't have a non-zero hash field that you
> > -        * would typically pick up during a reachability traversal.
> > -        *
> > -        * Make a best-effort attempt to fill in the ->hash and ->no_try_delta
> > -        * here using a now in order to perhaps improve the delta selection
> > -        * process.
> > -        */
> > -       oe->hash = pack_name_hash_fn(name);
> > -       oe->no_try_delta = name && no_try_delta(name);
> > -
> > -       stdin_packs_hints_nr++;
> >  }
>
> It might be worth swapping the order of functions as a preparatory
> patch (both here and when you've done it elsewhere in this series),
> just because it'll make the diff so much easier to read when we can
> see the changes to the function without having to also deal with the
> order swapping (since order swapping looks like a large deletion and
> large addition of one of the two functions).

Fair enough.

> > @@ -4467,6 +4484,23 @@ static int is_not_in_promisor_pack(struct commit *commit, void *data) {
> >         return is_not_in_promisor_pack_obj((struct object *) commit, data);
> >  }
> >
> > +static int parse_stdin_packs_mode(const struct option *opt, const char *arg,
> > +                                 int unset)
> > +{
> > +       enum stdin_packs_mode *mode = opt->value;
> > +
> > +       if (unset)
> > +               *mode = STDIN_PACKS_MODE_NONE;
> > +       else if (!arg || !*arg)
> > +               *mode = STDIN_PACKS_MODE_STANDARD;
>
> I don't understand why you have both a None mode and a Standard mode,
> especially since the implementation seems to only care about whether
> or not the Follow mode has been set.  Shouldn't these both be setting
> mode to the same value?

I'm not sure I follow your question... stdin_packs is a tri-state. It
can be off, on in standard/legacy mode, or on in follow mode.
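
As a rough illustration of the tri-state, the sketch below is a
standalone version of the option handling. The enum names and the first
two branches come from the quoted hunk; the function name
`set_stdin_packs_mode()`, the "follow" branch, and the error return are
assumptions made so the example is complete without Git's parse-options
machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum stdin_packs_mode {
	STDIN_PACKS_MODE_NONE,     /* off: option absent or negated */
	STDIN_PACKS_MODE_STANDARD, /* on: bare --stdin-packs (legacy behavior) */
	STDIN_PACKS_MODE_FOLLOW,   /* on: --stdin-packs=follow */
};

/*
 * Hypothetical stand-in for the option callback. 'arg' is the text
 * after '=' (NULL or empty if none), and 'unset' is non-zero for the
 * negated form. Returns 0 on success, -1 for an unrecognized mode
 * (the error path is an assumption, not taken from the patch).
 */
int set_stdin_packs_mode(enum stdin_packs_mode *mode,
			 const char *arg, int unset)
{
	assert(mode != NULL);

	if (unset)
		*mode = STDIN_PACKS_MODE_NONE;
	else if (!arg || !*arg)
		*mode = STDIN_PACKS_MODE_STANDARD;
	else if (!strcmp(arg, "follow"))
		*mode = STDIN_PACKS_MODE_FOLLOW;
	else
		return -1;
	return 0;
}
```

The point of keeping NONE distinct from STANDARD is that callers can
still ask "was --stdin-packs given at all?" separately from "which mode
was requested?".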

> > +test_expect_success 'setup for --stdin-packs=follow' '
> > +       git init stdin-packs--follow &&
> > +       (
> > +               cd stdin-packs--follow &&
> > +
> > +               for c in A B C D
> > +               do
> > +                       test_commit "$c" || return 1
> > +               done &&
> > +
> > +               A="$(echo A | git pack-objects --revs $packdir/pack)" &&
> > +               B="$(echo A..B | git pack-objects --revs $packdir/pack)" &&
> > +               C="$(echo B..C | git pack-objects --revs $packdir/pack)" &&
> > +               D="$(echo C..D | git pack-objects --revs $packdir/pack)" &&
> > +
> > +               git prune-packed
> > +       )
> > +'

Huh, I have no idea how this snuck in. This "setup" test does nothing
and creates a repository that isn't used later on in the script.
Probably leftover from writing these tests in the first place, but I've
removed it.

> I like the tests -- normal --stdin-packs, then --stdin-packs=follow,
> then --stdin-packs=follow + --unpacked.

I think the normal tests are accidental since we use pack-objects to
write packs A, B, C, and D. But the --stdin-packs vs.
--stdin-packs=follow and --stdin-packs=follow + --unpacked was
definitely intentional.

> However, would it be worthwhile to create commit E immediately after
> creating the packs?

Yeah, I think that is a good suggestion. We already have tests that
exercise --stdin-packs with --unpacked earlier in the same script, but
obviously not with --stdin-packs=follow. Creating commit E earlier
makes a lot of sense to me, thanks!

Thanks,
Taylor
