git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects
Date: Wed, 29 Oct 2025 19:13:33 -0400	[thread overview]
Message-ID: <aQKfnd2gWS2T9GaD@nand.local> (raw)
In-Reply-To: <20251028-pks-packfiles-store-drop-list-v1-5-1a3b82030a7a@pks.im>

On Tue, Oct 28, 2025 at 12:08:35PM +0100, Patrick Steinhardt wrote:
> @@ -4388,27 +4388,27 @@ static void add_unreachable_loose_objects(struct rev_info *revs)
>
>  static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
>  {
> -	struct packfile_store *packs = the_repository->objects->packfiles;
>  	static struct packed_git *last_found = (void *)1;
>  	struct packed_git *p;
>
> -	p = (last_found != (void *)1) ? last_found :
> -					packfile_store_get_packs(packs);
> +	if (last_found != (void *)1 && find_pack_entry_one(oid, last_found))
> +		return 1;

This all looks right to me, since we only assign last_found when a pack
meets the kept-or-non-local criteria, which is good.

>
> -	while (p) {
> -		if ((!p->pack_local || p->pack_keep ||
> -				p->pack_keep_in_core) &&
> -			find_pack_entry_one(oid, p)) {
> +	repo_for_each_pack(the_repository, p) {
> +		if ((!p->pack_local || p->pack_keep || p->pack_keep_in_core) &&
> +		    find_pack_entry_one(oid, p)) {
>  			last_found = p;
>  			return 1;
>  		}
> -		if (p == last_found)
> -			p = packfile_store_get_packs(packs);
> -		else
> -			p = p->next;
> -		if (p == last_found)
> -			p = p->next;
> +
> +		/*
> +		 * We have already checked `last_found`, so there is no need to
> +		 * re-check here.
> +		 */
> +		if (p == last_found && last_found != (void *)1)
> +			continue;

Can 'p' ever be (void *)1 here? I would imagine not since this is coming
from repo_for_each_pack(), so I think it would suffice to limit this
conditional to just "if (p == last_found)".

Otherwise looks good. I think you could make use of the kept_cache here
at least for the local-but-kept packs, but what you wrote is definitely
an improvement in readability.

    static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
    {
            static struct packed_git *last_found = (void *)1;
            uint32_t kept_flags = ON_DISK_KEEP_PACKS | IN_CORE_KEEP_PACKS;
            struct packed_git *p;
            struct pack_entry entry;

            if (last_found != (void *)1 && find_pack_entry_one(oid, last_found))
                    return 1;

            if (find_kept_pack_entry(the_repository, oid, flags, &entry)) {
                    last_found = entry.p;
                    return 1;
            }

            repo_for_each_pack(the_repository, p) {
                    if (p->pack_local || p == last_found)
                            continue;
                    if (find_pack_entry_one(oid, p)) {
                            last_found = p;
                            return 1;
                    }
            }
            return 0;
    }

You still end up looping over all of the packs in the worst case, but in
the case where you are going to pick up a copy of the object via a kept
pack, the above should be faster since it will focus the search on those
packs first.

I'm not sure that it matters all that much, since at worst we're
interspersing the search over kept packs with some non-local ones, and
skipping over the rest. But certainly you could imagine cases where the
number of non-local packs greatly outnumbers the local, kept ones, so in
a situation like that I think the above would help.

Probably micro-optimizing this case is not all that useful, since this
only comes up when we are unpacking unreachable objects like with 'git
repack -A', which should be using cruft packs anyway.

Thanks,
Taylor

  parent reply	other threads:[~2025-10-29 23:13 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-28 11:08 [PATCH 0/8] packfiles: track pack lists via the packfile store Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 1/8] packfile: use a `strmap` to store packs by name Patrick Steinhardt
2025-10-29 22:16   ` Taylor Blau
2025-10-28 11:08 ` [PATCH 2/8] packfile: move the MRU list into the packfile store Patrick Steinhardt
2025-10-29 22:39   ` Taylor Blau
2025-10-30  8:59     ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 3/8] http: refactor subsystem to use `packfile_list`s Patrick Steinhardt
2025-10-29 14:24   ` Toon Claes
2025-10-30  8:58     ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 4/8] packfile: fix approximation of object counts Patrick Steinhardt
2025-10-29 22:49   ` Taylor Blau
2025-10-30  8:58     ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects Patrick Steinhardt
2025-10-29 14:55   ` Toon Claes
2025-10-29 23:15     ` Taylor Blau
2025-10-30  8:59     ` Patrick Steinhardt
2025-10-29 23:13   ` Taylor Blau [this message]
2025-10-30  8:58     ` Patrick Steinhardt
2025-10-30  9:31   ` Toon Claes
2025-10-30  9:52     ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 6/8] packfile: move list of packs into the packfile store Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 7/8] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-10-29 23:25   ` Taylor Blau
2025-10-30  8:58     ` Patrick Steinhardt
2025-10-28 11:08 ` [PATCH 8/8] packfile: track packs via the MRU list exclusively Patrick Steinhardt
2025-10-30 10:38 ` [PATCH v2 0/8] packfiles: track pack lists via the packfile store Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 1/8] packfile: use a `strmap` to store packs by name Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 2/8] packfile: move the MRU list into the packfile store Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 3/8] http: refactor subsystem to use `packfile_list`s Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 4/8] packfile: fix approximation of object counts Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 6/8] packfile: move list of packs into the packfile store Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 7/8] packfile: always add packfiles to MRU when adding a pack Patrick Steinhardt
2025-10-30 10:38   ` [PATCH v2 8/8] packfile: track packs via the MRU list exclusively Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQKfnd2gWS2T9GaD@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).