From: Toon Claes <toon@iotcl.com>
To: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects
Date: Wed, 29 Oct 2025 15:55:17 +0100
Message-ID: <875xbxrc4q.fsf@iotcl.com>
In-Reply-To: <20251028-pks-packfiles-store-drop-list-v1-5-1a3b82030a7a@pks.im>
Patrick Steinhardt <ps@pks.im> writes:
> The function `has_sha1_pack_kept_or_nonlocal()` takes an object ID and
> then searches through packed objects to figure out whether the object
> exists in a kept or non-local pack. As a performance optimization we
> remember the packfile that contains a given object ID so that the next
> call to the function first checks that same packfile again.
>
> The way this is written is rather hard to follow though, as the caching
> mechanism is intertwined with the loop that iterates through the packs.
> Consequently, we need to do some gymnastics to re-start the iteration if
> the cached pack does not contain the objects.
Okay, this took me a while, but yes, this function was really hard to
understand. Thanks for simplifying it.
Naive question: what's the point of keeping a "last_found"? We have one
global "last_found" from the last time this function was called, and we
have no control over which OIDs get passed to this function. Why look into
"last_found" first?
> Refactor this so that we check the cached packfile at the beginning. We
> don't have to re-verify whether the packfile meets the properties as we
> have already verified those when storing the pack in `last_found` in the
> first place. So all we need to do is to use `find_pack_entry_one()` to
> check whether the pack contains the object ID, and to skip the cached
> pack in the loop so that we don't search it twice.
>
> This refactoring significantly simplifies the logic and makes it much
> easier to follow.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/pack-objects.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 5348aebbe9f..861fef3f38a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -4388,27 +4388,27 @@ static void add_unreachable_loose_objects(struct rev_info *revs)
>
> static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
> {
> - struct packfile_store *packs = the_repository->objects->packfiles;
> static struct packed_git *last_found = (void *)1;
> struct packed_git *p;
>
> - p = (last_found != (void *)1) ? last_found :
> - packfile_store_get_packs(packs);
> + if (last_found != (void *)1 && find_pack_entry_one(oid, last_found))
> + return 1;
>
> - while (p) {
> - if ((!p->pack_local || p->pack_keep ||
> - p->pack_keep_in_core) &&
> - find_pack_entry_one(oid, p)) {
> + repo_for_each_pack(the_repository, p) {
> + if ((!p->pack_local || p->pack_keep || p->pack_keep_in_core) &&
> + find_pack_entry_one(oid, p)) {
> last_found = p;
> return 1;
> }
> - if (p == last_found)
> - p = packfile_store_get_packs(packs);
> - else
> - p = p->next;
> - if (p == last_found)
> - p = p->next;
> +
> + /*
> + * We have already checked `last_found`, so there is no need to
> + * re-check here.
> + */
I had to reason with myself about why you need the extra `(void *)1` check;
maybe you can extend the comment a bit:
/*
 * When `last_found` was set to something other than
 * `(void *)1`, we have already checked it,
 * so there is no need to re-check here.
 */
> + if (p == last_found && last_found != (void *)1)
> + continue;
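To check my own understanding of the refactored flow, here is a minimal
standalone sketch of it. Everything in it is hypothetical: `struct toy_pack`,
`toy_pack_has()`, `NO_CACHED_PACK` and the probe counter are stand-ins I made
up, not git's real types or API, and I placed the skip before the probe, which
is my own arrangement rather than what the patch does.

```c
#include <stddef.h>

/* Hypothetical stand-in for `struct packed_git`; not git's real type. */
struct toy_pack {
	int kept_or_nonlocal;   /* models !pack_local || pack_keep || pack_keep_in_core */
	const int *ids;         /* toy object IDs stored in this pack */
	size_t nr;
	struct toy_pack *next;
};

/* Sentinel meaning "nothing cached yet", mirroring `(void *)1` in the patch. */
#define NO_CACHED_PACK ((struct toy_pack *)1)

static struct toy_pack *last_found = NO_CACHED_PACK;
static unsigned long nr_probes; /* counts toy_pack_has() calls, illustration only */

/* Toy replacement for find_pack_entry_one(): linear scan of one pack. */
static int toy_pack_has(const struct toy_pack *p, int id)
{
	nr_probes++;
	for (size_t i = 0; i < p->nr; i++)
		if (p->ids[i] == id)
			return 1;
	return 0;
}

static int toy_has_kept_or_nonlocal(struct toy_pack *packs, int id)
{
	/* The cached pack passed the property check when it was stored. */
	if (last_found != NO_CACHED_PACK && toy_pack_has(last_found, id))
		return 1;

	for (struct toy_pack *p = packs; p; p = p->next) {
		/* Only skip when a real pack was probed above. */
		if (last_found != NO_CACHED_PACK && p == last_found)
			continue;
		if (p->kept_or_nonlocal && toy_pack_has(p, id)) {
			last_found = p;
			return 1;
		}
	}
	return 0;
}
```

With this shape, a second lookup that lands in the same pack costs a single
probe, and a miss never probes the cached pack twice.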
--
Cheers,
Toon