From: Patrick Steinhardt <ps@pks.im>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, christian.couder@gmail.com,
gitster@pobox.com, johannes.schindelin@gmx.de,
johncai86@gmail.com, jonathantanmy@google.com,
karthik.188@gmail.com, kristofferhaugsbakk@fastmail.com,
me@ttaylorr.com, newren@gmail.com, peff@peff.net,
Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH 2/3] path-walk: fix setup of pending objects
Date: Thu, 21 Aug 2025 10:01:09 +0200
Message-ID: <aKbSRQJCPh3Lsew8@pks.im>
In-Reply-To: <0dc4a6323e66598070b403d286ee1918e6a9b791.1755715196.git.gitgitgadget@gmail.com>
On Wed, Aug 20, 2025 at 06:39:55PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
>
> The previous change established a buggy instance of 'git repack -adf
> --path-walk' when there exist paths that are tracked in the index and
> that is the only instance of those paths in the history of the
> repository. This change fixes that bug.
>
> The core problem here is that the "maybe_interesting" member of 'struct
> type_and_oid_list' is not initialized to '1'. This member was added in
> 6333e7ae0b (path-walk: mark trees and blobs as UNINTERESTING,
> 2024-12-20) in a way to help when creating packfiles for a small commit
> range using the sparse path algorithm (enabled by pack.useSparse=true).
>
> The idea here is that the list is marked as "maybe_interesting" if an
> object is added that does not have the UNINITERSTING flag on it. Later,

s/UNINITERSTING/UNINTERESTING/

> this is checked again in case all objects in the list were marked
> UNINTERESTING after that point in time. In this case, the algorithm
> skips the list as there is no reason to visit it.
>
> This leads to the problem where the "maybe_interesting" member was not
> appropriately initialized when the list is created from pending objects.
> This is the fix for now.
>
> To help avoid this from happening in the future, a follow-up change will
> make initializing lists use a shared method instead of allowing for an
> update to this initialization process to miss some existing copies.

Yeah, I wanted to say that this feels quite fragile to me and is very easy
to miss. Does this mechanism buy us a lot of performance in the first
place? If not, we might as well remove it entirely.

But if the answer is "yes", then adding APIs around it feels like a good
alternative.

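To illustrate what "adding APIs around it" could look like, here is a
rough, self-contained sketch. All names (demo_list, demo_list_append(),
demo_list_should_visit(), DEMO_UNINTERESTING) are made up for this
example and are not the actual code in path-walk.c. It only assumes the
behaviour described in the commit message: the flag is set when an
object without the UNINTERESTING flag is added, and the list is checked
again before it is visited.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the UNINTERESTING object flag. */
#define DEMO_UNINTERESTING (1u << 0)

struct demo_object {
	unsigned flags;
};

/* Hypothetical stand-in for 'struct type_and_oid_list'. */
struct demo_list {
	struct demo_object **items;
	size_t nr, alloc;
	unsigned maybe_interesting:1;
};

#define DEMO_LIST_INIT { 0 }

/*
 * The only way to add an object. Because the flag is updated here,
 * no caller can forget to set it, whichever code path built the list.
 */
static void demo_list_append(struct demo_list *list, struct demo_object *obj)
{
	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? list->alloc * 2 : 4;
		struct demo_object **items =
			realloc(list->items, new_alloc * sizeof(*items));
		if (!items)
			abort();
		list->items = items;
		list->alloc = new_alloc;
	}
	list->items[list->nr++] = obj;
	if (!(obj->flags & DEMO_UNINTERESTING))
		list->maybe_interesting = 1;
}

/*
 * Cheap check first, then a re-check in case every object was marked
 * UNINTERESTING after it had been added.
 */
static int demo_list_should_visit(const struct demo_list *list)
{
	size_t i;

	if (!list->maybe_interesting)
		return 0;
	for (i = 0; i < list->nr; i++)
		if (!(list->items[i]->flags & DEMO_UNINTERESTING))
			return 1;
	return 0;
}

static void demo_list_release(struct demo_list *list)
{
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
	list->maybe_interesting = 0;
}
```

With something along these lines, a list built from pending objects
would go through the same helper as every other list, so there would be
no separate initialization to keep in sync.
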
Patrick