* Re: [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk`
From: Derrick Stolee @ 2026-06-19 14:36 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <ajVPJGXuhugDcT+A@nand.local>
On 6/19/2026 10:16 AM, Taylor Blau wrote:
> On Fri, Jun 12, 2026 at 09:03:41AM -0400, Derrick Stolee wrote:
>> On 6/2/2026 6:21 PM, Taylor Blau wrote:
>>
>>> As a result, we can see significantly reduced pack sizes from p5311
>>> before this commit:
>>
>> I mentioned this before, but the pack _sizes_ aren't changing in this
>> example. We are computing them more quickly, though.
>
> Thanks for pointing this out. The paragraph following the perf output
> below correctly explains the results ("We get the same size of output
> pack, but [...]"), but this one is obviously wrong.
>
>> Since we are testing --path-walk on both sides, the change across this
>> commit is that we are using the bitmaps for the "counting objects" phase
>> and then potentially using the --path-walk algorithm to construct the
>> packfile.
>
> I'm not sure I agree here. Because we are using bitmaps, we're relying
> on pack-reuse to construct the output pack, not --path-walk. I mentioned
> this in git-pack-objects(1), but the combination of seeing "--path-walk" and
> "--use-bitmap-index" together only means that we will use a path-walk
> traversal as fallback if we can't get an answer by relying on bitmaps.
I guess my thought was that we'd construct bitmaps when they are
available, but how do we walk objects to get the objects for commits
that are not represented by bitmaps?
But you make a good point: we don't need to do that for functional
use: the bitmap code does an object walk to produce a bitmap, and it's
all in a layer "below" the pack-objects code.
So essentially, this _isn't_ a combined approach: it's "use bitmaps if
we can, and fall back to --path-walk if we can't" which is changing
from our previous behavior of "--path-walk means we don't try to use
bitmaps".
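To make that change in behavior concrete, here is a minimal sketch in C. It is not the pack-objects implementation: the enum, both function names, and the `bitmap_usable` flag (standing in for "the bitmap code was able to answer the query") are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for how the object set gets enumerated. */
enum walk_kind {
	WALK_BITMAP,    /* answer comes from reachability bitmaps */
	WALK_PATH_WALK, /* --path-walk traversal */
	WALK_REVISION   /* regular revision walk */
};

/* Previous behavior: --path-walk means we never try to use bitmaps. */
enum walk_kind choose_walk_before(bool path_walk, bool use_bitmap_index,
				  bool bitmap_usable)
{
	if (path_walk)
		return WALK_PATH_WALK;
	if (use_bitmap_index && bitmap_usable)
		return WALK_BITMAP;
	return WALK_REVISION;
}

/* Proposed behavior: use bitmaps if we can, else fall back to --path-walk. */
enum walk_kind choose_walk_after(bool path_walk, bool use_bitmap_index,
				 bool bitmap_usable)
{
	if (use_bitmap_index && bitmap_usable)
		return WALK_BITMAP;
	return path_walk ? WALK_PATH_WALK : WALK_REVISION;
}
```

The two functions differ only in which check comes first, which is the "either/or" relationship described above rather than a combined traversal.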
Thanks,
-Stolee
^ permalink raw reply
* Re: What's cooking in git.git (Jun 2026, #06)
From: Taylor Blau @ 2026-06-19 14:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
In-Reply-To: <xmqqtsr1w0z4.fsf@gitster.g>
On Wed, Jun 17, 2026 at 10:06:23AM -0700, Junio C Hamano wrote:
> * tb/midx-incremental-custom-base (2026-06-12) 3 commits
> - midx-write: include packs above custom incremental base
> - midx: pass custom '--base' through incremental writes
> - t5334: expose shared `nth_line()` helper
>
> The `git multi-pack-index write --incremental` command has been
> corrected to properly honor the `--base` option. Previously, the
> custom base was ignored by the normal write path, and the pack
> exclusion logic incorrectly skipped packs from layers above the
> selected base, breaking reachability closure for bitmaps.
>
> Needs review.
> source: <cover.1781294771.git.me@ttaylorr.com>
It would be nice to get this in before v2.55.0 is tagged, but I don't
think it's critical. In my analysis, the worst thing that could happen
is that generating MIDXs with a custom --base would result in a failure
to generate bitmaps, but not much else.
That's unlikely to be invoked manually, but does have the unfortunate
effect of rendering the new incremental MIDX-based repacking strategy as
useless in this release.
I'll add Peff to CC in case he has a moment to look it over.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk`
From: Taylor Blau @ 2026-06-19 14:28 UTC (permalink / raw)
To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <849c659f-efa8-430a-bfac-0c26a3ed1aaa@gmail.com>
On Fri, Jun 12, 2026 at 09:24:32AM -0400, Derrick Stolee wrote:
> On 6/2/2026 6:21 PM, Taylor Blau wrote:
> > When 'pack-objects' is invoked with '--path-walk', it prevents us from
> > using reachability bitmaps.
>
> My earlier response focused on the _use_ of bitmaps when creating a
> packfile, but your patch also enables _writing_ bitmaps with the
> --path-walk option, which is significant and potentially more
> interesting from my perspective: we have evidence that --path-walk
> can produce significantly smaller packfiles than the standard
> algorithm, and once those packfiles are created we can benefit from
> that size in later packfile creation steps by reusing those deltas.
I am perhaps splitting hairs here, but I would frame the use of bitmaps
when reading with "--path-walk" as "either/or" not "both/and". The main
goal of this patch is to enable us to still generate bitmaps when
*writing* a pack with "--path-walk".
> Even more important here is that we have demonstrated examples of repos
> that change their packfile size when using the --path-walk method. We
> should demonstrate that the size continues to shrink with --path-walk
> even when producing a matching .bitmap file with --write-bitmap-index.
That's fair. One way to do this would be to:
--- 8< ---
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 1b115d921a1..c1aed3e2aef 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -18,6 +18,10 @@ test_fetch_bitmaps () {
git repack -ad $argv
'
+ test_size "size of bitmapped pack ${argv:+($argv)}" '
+ test_file_size .git/objects/pack/pack-*.pack
+ '
+
# simulate a fetch from a repository that last fetched N days ago, for
# various values of N. We do so by following the first-parent chain,
# and assume the first entry in the chain that is N days older than the current
--- >8 ---
, which gives us:
    Test                                            HEAD^    HEAD
    ----------------------------------------------------------------------------------------
    5311.3: size of bitmapped pack                  278.8M   278.8M -0.0%
    5311.38: size of bitmapped pack (--path-walk)   278.7M   278.7M +0.0%
(eliding other tests). I considered whether there are other interesting
tests, but I think "repack" is the right layer to run perf tests, since
you're always writing a closed pack. We could try different subsets of
the repository's objects (which would also have to be closed), but I
don't think this is that interesting.
> The other thing that I notice here is that the bitmaps will need to
> compute their reachable object set independently from the path-walk
> algorithm. But I suppose that already happens separately from the
> revision-walk approach that normally produces the packfile contents.
Right. The only wrinkle here is how we handle the internal traversal's
"--boundary" option, but see the last paragraph in the commit message
for details on why the proposed approach is OK.
> From my perspective, the point of integrating these two things are:
>
> 1. Reachability bitmaps make it much faster to discover the reachable
> set and reuse bits of existing packfiles. (Your performance table
> demonstrates this is true.)
>
> 2. The --path-walk option can shrink packfile sizes by grouping
> trees and blobs by path before those paths collide in the name-hash
> sort. (I haven't seen evidence that this is happening.)
>
> With evidence of (1) and not (2), it's not clear from the data that
> these features are integrating completely. Without looking at the
> code, those numbers would be the same if we had instead swapped the
> preference of "the --path-walk option disables bitmaps" to "bitmaps
> disable --path-walk".
Let me know if modifying the perf test as above (and including the
relevant results in the commit message) would be sufficient in
addressing your concern.
Thanks,
Taylor
^ permalink raw reply related
* Re: [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk`
From: Taylor Blau @ 2026-06-19 14:16 UTC (permalink / raw)
To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <6e4a8764-3c56-42c8-a87e-40a94c6c34e9@gmail.com>
On Fri, Jun 12, 2026 at 09:03:41AM -0400, Derrick Stolee wrote:
> On 6/2/2026 6:21 PM, Taylor Blau wrote:
>
> > As a result, we can see significantly reduced pack sizes from p5311
> > before this commit:
>
> I mentioned this before, but the pack _sizes_ aren't changing in this
> example. We are computing them more quickly, though.
Thanks for pointing this out. The paragraph following the perf output
below correctly explains the results ("We get the same size of output
pack, but [...]"), but this one is obviously wrong.
> Since we are testing --path-walk on both sides, the change across this
> commit is that we are using the bitmaps for the "counting objects" phase
> and then potentially using the --path-walk algorithm to construct the
> packfile.
I'm not sure I agree here. Because we are using bitmaps, we're relying
on pack-reuse to construct the output pack, not --path-walk. I mentioned
this in git-pack-objects(1), but the combination of seeing "--path-walk" and
"--use-bitmap-index" together only means that we will use a path-walk
traversal as fallback if we can't get an answer by relying on bitmaps.
> And I wonder if the test setup creates a situation where we are always
> reusing deltas from the underlying packfile, so the --path-walk algorithm
> isn't doing anything to help with delta compression at this point and the
> difference in this patch is that we are replacing the object reachability
> calculation entirely with bitmaps.
>
> I suppose what I'm really worried about is that I'm hoping to see some
> evidence from a large-scale test that demonstrates that the two algorithms
> are working in tandem in a non-trivial way. I haven't seen it yet, but I
> also don't have evidence that they _aren't_ working together.
Your thinking is correct here that the test setup intentionally creates
a situation where we are reusing objects/deltas verbatim from the
bitmapped pack.
I'm not sure what "working in tandem" means here. At read time, the two
options mutually exclude one another, meaning we'll use bitmaps if we
have them, or do a path-walk traversal otherwise (or if the bitmaps we
have are somehow insufficient to perform the traversal).
The goal of this patch is not to demonstrate that the two work together
at the same time, but rather that we can write a pack using --path-walk,
and generate reachability bitmaps simultaneously.
Let me know if you have more thoughts on what "working together in a
non-trivial way" would look like here. If there are ways to improve the
compatibility of these two features in a way that yields better
performance via either smaller packs, faster generation, or both, I'm
all ears :-).
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk`
From: Taylor Blau @ 2026-06-19 14:08 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Michael Montalbo, Derrick Stolee, Jeff King, Elijah Newren
In-Reply-To: <xmqqjyrzbjyf.fsf@gitster.g>
On Mon, Jun 15, 2026 at 01:57:28PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> > index f693cb56691..69c5da1580a 100755
> > --- a/t/t5310-pack-bitmaps.sh
> > +++ b/t/t5310-pack-bitmaps.sh
> > ...
> > + for reuse in true false
> > + do
> > + : >trace.txt &&
> > +
> > + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> > + git -c pack.allowPackReuse=$reuse pack-objects \
> > + --stdout --revs --path-walk --use-bitmap-index \
> > + <in >out.pack &&
> > + grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
>
> This gets flagged by updated test linter X-<. Use test_grep to
> pacify it.
Oops, thanks for spotting.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH] commit-graph: use timestamp_t for max parent generation accumulator
From: Taylor Blau @ 2026-06-19 14:05 UTC (permalink / raw)
To: Derrick Stolee
Cc: Patrick Steinhardt, Elijah Newren via GitGitGadget, git,
Elijah Newren
In-Reply-To: <09e50180-e165-48d8-a9d0-485283342f5c@gmail.com>
On Mon, Jun 15, 2026 at 07:44:19AM -0400, Derrick Stolee wrote:
> On 6/15/26 4:11 AM, Patrick Steinhardt wrote:
> > On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
> > > commit-graph: use timestamp_t for max parent generation accumulator
> > > We found a few repositories in the wild with commits whose authors were
> > > apparently on a computer in the year 2120 when they recorded their
> > > commits. Apparently, in a century from now, some folks are going to have
> > > a really weird timezone as well (-13068837), though the timezone doesn't
> > > factor into this patch at all.
>
> > > @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
> > > struct commit *current = list->item;
> > > struct commit_list *parent;
> > > int all_parents_computed = 1;
> > > - uint32_t max_gen = 0;
> > > + timestamp_t max_gen = 0;
> > > for (parent = current->parents; parent; parent = parent->next) {
> > > repo_parse_commit(info->r, parent->item);
> >
> > This looks obviously correct.
>
> I agree. I was surprised this was the only necessary change, but
> your message clearly describes how the timing of the patch that
> delivered this change contributed to the mismatch.
Ditto. I reviewed a version of this patch before Elijah sent it to the
list, but this LGTM and is
Acked-by: Taylor Blau <me@ttaylorr.com>
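For anyone reading along, the failure mode is easy to reproduce in isolation. The following is a minimal sketch, not the commit-graph code: `sketch_timestamp_t` is a stand-in that assumes `timestamp_t` is a 64-bit unsigned type, both function names are hypothetical, and it assumes the accumulated generation value is derived from commit dates. The value 4733510400 is the Unix timestamp for 2120-01-01 00:00:00 UTC, which does not fit in 32 bits.

```c
#include <stdint.h>

/* Stand-in for git's timestamp_t; assumed here to be 64-bit unsigned. */
typedef uint64_t sketch_timestamp_t;

/* Mimics the old accumulator: storing into uint32_t wraps modulo 2^32. */
uint32_t max_gen_narrow(sketch_timestamp_t parent_generation)
{
	uint32_t max_gen = 0;

	if (parent_generation > max_gen)
		max_gen = (uint32_t)parent_generation;
	return max_gen;
}

/* Mimics the fixed accumulator: the full value survives. */
sketch_timestamp_t max_gen_wide(sketch_timestamp_t parent_generation)
{
	sketch_timestamp_t max_gen = 0;

	if (parent_generation > max_gen)
		max_gen = parent_generation;
	return max_gen;
}
```

With a year-2120 input, the narrow version returns 438543104 (the input minus 2^32), which corresponds to a date in the 1980s, while the wide version returns the input unchanged.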
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH] t4216: fix no-op test that breaks TAP output
From: Taylor Blau @ 2026-06-19 14:04 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Todd Zullinger, Junio C Hamano, Jeff King
In-Reply-To: <20260619-pks-t4216-drop-unused-prereq-v1-1-2ce0d7bea088@pks.im>
Hi Patrick,
A few thanks are owed: one to Todd for reporting this issue in the
first place, another to Peff for analyzing why it didn't appear broken
before, and a third to you for proposing a patch to fix it.
If you choose to delete this piece of test infrastructure entirely (I
think that there is an alternative direction that I would prefer, but
see below for more on why), I think the patch you wrote below is OK.
But...
On Fri, Jun 19, 2026 at 09:20:20AM +0200, Patrick Steinhardt wrote:
> In t4216 we have a prerequisite that is active in case the system's
> `char` type is signed by default. This prerequisite isn't really used by
> anything though: while it is used to guard one of our tests, that
> specific test is essentially a no-op. So all this infrastructure does is
> to provide some debugging hint to a reader that pays a lot of attention.
I don't think that this is guarding nothing, but I agree that the test
as written is strange. As I recall, this was to sanity check the v1
Bloom values, but allow failures on platforms where the `char` type is
unsigned by default.
I don't feel that strongly about whether or not we check the exact
value of the filter, but I think there are a couple of arguments in
favor of doing so. Most compelling would be that we know that our
murmur3 implementation is correct (in at least one case) and that we
don't regress that case in the future. We do have these checks for v2
changed-path Bloom filters where the signed-ness of `char` is
irrelevant.
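As background on why signedness can matter to a hash at all, here is a minimal sketch that is not taken from bloom.c. It assumes the v1 discrepancy comes from bytes with the high bit set being widened to 32 bits differently depending on whether `char` is signed; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Widen a byte as code does when `char` is signed: sign-extends. */
uint32_t widen_signed(unsigned char byte)
{
	signed char s;

	memcpy(&s, &byte, 1); /* reinterpret the bits without a lossy cast */
	return (uint32_t)s;
}

/* Widen a byte as code does when `char` is unsigned: zero-extends. */
uint32_t widen_unsigned(unsigned char byte)
{
	return (uint32_t)byte;
}
```

For plain ASCII bytes the two agree, so only paths containing bytes of 0x80 and above would hash differently, for example `widen_signed(0xe9)` yields 0xffffffe9 while `widen_unsigned(0xe9)` yields 0x000000e9.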
> Besides that, the way we set up the prerequisite also results in broken
> TAP output on systems where `char` is unsigned by default: we use
> `test_cmp()` to diff two files outside of any test body, and if the
> files differ we enable the prerequisite. If so, the call to `test_cmp()`
> would also print output, and that output is of course not valid TAP
> output.
Given this and the above, I would probably err on the side of
designating this as 'test_lazy_prereq' or otherwise silencing the output
of 'test_cmp' so that this does not taint the TAP output.
Thanks,
Taylor
^ permalink raw reply
* Re: [RFH] Why do osx CI jobs so unreliable?
From: Patrick Steinhardt @ 2026-06-19 14:03 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqik7fnz90.fsf@gitster.g>
On Thu, Jun 18, 2026 at 05:35:23PM -0700, Junio C Hamano wrote:
> I've been observing that in recent push-out to 'master' and 'next',
> osx-* jobs in GitHub Actions CI keep running for 6 hours and get
> killed.
>
> What is troubling is that this seems to be very flaky. For example,
> https://github.com/git/git/actions/runs/27778820659 is testing
> 95e20213 (Hopefully final batch before -rc2, 2026-06-17) which got
> killed after wasting 6 hours in osx-clang and osx-gcc jobs.
>
> https://github.com/git/git/actions/runs/27790036076 is testing
> the same 'master', with a patch to .github/workflows/main.yml to
> remove everything except for config and osx-* jobs, which succeeded
> within 30 minutes.
>
> Stumped...
So the raw logs have the following trailer:
2026-06-18T23:53:33.2996180Z Cleaning up orphan processes
2026-06-18T23:53:33.7900380Z Terminate orphan process: pid (34022) (git-remote-http)
2026-06-18T23:53:33.9848670Z Terminate orphan process: pid (15488) (httpd)
2026-06-18T23:53:34.0321490Z Terminate orphan process: pid (13146) (httpd)
2026-06-18T23:53:34.0808280Z Terminate orphan process: pid (13145) (httpd)
2026-06-18T23:53:34.1212760Z Terminate orphan process: pid (13144) (httpd)
2026-06-18T23:53:34.1570160Z Terminate orphan process: pid (13141) (httpd)
2026-06-18T23:53:34.1924140Z Terminate orphan process: pid (12553) (bash)
2026-06-18T23:53:34.2472970Z Terminate orphan process: pid (12552) (tee)
2026-06-18T23:53:34.6547890Z Terminate orphan process: pid (21209) (bash)
So I strongly suspect that it must be one of the t555* tests.
Furthermore, t5551 and t5559 (both of which are actually the same
test) are the only test suites that use lib-httpd.sh and are missing
from the job logs.
I have not been able to reproduce this hang on my macOS virtual machine
though, and on GitLab I didn't notice a similar hang recently. Maybe
this is something that's specific to GitHub's environment...? No idea.
Patrick
^ permalink raw reply
* Re: [PATCH v2 2/2] doc: advise batching patch rerolls
From: Weijie Yuan @ 2026-06-19 13:20 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, ps
In-Reply-To: <xmqq4ij1vywy.fsf@gitster.g>
Sorry for the late reply. I spent some time looking back through the
discussions on earlier patch series to double-check my own patch
against them, since I am a newcomer here.
On Wed, Jun 17, 2026 at 10:50:53AM -0700, Junio C Hamano wrote:
> > If the comments require substantial rework, sending a new version
> > +sooner may save reviewers from spending time on a version you already know will
> > +change significantly.
>
> I am not sure about this one. Even though the intention to avoid
> wasting reviewers' time spent on reading through the previous
> version that will be invalidated is a good one, by definition, a
> substantial rework will naturally take time, and it is better not to
> rush and send an updated version with substantial changes that you
> yourself haven't had a chance to thoroughly review yet.
>
> In such a case, it would be a better idea to respond to the review
> that made you realize a substantial rewrite is needed with a simple
> "I'll make a substantial rework based on this comment, which would
> invalidate this and that part of the current patch series, so please
> do not waste reviewer cycles on these parts until I send an updated
> series out" message.
I think the approach you recommended is obviously more reasonable.
It would be better to give everyone a heads-up "I am working on a
new version."
I will improve this part accordingly.
> > If the topic is close to being accepted and the remaining
> > +comments are small, a quicker new version may also be fine.
>
> I am not sure if this needs to be codified.
>
> I often see (e.g., in patches from Patrick) that an iteration is
> marked clearly as final candidate that the author is not aware of
> any outstanding issues. This encourages reviewers to ask "what
> about this one raised there?" to remind what is missed, or chime in
> with "yup, this looks good" to show support. Such a note is highly
> recommended, but I do not see a need to say "the (supposedly) final
> one is specifically allowed to be sent without waiting" even then.
Actually I thought Patrick would say something here ;-) so I waited a
few more days to see whether anyone else had any suggestions.
But here I think Patrick's original intention is: if your series is
*close* to being accepted (though I'm not sure what the precise
definition of "close to being accepted" is: does it mean commented on by
Junio with "Looks good", or reviewed by the community/core contributors
with "Makes sense"?) and a small issue happens to turn up, you can
re-roll quickly to make your series more "sturdy" while it waits for the
maintainer's final examination and subsequent merges.
So, I think the situation you are describing here is that this version
of the patch has already been declared by the *author* to be the final
version. (i.e. waiting for Junio to do the last exam)
Therefore, I do not think the two situations conflict with each other,
or are directly related. One concerns a patch that is already close to
receiving the maintainer's final verdict, where a minor issue is
discovered and the author quickly rerolls it. The other concerns an
author who, without realizing that some issues remain unresolved, rushes
to send what they believe to be the final version and then waits for the
maintainer to review it.
For the latter case, I think it would be better to add a sentence along
the lines of: "Before sending a new version/the final version, check
once more whether there are any unresolved issues," if the existing
documentation does not already make this clear.
That said, I am not familiar with how patch discussions have played out
in the past, so please directly point out any mistakes in my
understanding. I have to admit that, by this point in writing the
message, I have become a little tangled up in my own reasoning.
Thanks!
^ permalink raw reply
* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Phillip Wood @ 2026-06-19 13:13 UTC (permalink / raw)
To: Junio C Hamano
Cc: Harald Nordgren, Harald Nordgren via GitGitGadget, git,
Kristoffer Haugsbakk, Johannes Sixt
In-Reply-To: <xmqqcxxnsufl.fsf@gitster.g>
On 18/06/2026 17:08, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> One thing I've just thought of related to this patch is whether we want
>> to protect branches that are the upstreams of branches that are not
>> slated for deletion. With stacked branches it is possible that a branch
>> has been merged but has other branches stacked on top of it that have
>> not been merged.
>
> An interesting point. We do have "this topic is built on the result
> of merging these other topics into main" and I expect the practice
> is wide spread. These base topics may graduate first, but other
> topics may still be updated.
>
> But when you rewrite these other topics, wouldn't you leave their
> bases untouched? IOW, a new iteration (i.e. "rebase -i") would
> reuse the base that was used in an earlier iteration, i.e. the
> result of an earlier merge of the other topics, some of which might
> have been pruned since then, into an older 'main', so it is OK to
> lose these other topics once they have graduated, simply because you
> wouldn't be recreating the merge that you used as the base of this
> remaining topic, no?
>
> Or am I missing something?
I was thinking that if I have feature1 with upstream origin/master and
feature2 with upstream feature1, then once feature1 is merged I'd still
like "git log @{u}.." and "git rebase" without an explicit upstream to
work when feature2 is checked out. If "git branch --prune-merged
origin/master" deletes feature1 then those commands stop working. Maybe
it would be sensible to update feature2's upstream once feature1 is
merged (which I think is what you're saying above) but do we really want
to force the user to do that by deleting feature1?
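The protection I have in mind could look roughly like the sketch below. This is only an illustration under stated assumptions: `struct branch_info` and `can_prune()` are hypothetical and not part of the patch, and upstreams are compared by plain name rather than through the real branch configuration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a local branch. */
struct branch_info {
	const char *name;
	const char *upstream; /* name of the configured upstream */
	bool merged;          /* already merged into the prune target */
};

/*
 * A merged branch is only prunable if no unmerged branch still
 * names it as its upstream.
 */
bool can_prune(const struct branch_info *branches, size_t nr, size_t idx)
{
	size_t i;

	if (!branches[idx].merged)
		return false;
	for (i = 0; i < nr; i++) {
		if (i == idx || branches[i].merged)
			continue;
		if (!strcmp(branches[i].upstream, branches[idx].name))
			return false;
	}
	return true;
}
```

With feature1 (merged, upstream origin/master) and feature2 (unmerged, upstream feature1), `can_prune()` refuses to delete feature1, so `@{u}` keeps resolving while feature2 is checked out.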
Thanks
Phillip
^ permalink raw reply
* Re: [PATCH] sequencer: Skip copying notes for commits that disappear during rebase
From: Uwe Kleine-König @ 2026-06-19 13:01 UTC (permalink / raw)
To: Phillip Wood; +Cc: Junio C Hamano, git
In-Reply-To: <67dbfb5c-5f07-49b8-aa32-a4635c585028@gmail.com>
Hello Phillip,
On Fri, Jun 19, 2026 at 11:13:32AM +0100, Phillip Wood wrote:
> I'm happy to take this forward and try and fix at least some of the other
> bugs I've listed above. Uwe - if I don't cc you on some patches within the
> next couple of weeks please feel free to send a reminder.
Much appreciated! Looking forward to testing your patches.
Best regards
Uwe
^ permalink raw reply
* Re: [PATCH v3 3/4] history: add squash subcommand to fold a range
From: Patrick Steinhardt @ 2026-06-19 12:55 UTC (permalink / raw)
To: Harald Nordgren via GitGitGadget; +Cc: git, Harald Nordgren
In-Reply-To: <66b2f49fb427c7328136b2d440dc7461b97fb4e0.1781810227.git.gitgitgadget@gmail.com>
On Thu, Jun 18, 2026 at 07:17:05PM +0000, Harald Nordgren via GitGitGadget wrote:
> diff --git a/builtin/history.c b/builtin/history.c
> index 305bde3102..9d9416870f 100644
> --- a/builtin/history.c
> +++ b/builtin/history.c
> @@ -973,6 +975,156 @@ out:
> return ret;
> }
>
> +/*
> + * Resolve a "<base>..<tip>" revision range into the base commit just outside
> + * the range (which becomes the parent of the squashed commit), the oldest
> + * commit contained in the range (whose message the squash reuses), and the
> + * range tip (whose tree becomes the result). A merge inside the range is fine,
> + * but the range must have a single base and must not reach a root commit.
> + */
> +static int resolve_squash_range(struct repository *repo,
> + const char *range,
> + struct commit **base_out,
> + struct commit **oldest_out,
> + struct commit **tip_out)
> +{
> + struct rev_info revs;
> + struct commit *commit, *base = NULL, *oldest = NULL, *tip = NULL;
> + struct strvec args = STRVEC_INIT;
> + int ret;
> +
> + repo_init_revisions(repo, &revs, NULL);
> + strvec_push(&args, "ignored");
> + strvec_push(&args, "--reverse");
> + strvec_push(&args, "--topo-order");
> + strvec_push(&args, "--boundary");
> + strvec_push(&args, range);
We don't have any kind of input verification for "range". So in theory,
the user could pass whatever string here, and this may or may not work.
Also, should we use "--ancestry-path" with the first commit of the range
here? Otherwise we may include commits that aren't descendants of A in a
range "A..B". If not I wonder whether we might see multiple boundaries
even though we would be able to resolve the boundary unambiguously in
some cases.
> + setup_revisions_from_strvec(&args, &revs, NULL);
> + if (args.nr != 1) {
> + ret = error(_("'%s' does not name a revision range"), range);
> + goto out;
> + }
> +
> + if (prepare_revision_walk(&revs) < 0) {
> + ret = error(_("error preparing revisions"));
> + goto out;
> + }
> +
> + while ((commit = get_revision(&revs))) {
> + if (commit->object.flags & BOUNDARY) {
> + if (base) {
> + ret = error(_("range '%s' has more than one base; "
> + "cannot squash"), range);
> + goto out;
> + }
> + base = commit;
> + continue;
> + }
> + if (!oldest)
> + oldest = commit;
> + tip = commit;
> + }
Hmm. I really wonder whether we should also restrict merges. It might be
somewhat obvious that intermediate merge commits should just be
discarded. But is that equally obvious for HEAD and the base commit?
> + if (!oldest) {
> + ret = error(_("the range '%s' is empty"), range);
> + goto out;
> + }
> +
> + if (!base) {
> + ret = error(_("cannot squash the root commit"));
> + goto out;
> + }
In theory we could, by squashing onto an empty tree. But it's fine to not
care about this edge case, we can still address it at a later point in
time if we ever feel the need to.
> + *base_out = base;
> + *oldest_out = oldest;
> + *tip_out = tip;
> + ret = 0;
> +
> +out:
> + reset_revision_walk();
> + release_revisions(&revs);
> + strvec_clear(&args);
> + return ret;
> +}
> +
> +static int cmd_history_squash(int argc,
> + const char **argv,
> + const char *prefix,
> + struct repository *repo)
> +{
> + const char * const usage[] = {
> + GIT_HISTORY_SQUASH_USAGE,
> + NULL,
> + };
> + enum ref_action action = REF_ACTION_DEFAULT;
> + enum commit_tree_flags flags = 0;
> + int dry_run = 0;
> + struct option options[] = {
> + OPT_CALLBACK_F(0, "update-refs", &action, "(branches|head)",
> + N_("control which refs should be updated"),
> + PARSE_OPT_NONEG, parse_ref_action),
> + OPT_BOOL('n', "dry-run", &dry_run,
> + N_("perform a dry-run without updating any refs")),
> + OPT_BIT(0, "reedit-message", &flags,
> + N_("open an editor to modify the commit message"),
> + COMMIT_TREE_EDIT_MESSAGE),
> + OPT_END(),
> + };
> + struct strbuf reflog_msg = STRBUF_INIT;
> + struct commit *base, *oldest, *tip, *rewritten;
> + const struct object_id *base_tree_oid, *tip_tree_oid;
> + struct commit_list *parents = NULL;
> + struct rev_info revs = { 0 };
> + int ret;
> +
> + argc = parse_options(argc, argv, prefix, options, usage, 0);
> + if (argc != 1) {
> + ret = error(_("command expects a single revision range"));
> + goto out;
> + }
> + repo_config(repo, git_default_config, NULL);
> +
> + if (action == REF_ACTION_DEFAULT)
> + action = REF_ACTION_BRANCHES;
> +
> + ret = resolve_squash_range(repo, argv[0], &base, &oldest, &tip);
> + if (ret < 0)
> + goto out;
> +
> + ret = setup_revwalk(repo, action, tip, &revs);
> + if (ret < 0)
> + goto out;
Oh, you already use `setup_revwalk()` here. Wouldn't that keep us from
accepting merge commits?
Patrick
^ permalink raw reply
* Re: [PATCH v3 0/4] history: add squash subcommand to fold a range
From: Patrick Steinhardt @ 2026-06-19 12:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Harald Nordgren via GitGitGadget, git, Harald Nordgren
In-Reply-To: <xmqqo6h7nza3.fsf@gitster.g>
On Thu, Jun 18, 2026 at 05:34:44PM -0700, Junio C Hamano wrote:
> "Harald Nordgren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Adds git history squash <revision-range> to fold a range of commits into its
> > oldest one, reusing that commit's message and replaying any descendants on
> > top.
>
> One thing that just occurred to me.
>
> When you have a linear history
>
> o---A---B---C
>
> you run "git history squash A..C" and come to
>
> o---X
>
> where the tree of X is the same as C, with the log message of A
> reused for it. That is simple, clean, and easy to explain.
>
> But what should happen to refs (i.e., branch head) that point at A
> or B?
It's a very good question. I had `git history squash` in my backlog for
a while, and this very question made me defer that topic repeatedly.
> I am addressing this message to Patrick as this question relates to
> the grand vision for the "git history" command. I think "git
> replay" wants to rewrite all the refs that are involved in the
> rewrite operation, while "git rebase" (without "--update-refs")
> wants to leave all other refs intact and update only the branch it
> was told to rewrite. Is it the same design as "rebase" and
> "--update-refs" controls if we update _other_ refs that happened to
> be in the range that are rewritten?
Yeah.
> Now, assuming that there does exist a mode where the command can
> update these refs that point into the history that got rewritten,
> there probably are at least two possibilities.
>
> On one hand, I think it is reasonable to _remove_ these refs that
> used to point at a section of history that disappeared (like the ones
> that were pointing at A or B). Perhaps A and B were pointed at by
> two branches or tags that were used to mark "up to this point things
> are broken" and "from here on things are fixed" (i.e., imagine a
> manual bisection). After squashing all of the commits in this
> section of history, the result no longer has such transition points.
I think just pruning references would be extremely surprising to our
users.
> It also is plausible that users may want these refs that used to
> point at A or B to point at X, just like the ref that used to point
> at C would now point at X, even though I cannot offhand think of a
> good story (like "there used to be transition points, now there
> aren't" I said above to explain why these refs should disappear) to
> support such a behaviour.
>
> Thoughts?
There are two more modes:

  - If a reference points at an intermediate commit then it stays there.

  - We detect this case and reject the update. Optionally, we may ask
    the user what they intend to do with those other refs.

It really is kind of ambiguous what is supposed to happen, and I can
think of different scenarios where each of the possibilities would be
the best choice. So ultimately, I think the last option is the best one,
as it also gives us a way to iterate.

If so, a user would already be able to achieve that other refs keep
pointing at X by saying `git history squash --update-refs=head`. The
other modes can then be added at a later point in time as the need
arises.

Patrick
* [PATCH v4 10/10] refs: drop local buffer in `refs_compute_filesystem_location()`
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
We're using a local buffer in `refs_compute_filesystem_location()` that
is only used so that we can fill it and then call `strbuf_realpath()` on
its result. This roundtrip isn't necessary though: `strbuf_realpath()`
already knows to use a single buffer as both input and output at the
same time. So all this does is to add a bit of confusion and an extra
memory allocation.
Drop the local buffer.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/refs.c b/refs.c
index f242e6ca96..582dbeff0a 100644
--- a/refs.c
+++ b/refs.c
@@ -3570,8 +3570,6 @@ void refs_compute_filesystem_location(const char *gitdir, const char *payload,
bool *is_worktree, struct strbuf *refdir,
struct strbuf *ref_common_dir)
{
- struct strbuf sb = STRBUF_INIT;
-
*is_worktree = get_common_dir_noenv(ref_common_dir, gitdir);
if (!payload) {
@@ -3585,8 +3583,8 @@ void refs_compute_filesystem_location(const char *gitdir, const char *payload,
}
if (!is_absolute_path(payload)) {
- strbuf_addf(&sb, "%s/%s", ref_common_dir->buf, payload);
- strbuf_realpath(ref_common_dir, sb.buf, 1);
+ strbuf_addf(ref_common_dir, "/%s", payload);
+ strbuf_realpath(ref_common_dir, ref_common_dir->buf, 1);
} else {
strbuf_realpath(ref_common_dir, payload, 1);
}
@@ -3599,6 +3597,4 @@ void refs_compute_filesystem_location(const char *gitdir, const char *payload,
BUG("worktree path does not contain slash");
strbuf_addf(refdir, "/worktrees/%s", wt_id + 1);
}
-
- strbuf_release(&sb);
}
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 09/10] refs: fix recursing `get_main_ref_store()` with "onbranch" config
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
When we have an "onbranch" condition we need to ask the reference
database whether HEAD currently points at the configured branch. This
unfortunately creates a chicken-and-egg problem:
- The reference database needs to read the configuration so that it
can configure itself.
- The configuration needs to construct a reference database to fully
parse all of its conditionals.
The way we handle this is by simply excluding "onbranch" conditionals
when we haven't yet configured the reference database.
The mechanism for this is broken though: to verify whether or not we
have configured the reference database we check whether its format is
set to `REF_STORAGE_FORMAT_UNKNOWN` in `include_by_branch()`. But typically,
the format _is_ already known at that time because we set it up during
repository discovery in "setup.c".
The consequence is that we recurse:
1. We call `get_main_ref_store()`.
2. We don't yet have a reference store, so we call `ref_store_init()`.
3. We parse the configuration required for the reference store.
4. We eventually end up in `include_by_branch()`.
5. We have already configured the reference storage format, so we end
up calling `get_main_ref_store()` again.
We still haven't finished (1) though, so `get_main_ref_store()` will now
call `ref_store_init()` a second time. The end result is that we have
constructed the same reference store twice.
Of course, as both reference stores would be assigned to `refs_private`,
we leak one of those two instances. This never surfaced as an actual
leak though because the pointer is kept alive by the "chdir_notify"
subsystem.
The mechanism to use the configured reference format is quite fragile in
the first place. Introduce a new mechanism that allows us to explicitly
skip evaluation of "onbranch" conditions and use it to fix the issue.
Add a sanity check in `get_main_ref_store()` to make sure we aren't
recursing, which would have failed before the fix.
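
The recursion and the guard described above can be illustrated with a
small standalone sketch. This is not the code from the patch: every
name prefixed with "sketch_" is hypothetical, the "skip_ref_conditions"
flag merely plays the role of `ignore_refs`, and recursion is recorded
in a field instead of calling `BUG()` so that the behaviour can be
observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lazily created reference store. */
struct sketch_store {
	int init_calls; /* how often the initializer ran */
};

struct sketch_repo {
	struct sketch_store storage; /* backing memory for the store */
	struct sketch_store *store;  /* NULL until fully initialized */
	bool skip_ref_conditions;    /* plays the role of "ignore_refs" */
	int recursion_detected;      /* recorded instead of calling BUG() */
};

struct sketch_store *sketch_get_store(struct sketch_repo *repo);

/* Plays the role of config parsing that may evaluate "onbranch". */
static void sketch_parse_config(struct sketch_repo *repo)
{
	if (repo->skip_ref_conditions)
		return; /* condition skipped, no store needed */
	sketch_get_store(repo); /* re-enters while the store is being built */
}

struct sketch_store *sketch_get_store(struct sketch_repo *repo)
{
	static bool initializing;

	if (repo->store)
		return repo->store;
	if (initializing) {
		/* We are still inside the first call: refuse to build twice. */
		repo->recursion_detected = 1;
		return NULL;
	}

	initializing = true;
	repo->storage.init_calls++;
	sketch_parse_config(repo);
	repo->store = &repo->storage;
	initializing = false;

	return repo->store;
}
```

With the flag unset the nested call trips the guard; with it set the
configuration pass never asks for the store, so the guard stays quiet.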
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
config.c | 4 +++-
config.h | 1 +
refs.c | 7 +++++++
refs/files-backend.c | 8 +++++++-
refs/reftable-backend.c | 8 +++++++-
5 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/config.c b/config.c
index a1b92fe083..223c252236 100644
--- a/config.c
+++ b/config.c
@@ -302,7 +302,9 @@ static int include_by_branch(struct config_include_data *data,
struct strbuf pattern = STRBUF_INIT;
const char *refname, *shortname;
- if (!data->repo || data->repo->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+ if (!data->repo ||
+ data->opts->ignore_refs ||
+ data->repo->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
return 0;
refname = refs_resolve_ref_unsafe(get_main_ref_store(data->repo),
diff --git a/config.h b/config.h
index bf47fb3afc..42aedde878 100644
--- a/config.h
+++ b/config.h
@@ -88,6 +88,7 @@ typedef int (*config_parser_event_fn_t)(enum config_event_t type,
struct config_options {
unsigned int respect_includes : 1;
unsigned int ignore_repo : 1;
+ unsigned int ignore_refs : 1;
unsigned int ignore_worktree : 1;
unsigned int ignore_cmdline : 1;
unsigned int system_gently : 1;
diff --git a/refs.c b/refs.c
index 5b773b1c15..f242e6ca96 100644
--- a/refs.c
+++ b/refs.c
@@ -2359,15 +2359,22 @@ void ref_store_release(struct ref_store *ref_store)
struct ref_store *get_main_ref_store(struct repository *r)
{
+ static bool initializing;
+
if (r->refs_private)
return r->refs_private;
if (!r->gitdir)
BUG("attempting to get main_ref_store outside of repository");
+ if (initializing)
+ BUG("main reference store creation is recursing");
+ initializing = true;
r->refs_private = ref_store_init(r, r->ref_storage_format,
r->gitdir, REF_STORE_ALL_CAPS);
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
+ initializing = false;
+
return r->refs_private;
}
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 79fb6735e1..ce29875cdd 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -141,6 +141,12 @@ static struct ref_store *files_ref_store_init(struct repository *repo,
const char *gitdir,
const struct ref_store_init_options *opts)
{
+ struct config_options config_opts = {
+ .respect_includes = 1,
+ .ignore_refs = 1,
+ .commondir = repo->commondir,
+ .git_dir = repo->gitdir,
+ };
struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
struct ref_store *ref_store = (struct ref_store *)refs;
struct strbuf ref_common_dir = STRBUF_INIT;
@@ -158,7 +164,7 @@ static struct ref_store *files_ref_store_init(struct repository *repo,
refs->store_flags = opts->access_flags;
refs->log_all_ref_updates = LOG_REFS_UNSET;
- repo_config(repo, files_ref_store_config, refs);
+ config_with_options(files_ref_store_config, refs, NULL, repo, &config_opts);
chdir_notify_register(NULL, files_ref_store_reparent, refs);
strbuf_release(&refdir);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index ee92bd9c70..05d4edc6fd 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -390,6 +390,12 @@ static struct ref_store *reftable_be_init(struct repository *repo,
const char *gitdir,
const struct ref_store_init_options *opts)
{
+ struct config_options config_opts = {
+ .respect_includes = 1,
+ .ignore_refs = 1,
+ .commondir = repo->commondir,
+ .git_dir = repo->gitdir,
+ };
struct reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
struct strbuf ref_common_dir = STRBUF_INIT;
struct strbuf refdir = STRBUF_INIT;
@@ -424,7 +430,7 @@ static struct ref_store *reftable_be_init(struct repository *repo,
refs->write_options.lock_timeout_ms = 100;
refs->log_all_ref_updates = LOG_REFS_UNSET;
- repo_config(repo, reftable_be_config, refs);
+ config_with_options(reftable_be_config, refs, NULL, repo, &config_opts);
/*
* It is somewhat unfortunate that we have to mirror the default block
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 08/10] refs/reftable-backend: manually parse "core.sharedRepository"
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
We're using `calc_shared_perm()` when creating a reftable repository.
This function internally uses `repo_settings_get_shared_repository()`,
which results in the same chicken-and-egg problem as mentioned in the
preceding commit.
Prepare for a fix by handling parsing of "core.sharedRepository"
manually in `reftable_be_config()` so that we have full control over how
exactly this configuration is read.
Note that this change requires a small reordering in "setup.c" when
creating the repository, as we only write "core.sharedRepository" into
the configuration after we've already created the reference database.
This is too late though now that we parse the value directly from the
configuration, so we have to reverse the order.
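
To illustrate the interface change, here is a simplified sketch of a
permission helper that takes the already-parsed setting instead of a
repository handle. The "sketch_" names are hypothetical, and the
reading that a negative setting replaces the permission bits while a
positive one only adds to them is an assumption based on the hunk
below. Any further bit adjustments the real `calc_shared_perm()` may
perform are left out.

```c
/*
 * Simplified stand-in for calc_shared_perm(): "shared" is an
 * already-parsed permission setting rather than a repository handle.
 * Assumption: a negative value replaces the permission bits, a
 * positive value only adds to them.
 */
int sketch_shared_perm(int shared, int mode)
{
	int tweak = shared < 0 ? -shared : shared;

	if (shared < 0)
		return (mode & ~0777) | tweak;
	return mode | tweak;
}

/* Default mode for new tables: 0666 filtered by a umask, then adjusted. */
int sketch_default_permissions(int shared, int mask)
{
	return sketch_shared_perm(shared, 0666 & ~mask);
}
```

Because the helper no longer looks anything up by itself, a config
callback can feed it the value it has just parsed without touching
the repository settings.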
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
path.c | 11 ++++++-----
path.h | 2 +-
refs/reftable-backend.c | 8 +++++++-
setup.c | 8 ++++----
4 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/path.c b/path.c
index d7e17bf174..c28b057374 100644
--- a/path.c
+++ b/path.c
@@ -736,11 +736,10 @@ char *interpolate_path(const char *path, int real_home)
return NULL;
}
-int calc_shared_perm(struct repository *repo,
- int mode)
+int calc_shared_perm(int shared_repo, int mode)
{
int tweak;
- int shared_repo = repo_settings_get_shared_repository(repo);
+
if (shared_repo < 0)
tweak = -shared_repo;
else
@@ -763,13 +762,15 @@ int adjust_shared_perm(struct repository *repo,
const char *path)
{
int old_mode, new_mode;
+ int shared_repository;
- if (!repo_settings_get_shared_repository(repo))
+ shared_repository = repo_settings_get_shared_repository(repo);
+ if (!shared_repository)
return 0;
if (get_st_mode_bits(path, &old_mode) < 0)
return -1;
- new_mode = calc_shared_perm(repo, old_mode);
+ new_mode = calc_shared_perm(shared_repository, old_mode);
if (S_ISDIR(old_mode)) {
/* Copy read bits to execute bits */
new_mode |= (new_mode & 0444) >> 2;
diff --git a/path.h b/path.h
index 0434ba5e07..1188dc4729 100644
--- a/path.h
+++ b/path.h
@@ -145,7 +145,7 @@ const char *git_path_shallow(struct repository *r);
int ends_with_path_components(const char *path, const char *components);
-int calc_shared_perm(struct repository *repo, int mode);
+int calc_shared_perm(int shared_repository, int mode);
int adjust_shared_perm(struct repository *repo, const char *path);
char *interpolate_path(const char *path, int real_home);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 5115a3f4ce..ee92bd9c70 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -362,6 +362,11 @@ static int reftable_be_config(const char *var, const char *value,
refs->write_options.lock_timeout_ms = lock_timeout;
} else if (!strcmp(var, "core.logallrefupdates")) {
refs->log_all_ref_updates = refs_parse_log_all_ref_updates_config(value);
+ } else if (!strcmp(var, "core.sharedrepository")) {
+ mode_t mask = umask(0);
+ umask(mask);
+ refs->write_options.default_permissions = calc_shared_perm(git_config_perm(var, value),
+ 0666 & ~mask);
}
return 0;
@@ -412,7 +417,8 @@ static struct ref_store *reftable_be_init(struct repository *repo,
default:
BUG("unknown hash algorithm %d", repo->hash_algo->format_id);
}
- refs->write_options.default_permissions = calc_shared_perm(repo, 0666 & ~mask);
+
+ refs->write_options.default_permissions = 0666 & ~mask;
refs->write_options.disable_auto_compact =
!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
refs->write_options.lock_timeout_ms = 100;
diff --git a/setup.c b/setup.c
index 0c6efb0560..03ff359070 100644
--- a/setup.c
+++ b/setup.c
@@ -2846,10 +2846,6 @@ int init_db(struct repository *repo,
reinit = create_default_files(repo, template_dir, original_git_dir,
&repo_fmt, init_shared_repository);
- if (!(flags & INIT_DB_SKIP_REFDB))
- create_reference_database(repo, initial_branch, flags & INIT_DB_QUIET);
- create_object_directory(repo);
-
if (repo_settings_get_shared_repository(repo)) {
char buf[10];
/* We do not spell "group" and such, so that
@@ -2871,6 +2867,10 @@ int init_db(struct repository *repo,
repo_config_set(repo, "receive.denyNonFastforwards", "true");
}
+ if (!(flags & INIT_DB_SKIP_REFDB))
+ create_reference_database(repo, initial_branch, flags & INIT_DB_QUIET);
+ create_object_directory(repo);
+
if (!(flags & INIT_DB_QUIET)) {
int len = strlen(git_dir);
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 07/10] refs: move parsing of "core.logAllRefUpdates" back into ref stores
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
In cc42c88945 (refs: extract out reflog config to generic layer,
2026-05-04) we have refactored how we parse "core.logAllRefUpdates" so
that it happens in the generic layer. Unfortunately, this has worsened a
preexisting issue where we may recurse when creating the reference store
because of a chicken-and-egg problem between parsing the configuration
and evaluating "onbranch" conditions.
Prepare for a fix by essentially reverting that change so that we handle
this setting in the respective backends again. The backends are already
parsing other configuration anyway, so by moving the logic back in there
we can ensure that all backend configuration is parsed the same way.
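
A standalone sketch of the per-backend parsing might look as follows.
The "sketch_" names are hypothetical, and the boolean parser is a
reduced stand-in for `git_config_bool()` that only knows a handful of
spellings, so treat it as an illustration rather than the real rules.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

enum sketch_log_refs {
	SKETCH_LOG_REFS_UNSET = -1,
	SKETCH_LOG_REFS_NONE = 0,
	SKETCH_LOG_REFS_NORMAL,
	SKETCH_LOG_REFS_ALWAYS
};

/* Reduced boolean parser; the real one accepts more spellings. */
static int sketch_parse_bool(const char *value)
{
	if (!value)
		return 1; /* a key without a value counts as true */
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcasecmp(value, "1");
}

enum sketch_log_refs sketch_parse_log_refs(const char *value)
{
	if (value && !strcasecmp(value, "always"))
		return SKETCH_LOG_REFS_ALWAYS;
	if (sketch_parse_bool(value))
		return SKETCH_LOG_REFS_NORMAL;
	return SKETCH_LOG_REFS_NONE;
}

struct sketch_backend {
	enum sketch_log_refs log_all_ref_updates;
};

/* Config callback: the field only changes when the key matches. */
int sketch_backend_config(const char *var, const char *value, void *payload)
{
	struct sketch_backend *be = payload;

	if (!strcmp(var, "core.logallrefupdates"))
		be->log_all_ref_updates = sketch_parse_log_refs(value);
	return 0;
}
```

A backend would start out with the field set to the "unset" value and
run the callback over its configuration, so an absent key leaves the
default untouched.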
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/checkout.c | 7 +++++--
refs.c | 10 +++++++++-
refs.h | 9 +++++++++
refs/files-backend.c | 20 +++++++++++++++++---
refs/refs-internal.h | 6 ------
refs/reftable-backend.c | 20 +++++++++++---------
repo-settings.c | 16 ----------------
repo-settings.h | 9 ---------
setup.c | 7 ++++++-
9 files changed, 57 insertions(+), 47 deletions(-)
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b78b3a1d16..aee84ca897 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -952,10 +952,13 @@ static void update_refs_for_switch(const struct checkout_opts *opts,
const char *old_desc, *reflog_msg;
if (opts->new_branch) {
if (opts->new_orphan_branch) {
- enum log_refs_config log_all_ref_updates =
- repo_settings_get_log_all_ref_updates(the_repository);
+ enum log_refs_config log_all_ref_updates = LOG_REFS_UNSET;
+ const char *value;
char *refname;
+ if (!repo_config_get_string_tmp(the_repository, "core.logallrefupdates", &value))
+ log_all_ref_updates = refs_parse_log_all_ref_updates_config(value);
+
refname = mkpathdup("refs/heads/%s", opts->new_orphan_branch);
if (opts->new_branch_log &&
!should_autocreate_reflog(log_all_ref_updates, refname)) {
diff --git a/refs.c b/refs.c
index d3caa9a633..5b773b1c15 100644
--- a/refs.c
+++ b/refs.c
@@ -1053,6 +1053,15 @@ static char *normalize_reflog_message(const char *msg)
return strbuf_detach(&sb, NULL);
}
+enum log_refs_config refs_parse_log_all_ref_updates_config(const char *value)
+{
+ if (value && !strcasecmp(value, "always"))
+ return LOG_REFS_ALWAYS;
+ else if (git_config_bool("core.logallrefupdates", value))
+ return LOG_REFS_NORMAL;
+ return LOG_REFS_NONE;
+}
+
int should_autocreate_reflog(enum log_refs_config log_all_ref_updates,
const char *refname)
{
@@ -2327,7 +2336,6 @@ static struct ref_store *ref_store_init(struct repository *repo,
struct ref_store *refs;
struct ref_store_init_options opts = {
.access_flags = flags,
- .log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo),
};
be = find_ref_storage_backend(format);
diff --git a/refs.h b/refs.h
index 71d5c186d0..a381022c77 100644
--- a/refs.h
+++ b/refs.h
@@ -146,6 +146,15 @@ enum ref_transaction_error refs_verify_refname_available(struct ref_store *refs,
int refs_ref_exists(struct ref_store *refs, const char *refname);
+enum log_refs_config {
+ LOG_REFS_UNSET = -1,
+ LOG_REFS_NONE = 0,
+ LOG_REFS_NORMAL,
+ LOG_REFS_ALWAYS
+};
+
+enum log_refs_config refs_parse_log_all_ref_updates_config(const char *value);
+
int should_autocreate_reflog(enum log_refs_config log_all_ref_updates,
const char *refname);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 296981584b..79fb6735e1 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -117,6 +117,21 @@ static void files_ref_store_reparent(const char *name UNUSED,
refs->gitcommondir = tmp;
}
+static int files_ref_store_config(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *payload)
+{
+ struct files_ref_store *refs = payload;
+
+ if (!strcmp(var, "core.prefersymlinkrefs")) {
+ refs->prefer_symlink_refs = git_config_bool(var, value);
+ } else if (!strcmp(var, "core.logallrefupdates")) {
+ refs->log_all_ref_updates = refs_parse_log_all_ref_updates_config(value);
+ }
+
+ return 0;
+}
+
/*
* Create a new submodule ref cache and add it to the internal
* set of caches.
@@ -141,10 +156,9 @@ static struct ref_store *files_ref_store_init(struct repository *repo,
refs->packed_ref_store =
packed_ref_store_init(repo, NULL, refs->gitcommondir, opts);
refs->store_flags = opts->access_flags;
- refs->log_all_ref_updates = opts->log_all_ref_updates;
-
- repo_config_get_bool(repo, "core.prefersymlinkrefs", &refs->prefer_symlink_refs);
+ refs->log_all_ref_updates = LOG_REFS_UNSET;
+ repo_config(repo, files_ref_store_config, refs);
chdir_notify_register(NULL, files_ref_store_reparent, refs);
strbuf_release(&refdir);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index a08d58900e..c3ac7b556f 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -406,12 +406,6 @@ struct ref_store;
struct ref_store_init_options {
/* The kind of operations that the ref_store is allowed to perform. */
unsigned int access_flags;
-
- /*
- * Denotes under what conditions reflogs should be created when updating
- * references.
- */
- enum log_refs_config log_all_ref_updates;
};
/*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 8c93070677..5115a3f4ce 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -332,34 +332,36 @@ static void fill_reftable_log_record(struct reftable_log_record *log, const stru
static int reftable_be_config(const char *var, const char *value,
const struct config_context *ctx,
- void *_opts)
+ void *payload)
{
- struct reftable_write_options *opts = _opts;
+ struct reftable_ref_store *refs = payload;
if (!strcmp(var, "reftable.blocksize")) {
unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
if (block_size > 16777215)
die("reftable block size cannot exceed 16MB");
- opts->block_size = block_size;
+ refs->write_options.block_size = block_size;
} else if (!strcmp(var, "reftable.restartinterval")) {
unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
if (restart_interval > UINT16_MAX)
die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
- opts->restart_interval = restart_interval;
+ refs->write_options.restart_interval = restart_interval;
} else if (!strcmp(var, "reftable.indexobjects")) {
- opts->skip_index_objects = !git_config_bool(var, value);
+ refs->write_options.skip_index_objects = !git_config_bool(var, value);
} else if (!strcmp(var, "reftable.geometricfactor")) {
unsigned long factor = git_config_ulong(var, value, ctx->kvi);
if (factor > UINT8_MAX)
die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
- opts->auto_compaction_factor = factor;
+ refs->write_options.auto_compaction_factor = factor;
} else if (!strcmp(var, "reftable.locktimeout")) {
int64_t lock_timeout = git_config_int64(var, value, ctx->kvi);
if (lock_timeout > LONG_MAX)
die("reftable lock timeout cannot exceed %"PRIdMAX, (intmax_t)LONG_MAX);
if (lock_timeout < 0 && lock_timeout != -1)
die("reftable lock timeout does not support negative values other than -1");
- opts->lock_timeout_ms = lock_timeout;
+ refs->write_options.lock_timeout_ms = lock_timeout;
+ } else if (!strcmp(var, "core.logallrefupdates")) {
+ refs->log_all_ref_updates = refs_parse_log_all_ref_updates_config(value);
}
return 0;
@@ -398,7 +400,6 @@ static struct ref_store *reftable_be_init(struct repository *repo,
base_ref_store_init(&refs->base, repo, refdir.buf, &refs_be_reftable);
strmap_init(&refs->worktree_backends);
- refs->log_all_ref_updates = opts->log_all_ref_updates;
refs->store_flags = opts->access_flags;
switch (repo->hash_algo->format_id) {
@@ -415,8 +416,9 @@ static struct ref_store *reftable_be_init(struct repository *repo,
refs->write_options.disable_auto_compact =
!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
refs->write_options.lock_timeout_ms = 100;
+ refs->log_all_ref_updates = LOG_REFS_UNSET;
- repo_config(repo, reftable_be_config, &refs->write_options);
+ repo_config(repo, reftable_be_config, refs);
/*
* It is somewhat unfortunate that we have to mirror the default block
diff --git a/repo-settings.c b/repo-settings.c
index 208e09ff17..f3be3b8c5a 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -177,22 +177,6 @@ void repo_settings_set_big_file_threshold(struct repository *repo, unsigned long
repo->settings.big_file_threshold = value;
}
-enum log_refs_config repo_settings_get_log_all_ref_updates(struct repository *repo)
-{
- const char *value;
-
- if (!repo_config_get_string_tmp(repo, "core.logallrefupdates", &value)) {
- if (value && !strcasecmp(value, "always"))
- return LOG_REFS_ALWAYS;
- else if (git_config_bool("core.logallrefupdates", value))
- return LOG_REFS_NORMAL;
- else
- return LOG_REFS_NONE;
- }
-
- return LOG_REFS_UNSET;
-}
-
int repo_settings_get_warn_ambiguous_refs(struct repository *repo)
{
prepare_repo_settings(repo);
diff --git a/repo-settings.h b/repo-settings.h
index cad9c3f0cc..e5253ead02 100644
--- a/repo-settings.h
+++ b/repo-settings.h
@@ -16,13 +16,6 @@ enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NOOP,
};
-enum log_refs_config {
- LOG_REFS_UNSET = -1,
- LOG_REFS_NONE = 0,
- LOG_REFS_NORMAL,
- LOG_REFS_ALWAYS
-};
-
struct repo_settings {
int initialized;
@@ -86,8 +79,6 @@ struct repo_settings {
void prepare_repo_settings(struct repository *r);
void repo_settings_clear(struct repository *r);
-/* Read the value for "core.logAllRefUpdates". */
-enum log_refs_config repo_settings_get_log_all_ref_updates(struct repository *repo);
/* Read the value for "core.warnAmbiguousRefs". */
int repo_settings_get_warn_ambiguous_refs(struct repository *repo);
/* Read the value for "core.hooksPath". */
diff --git a/setup.c b/setup.c
index 79125db565..0c6efb0560 100644
--- a/setup.c
+++ b/setup.c
@@ -2584,10 +2584,15 @@ static int create_default_files(struct repository *repo,
if (is_bare_repository())
repo_config_set(repo, "core.bare", "true");
else {
+ const char *value;
+
repo_config_set(repo, "core.bare", "false");
+
/* allow template config file to override the default */
- if (repo_settings_get_log_all_ref_updates(repo) == LOG_REFS_UNSET)
+ if (repo_config_get_string_tmp(repo, "core.logallrefupdates", &value) ||
+ refs_parse_log_all_ref_updates_config(value) == LOG_REFS_UNSET)
repo_config_set(repo, "core.logallrefupdates", "true");
+
if (needs_work_tree_config(original_git_dir, work_tree))
repo_config_set(repo, "core.worktree", work_tree);
}
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 06/10] repository: free main reference database
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
While we release worktree and submodule reference databases when
clearing a repository, we don't ever release the main reference
database. This memory leak went unnoticed because its pointer is
kept alive by the "chdir_notify" subsystem.
Fix the memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
repository.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/repository.c b/repository.c
index 187dd471c4..e2b5c6712b 100644
--- a/repository.c
+++ b/repository.c
@@ -421,6 +421,11 @@ void repo_clear(struct repository *repo)
FREE_AND_NULL(repo->remote_state);
}
+ if (repo->refs_private) {
+ ref_store_release(repo->refs_private);
+ FREE_AND_NULL(repo->refs_private);
+ }
+
strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
ref_store_release(e->value);
strmap_clear(&repo->submodule_ref_stores, 1);
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 05/10] chdir-notify: drop unused `chdir_notify_reparent()`
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
With the preceding commit we've removed all callers of
`chdir_notify_reparent()`, so the function is unused now. Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
chdir-notify.c | 26 --------------------------
chdir-notify.h | 6 +-----
2 files changed, 1 insertion(+), 31 deletions(-)
diff --git a/chdir-notify.c b/chdir-notify.c
index f8bfe3cbef..1237a45e2e 100644
--- a/chdir-notify.c
+++ b/chdir-notify.c
@@ -43,32 +43,6 @@ void chdir_notify_unregister(const char *name, chdir_notify_callback cb,
}
}
-static void reparent_cb(const char *name,
- const char *old_cwd,
- const char *new_cwd,
- void *data)
-{
- char **path = data;
- char *tmp = *path;
-
- if (!tmp)
- return;
-
- *path = reparent_relative_path(old_cwd, new_cwd, tmp);
- free(tmp);
-
- if (name) {
- trace_printf_key(&trace_setup_key,
- "setup: reparent %s to '%s'",
- name, *path);
- }
-}
-
-void chdir_notify_reparent(const char *name, char **path)
-{
- chdir_notify_register(name, reparent_cb, path);
-}
-
int chdir_notify(const char *new_cwd)
{
struct strbuf old_cwd = STRBUF_INIT;
diff --git a/chdir-notify.h b/chdir-notify.h
index 81eb69d846..36b4114472 100644
--- a/chdir-notify.h
+++ b/chdir-notify.h
@@ -19,10 +19,7 @@
* chdir_notify_register("description", foo, data);
*
* In practice most callers will want to move a relative path to the new root;
- * they can use the reparent_relative_path() helper for that. If that's all
- * you're doing, you can also use the convenience function:
- *
- * chdir_notify_reparent("description", &my_path);
+ * they can use the reparent_relative_path() helper for that.
*
* Whenever a chdir event occurs, that will update my_path (if it's relative)
* to adjust for the new cwd by freeing any existing string and allocating a
@@ -43,7 +40,6 @@ typedef void (*chdir_notify_callback)(const char *name,
void chdir_notify_register(const char *name, chdir_notify_callback cb, void *data);
void chdir_notify_unregister(const char *name, chdir_notify_callback cb,
void *data);
-void chdir_notify_reparent(const char *name, char **path);
/*
*
--
2.55.0.rc1.722.g2b3ac350e6.dirty
* [PATCH v4 04/10] refs: unregister reference stores from "chdir_notify"
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
When creating reference stores we register them with the "chdir_notify"
subsystem. This is required because some of the paths we track may be
relative paths, so we have to reparent them in case the current working
directory changes.
But while we register the reference stores, we never unregister them.
This can have multiple outcomes:
- For a repository's main reference database we essentially keep the
pointer alive. We never free that database, either, and our leak
checker doesn't notice because it's still registered.
- For submodule and worktree reference databases we do eventually free
them in `repo_clear()`, so we may keep pointers to free'd memory
registered. We never notice though as we don't tend to chdir around
in the middle of the process.
We never noticed either of these symptoms, but they are obviously bad.
Partially fix those issues by unregistering the reference stores when
releasing them. The leak of the main reference database will be fixed in
a subsequent commit.
Note that this requires us to use `chdir_notify_register()` instead of
`chdir_notify_reparent()`, as there is no infrastructure to unregister the
latter.
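
The register/unregister pairing can be sketched in isolation like
this. It is a toy registry with a fixed-size table, not the actual
"chdir_notify" implementation, and all "sketch_" names are
hypothetical. The point is only that an entry is identified by its
callback and payload, so the owner of the payload can remove it
again before the payload is freed.

```c
#include <stddef.h>

typedef void (*sketch_chdir_cb)(const char *old_cwd, const char *new_cwd,
				void *data);

struct sketch_entry {
	sketch_chdir_cb cb;
	void *data;
};

#define SKETCH_MAX_ENTRIES 16
static struct sketch_entry sketch_entries[SKETCH_MAX_ENTRIES];
static size_t sketch_nr;

/* Returns 0 on success, -1 if the fixed-size table is full. */
int sketch_register(sketch_chdir_cb cb, void *data)
{
	if (sketch_nr == SKETCH_MAX_ENTRIES)
		return -1;
	sketch_entries[sketch_nr].cb = cb;
	sketch_entries[sketch_nr].data = data;
	sketch_nr++;
	return 0;
}

/* Removes the first entry matching both callback and payload. */
void sketch_unregister(sketch_chdir_cb cb, void *data)
{
	for (size_t i = 0; i < sketch_nr; i++) {
		if (sketch_entries[i].cb != cb || sketch_entries[i].data != data)
			continue;
		/* Order is not preserved: the last entry fills the gap. */
		sketch_entries[i] = sketch_entries[--sketch_nr];
		return;
	}
}

void sketch_notify(const char *old_cwd, const char *new_cwd)
{
	for (size_t i = 0; i < sketch_nr; i++)
		sketch_entries[i].cb(old_cwd, new_cwd, sketch_entries[i].data);
}

/* Hypothetical store that counts how often it was asked to reparent. */
struct sketch_refdb {
	int notified;
};

static void sketch_refdb_reparent(const char *old_cwd, const char *new_cwd,
				  void *data)
{
	struct sketch_refdb *db = data;

	(void)old_cwd;
	(void)new_cwd;
	db->notified++;
}

void sketch_refdb_init(struct sketch_refdb *db)
{
	db->notified = 0;
	sketch_register(sketch_refdb_reparent, db);
}

void sketch_refdb_release(struct sketch_refdb *db)
{
	/* Drop the registration so no callback sees a stale pointer. */
	sketch_unregister(sketch_refdb_reparent, db);
}
```

After a release, later notifications no longer reach the released
object, which is the property the missing unregistration violated.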
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
refs/files-backend.c | 22 +++++++++++++++++++---
refs/packed-backend.c | 16 +++++++++++++++-
refs/reftable-backend.c | 16 +++++++++++++++-
3 files changed, 49 insertions(+), 5 deletions(-)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a4c7858787..296981584b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -100,6 +100,23 @@ static void clear_loose_ref_cache(struct files_ref_store *refs)
}
}
+static void files_ref_store_reparent(const char *name UNUSED,
+ const char *old_cwd,
+ const char *new_cwd,
+ void *payload)
+{
+ struct files_ref_store *refs = payload;
+ char *tmp;
+
+ tmp = reparent_relative_path(old_cwd, new_cwd, refs->base.gitdir);
+ free(refs->base.gitdir);
+ refs->base.gitdir = tmp;
+
+ tmp = reparent_relative_path(old_cwd, new_cwd, refs->gitcommondir);
+ free(refs->gitcommondir);
+ refs->gitcommondir = tmp;
+}
+
/*
* Create a new submodule ref cache and add it to the internal
* set of caches.
@@ -128,9 +145,7 @@ static struct ref_store *files_ref_store_init(struct repository *repo,
repo_config_get_bool(repo, "core.prefersymlinkrefs", &refs->prefer_symlink_refs);
- chdir_notify_reparent("files-backend $GIT_DIR", &refs->base.gitdir);
- chdir_notify_reparent("files-backend $GIT_COMMONDIR",
- &refs->gitcommondir);
+ chdir_notify_register(NULL, files_ref_store_reparent, refs);
strbuf_release(&refdir);
@@ -182,6 +197,7 @@ static void files_ref_store_release(struct ref_store *ref_store)
free(refs->gitcommondir);
ref_store_release(refs->packed_ref_store);
free(refs->packed_ref_store);
+ chdir_notify_unregister(NULL, files_ref_store_reparent, refs);
}
static void files_reflog_path(struct files_ref_store *refs,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 0acde48c45..499cb55dfa 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -211,6 +211,19 @@ static size_t snapshot_hexsz(const struct snapshot *snapshot)
return snapshot->refs->base.repo->hash_algo->hexsz;
}
+static void packed_ref_store_reparent(const char *name UNUSED,
+ const char *old_cwd,
+ const char *new_cwd,
+ void *payload)
+{
+ struct packed_ref_store *refs = payload;
+ char *tmp;
+
+ tmp = reparent_relative_path(old_cwd, new_cwd, refs->path);
+ free(refs->path);
+ refs->path = tmp;
+}
+
/*
* Since packed-refs is only stored in the common dir, don't parse the
* payload and rely on the files-backend to set 'gitdir' correctly.
@@ -229,7 +242,7 @@ struct ref_store *packed_ref_store_init(struct repository *repo,
strbuf_addf(&sb, "%s/packed-refs", gitdir);
refs->path = strbuf_detach(&sb, NULL);
- chdir_notify_reparent("packed-refs", &refs->path);
+ chdir_notify_register(NULL, packed_ref_store_reparent, refs);
return ref_store;
}
@@ -274,6 +287,7 @@ static void packed_ref_store_release(struct ref_store *ref_store)
clear_snapshot(refs);
rollback_lock_file(&refs->lock);
delete_tempfile(&refs->tempfile);
+ chdir_notify_unregister(NULL, packed_ref_store_reparent, refs);
free(refs->path);
}
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4ae22922de..8c93070677 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -365,6 +365,19 @@ static int reftable_be_config(const char *var, const char *value,
return 0;
}
+static void reftable_be_reparent(const char *name UNUSED,
+ const char *old_cwd,
+ const char *new_cwd,
+ void *payload)
+{
+ struct reftable_ref_store *refs = payload;
+ char *tmp;
+
+ tmp = reparent_relative_path(old_cwd, new_cwd, refs->base.gitdir);
+ free(refs->base.gitdir);
+ refs->base.gitdir = tmp;
+}
+
static struct ref_store *reftable_be_init(struct repository *repo,
const char *payload,
const char *gitdir,
@@ -447,7 +460,7 @@ static struct ref_store *reftable_be_init(struct repository *repo,
goto done;
}
- chdir_notify_reparent("reftables-backend $GIT_DIR", &refs->base.gitdir);
+ chdir_notify_register(NULL, reftable_be_reparent, refs);
done:
assert(refs->err != REFTABLE_API_ERROR);
@@ -474,6 +487,7 @@ static void reftable_be_release(struct ref_store *ref_store)
free(be);
}
strmap_clear(&refs->worktree_backends, 0);
+ chdir_notify_unregister(NULL, reftable_be_reparent, refs);
}
static int reftable_be_create_on_disk(struct ref_store *ref_store,
--
2.55.0.rc1.722.g2b3ac350e6.dirty
^ permalink raw reply related
* [PATCH v4 03/10] setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
When discovering a repository we eventually also apply the
"GIT_REFERENCE_BACKEND" environment variable to the repository. There
are two problems with that:
- We do this unconditionally, which is rather pointless: we really
only have to configure the repository when we have found one.
- We have already applied the repository format at that point in time,
so we need to manually reapply it.
Move the logic around so that we only apply the environment variable
when a repository was discovered. This also allows us to drop the
explicit call to `repo_set_ref_storage_format()` because we now adjust
the format before we apply it via `apply_repository_format()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
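As an illustration only (not part of the patch), here is a minimal sketch
of the new flow: the override is parsed into the format structure before
that structure is applied, and an unset variable leaves the configured
format alone. All `toy_*` names are hypothetical, and the
"<name>://<payload>" syntax is an assumption made for the example rather
than a statement about what `parse_reference_uri()` accepts.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical format ids; TOY_UNKNOWN plays the role of "unknown". */
enum toy_ref_format { TOY_UNKNOWN = 0, TOY_FILES, TOY_REFTABLE };

/* Hypothetical stand-in for the ref-related parts of the repository format. */
struct toy_format {
	enum toy_ref_format ref_format;
	char *payload; /* owned by the struct, may be NULL */
};

static enum toy_ref_format toy_format_by_name(const char *name)
{
	if (!strcmp(name, "files"))
		return TOY_FILES;
	if (!strcmp(name, "reftable"))
		return TOY_REFTABLE;
	return TOY_UNKNOWN;
}

/*
 * Override `fmt` from a "<name>://<payload>" string (assumed syntax).
 * A NULL `uri` means the variable is unset and `fmt` stays untouched.
 * Returns 0 on success and -1 on an unknown name or allocation failure.
 */
static int toy_override_from_env(struct toy_format *fmt, const char *uri)
{
	const char *sep;
	char *name;
	size_t len;
	enum toy_ref_format parsed;

	if (!uri)
		return 0;

	sep = strstr(uri, "://");
	len = sep ? (size_t)(sep - uri) : strlen(uri);
	name = malloc(len + 1);
	if (!name)
		return -1;
	memcpy(name, uri, len);
	name[len] = '\0';

	parsed = toy_format_by_name(name);
	free(name);
	if (parsed == TOY_UNKNOWN)
		return -1;

	/* Drop the payload that came from the config before replacing it. */
	free(fmt->payload);
	fmt->payload = NULL;
	if (sep) {
		size_t plen = strlen(sep + 3);

		fmt->payload = malloc(plen + 1);
		if (!fmt->payload)
			return -1;
		memcpy(fmt->payload, sep + 3, plen + 1);
	}
	fmt->ref_format = parsed;
	return 0;
}
```

A caller would run `toy_override_from_env()` only after a repository has
been found and before handing the structure to whatever applies it, which
is why no separate "reapply" step is needed afterwards.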
setup.c | 39 +++++++++++++++++++--------------------
1 file changed, 19 insertions(+), 20 deletions(-)
diff --git a/setup.c b/setup.c
index 2748155964..79125db565 100644
--- a/setup.c
+++ b/setup.c
@@ -1906,7 +1906,6 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
static struct strbuf cwd = STRBUF_INIT;
struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT, report = STRBUF_INIT;
const char *prefix = NULL;
- const char *ref_backend_uri;
struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
/*
@@ -2032,6 +2031,25 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
if (startup_info->have_repository) {
struct strbuf err = STRBUF_INIT;
+ const char *ref_backend_uri;
+
+ /*
+ * The env variable should override the repository config
+ * for 'extensions.refStorage'.
+ */
+ ref_backend_uri = getenv(GIT_REFERENCE_BACKEND_ENVIRONMENT);
+ if (ref_backend_uri) {
+ char *format;
+
+ free(repo_fmt.ref_storage_payload);
+
+ parse_reference_uri(ref_backend_uri, &format, &repo_fmt.ref_storage_payload);
+ repo_fmt.ref_storage_format = ref_storage_format_by_name(format);
+ if (repo_fmt.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+ die(_("unknown ref storage format: '%s'"), format);
+
+ free(format);
+ }
if (apply_repository_format(repo, &repo_fmt,
APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
@@ -2057,25 +2075,6 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
setenv(GIT_PREFIX_ENVIRONMENT, "", 1);
}
- /*
- * The env variable should override the repository config
- * for 'extensions.refStorage'.
- */
- ref_backend_uri = getenv(GIT_REFERENCE_BACKEND_ENVIRONMENT);
- if (ref_backend_uri) {
- char *backend, *payload;
- enum ref_storage_format format;
-
- parse_reference_uri(ref_backend_uri, &backend, &payload);
- format = ref_storage_format_by_name(backend);
- if (format == REF_STORAGE_FORMAT_UNKNOWN)
- die(_("unknown ref storage format: '%s'"), backend);
- repo_set_ref_storage_format(repo, format, payload);
-
- free(backend);
- free(payload);
- }
-
setup_original_cwd(repo);
strbuf_release(&dir);
--
2.55.0.rc1.722.g2b3ac350e6.dirty
^ permalink raw reply related
* [PATCH v4 02/10] setup: stop applying repository format twice
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
When discovering the repository in "setup.c" we apply the final
repository format multiple times:
- Once via `repository_format_configure()`, where we apply the hash
algorithm and ref storage format to both `struct repository_format`
and `struct repository`.
- And once via `apply_repository_format()`, where we apply these two
settings from `struct repository_format` to `struct repository`.
With the current flow both of these are in fact necessary. But this is
only because we call `repository_format_configure()` after we have
called `apply_repository_format()`. Consequently, if we only changed the
repository format in `repository_format_configure()` it would never
propagate to the repository.
Refactor the code so that we first configure the repository format
before applying it to the repository so that we can stop setting the
hash and reference storage format multiple times.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
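As an illustration only (not part of the patch), the ordering issue can
be reduced to a toy model. The `toy_*` names and the integer values are
hypothetical stand-ins: "configure" only touches the format structure and
"apply" is the single place that copies it into the repository, so
configuring after applying leaves the repository with stale values.

```c
/* Hypothetical stand-ins for `struct repository_format` and `struct repository`. */
struct toy_format { int hash_algo; int ref_format; };
struct toy_repo { int hash_algo; int ref_format; };

/* Adjusts only the format; it does not need to know about the repository. */
static void toy_configure(struct toy_format *fmt, int hash, int ref_format)
{
	if (hash)
		fmt->hash_algo = hash;
	if (ref_format)
		fmt->ref_format = ref_format;
}

/* The single place where the format is propagated to the repository. */
static void toy_apply(struct toy_repo *repo, const struct toy_format *fmt)
{
	repo->hash_algo = fmt->hash_algo;
	repo->ref_format = fmt->ref_format;
}

/* Returns the ref format the repository ends up with for a given call order. */
static int toy_init(int configure_first)
{
	struct toy_format fmt = { 1, 1 };
	struct toy_repo repo = { 0, 0 };

	if (configure_first) {
		toy_configure(&fmt, 2, 2);
		toy_apply(&repo, &fmt);
	} else {
		/* Old order: the change made by configure never reaches `repo`. */
		toy_apply(&repo, &fmt);
		toy_configure(&fmt, 2, 2);
	}
	return repo.ref_format;
}
```

With the configure step first, the repository sees the requested value
(2 in this sketch). With the old order it keeps the value read from disk
(1), unless configure also writes to the repository itself, which is the
duplication this patch removes.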
setup.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/setup.c b/setup.c
index a9db1f2c23..2748155964 100644
--- a/setup.c
+++ b/setup.c
@@ -2710,8 +2710,7 @@ static int read_default_format_config(const char *key, const char *value,
return ret;
}
-static void repository_format_configure(struct repository *repo,
- struct repository_format *repo_fmt,
+static void repository_format_configure(struct repository_format *repo_fmt,
int hash, enum ref_storage_format ref_format)
{
struct default_format_config cfg = {
@@ -2748,7 +2747,6 @@ static void repository_format_configure(struct repository *repo,
} else if (cfg.hash != GIT_HASH_UNKNOWN) {
repo_fmt->hash_algo = cfg.hash;
}
- repo_set_hash_algo(repo, repo_fmt->hash_algo);
env = getenv("GIT_DEFAULT_REF_FORMAT");
if (repo_fmt->version >= 0 &&
@@ -2786,9 +2784,6 @@ static void repository_format_configure(struct repository *repo,
free(backend);
}
-
- repo_set_ref_storage_format(repo, repo_fmt->ref_storage_format,
- repo_fmt->ref_storage_payload);
}
int init_db(struct repository *repo,
@@ -2830,10 +2825,10 @@ int init_db(struct repository *repo,
* is an attempt to reinitialize new repository with an old tool.
*/
check_repository_format_gently(repo_get_git_dir(repo), &repo_fmt, NULL);
+ repository_format_configure(&repo_fmt, hash, ref_storage_format);
if (apply_repository_format(repo, &repo_fmt, APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
die("%s", err.buf);
startup_info->have_repository = 1;
- repository_format_configure(repo, &repo_fmt, hash, ref_storage_format);
/*
* Ensure `core.hidedotfiles` is processed. This must happen after we
--
2.55.0.rc1.722.g2b3ac350e6.dirty
^ permalink raw reply related
* [PATCH v4 01/10] setup: inline `check_and_apply_repository_format()`
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260619-b4-pks-refs-avoid-chdir-notify-reparent-v4-0-a6472be7acc4@pks.im>
We have two callsites of `check_and_apply_repository_format()`. In a
subsequent commit we'll want to adapt one of those callsites to change
the order in which we read and apply the repository format, at which
point the helper function will not really be a good fit for us anymore.
Inline the function to both of the callsites.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
setup.c | 47 ++++++++++++++++-------------------------------
1 file changed, 16 insertions(+), 31 deletions(-)
diff --git a/setup.c b/setup.c
index b4652651df..a9db1f2c23 100644
--- a/setup.c
+++ b/setup.c
@@ -1788,32 +1788,6 @@ int apply_repository_format(struct repository *repo,
return 0;
}
-/*
- * Check the repository format version in the path found in repo_get_git_dir(repo),
- * and die if it is a version we don't understand. Generally one would
- * set_git_dir() before calling this, and use it only for "are we in a valid
- * repo?".
- *
- * If successful and fmt is not NULL, fill fmt with data.
- */
-static void check_and_apply_repository_format(struct repository *repo,
- struct repository_format *fmt,
- enum apply_repository_format_flags flags)
-{
- struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
- struct strbuf err = STRBUF_INIT;
-
- if (!fmt)
- fmt = &repo_fmt;
-
- check_repository_format_gently(repo_get_git_dir(repo), fmt, NULL);
- if (apply_repository_format(repo, fmt, flags, &err) < 0)
- die("%s", err.buf);
- startup_info->have_repository = 1;
-
- clear_repository_format(&repo_fmt);
-}
-
const char *enter_repo(struct repository *repo, const char *path, unsigned flags)
{
static struct strbuf validated_path = STRBUF_INIT;
@@ -1887,9 +1861,17 @@ const char *enter_repo(struct repository *repo, const char *path, unsigned flags
}
if (is_git_directory(".")) {
+ struct repository_format fmt = REPOSITORY_FORMAT_INIT;
+ struct strbuf err = STRBUF_INIT;
+
set_git_dir(repo, ".", 0);
- check_and_apply_repository_format(repo, NULL,
- APPLY_REPOSITORY_FORMAT_HONOR_ENV);
+ check_repository_format_gently(".", &fmt, NULL);
+ if (apply_repository_format(repo, &fmt, APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
+ die("%s", err.buf);
+ startup_info->have_repository = 1;
+
+ clear_repository_format(&fmt);
+ strbuf_release(&err);
return path;
}
@@ -2820,6 +2802,7 @@ int init_db(struct repository *repo,
int exist_ok = flags & INIT_DB_EXIST_OK;
char *original_git_dir = real_pathdup(git_dir, 1);
struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
+ struct strbuf err = STRBUF_INIT;
if (real_git_dir) {
struct stat st;
@@ -2846,9 +2829,10 @@ int init_db(struct repository *repo,
* config file, so this will not fail. What we are catching
* is an attempt to reinitialize new repository with an old tool.
*/
- check_and_apply_repository_format(repo, &repo_fmt,
- APPLY_REPOSITORY_FORMAT_HONOR_ENV);
-
+ check_repository_format_gently(repo_get_git_dir(repo), &repo_fmt, NULL);
+ if (apply_repository_format(repo, &repo_fmt, APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
+ die("%s", err.buf);
+ startup_info->have_repository = 1;
repository_format_configure(repo, &repo_fmt, hash, ref_storage_format);
/*
@@ -2904,6 +2888,7 @@ int init_db(struct repository *repo,
}
clear_repository_format(&repo_fmt);
+ strbuf_release(&err);
free(original_git_dir);
return 0;
}
--
2.55.0.rc1.722.g2b3ac350e6.dirty
^ permalink raw reply related
* [PATCH v4 00/10] refs: stop using `chdir_notify_reparent()`
From: Patrick Steinhardt @ 2026-06-19 11:27 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak, Jeff King, Justin Tobler
In-Reply-To: <20260610-b4-pks-refs-avoid-chdir-notify-reparent-v1-0-56c864b01c43@pks.im>
Hi,
this patch series is a follow-up of the discussion at [1]. It converts
the reference backends to always use absolute paths internally, which
then allows us to drop the calls to `chdir_notify_reparent()`.
Unfortunately, the series has grown quite a bit larger than anticipated.
This is due to a couple of weirdnesses in how the reference database is
constructed with an "onbranch" condition. We essentially construct the
refdb twice and lose one, but we never noticed because the chdir
notification subsystem kept the pointer to it reachable.
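To illustrate why a callback registry can hide such a leak, here is a
minimal sketch under stated assumptions. It is a toy model rather than
the actual "chdir-notify.c" code, and every `toy_*` name is hypothetical.
The registry stores the payload pointer, so an object that nobody else
references stays reachable until its owner unregisters it on release.

```c
#include <stddef.h>

typedef void (*toy_cb)(const char *old_cwd, const char *new_cwd, void *data);

#define TOY_MAX 8
static struct { toy_cb cb; void *data; } toy_entries[TOY_MAX];
static size_t toy_nr;

/* Remember (cb, data); the registry now holds a pointer to `data`. */
static int toy_register(toy_cb cb, void *data)
{
	if (toy_nr == TOY_MAX)
		return -1;
	toy_entries[toy_nr].cb = cb;
	toy_entries[toy_nr].data = data;
	toy_nr++;
	return 0;
}

/* Forget the entry matching both cb and data; returns 0 if one was found. */
static int toy_unregister(toy_cb cb, void *data)
{
	for (size_t i = 0; i < toy_nr; i++) {
		if (toy_entries[i].cb == cb && toy_entries[i].data == data) {
			/* Order is irrelevant here, so fill the hole with the last entry. */
			toy_entries[i] = toy_entries[--toy_nr];
			return 0;
		}
	}
	return -1;
}

/* Invoke every registered callback, as a chdir would. */
static void toy_notify(const char *old_cwd, const char *new_cwd)
{
	for (size_t i = 0; i < toy_nr; i++)
		toy_entries[i].cb(old_cwd, new_cwd, toy_entries[i].data);
}

/* Hypothetical store that merely counts how often it was reparented. */
struct toy_store { int reparented; };

static void toy_store_reparent(const char *old_cwd, const char *new_cwd,
			       void *data)
{
	struct toy_store *store = data;

	(void)old_cwd;
	(void)new_cwd;
	store->reparented++;
}
```

A store would call `toy_register()` at init time and `toy_unregister()`
in its release function. If the release is skipped, the registry still
points at the store, so a leak checker that only reports unreachable
memory has nothing to complain about.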
Note that the first couple patches that touch "setup.c" aren't strictly
required. They are a remnant of a previous iteration where I tried to
solve the issue in a different way. But I ultimately figured that these
changes are worth it by themselves as they simplify "setup.c" a bit.
This series is built on top of 1ff279f340 (The 13th batch, 2026-06-09)
with ps/setup-centralize-odb-creation at 42b9d3dc9d (setup: construct
object database in `apply_repository_format()`, 2026-06-04) merged into
it.
Changes in v4:
- Fix the "onbranch" recursion at the root of the problem by
explicitly disabling the use of the ref store when parsing
configuration at ref store initialization time.
- Link to v3: https://patch.msgid.link/20260618-b4-pks-refs-avoid-chdir-notify-reparent-v3-0-2a5669e8f486@pks.im
Changes in v3:
- Reduce the scope of applying the GIT_REFERENCE_BACKEND environment
variable even further so that we really only do this when we end up
applying the reference format.
- Fix a commit message that still referred to the dropped last commit.
- Link to v2: https://patch.msgid.link/20260615-b4-pks-refs-avoid-chdir-notify-reparent-v2-0-f4854aa99859@pks.im
Changes in v2:
- Drop the last patch. This seemingly destroys the whole purpose of
the patch series, but after Peff's hint that this is actually a
performance optimization I'm less inclined to drop the chdir_notify
infra. I still think that the remainder of the patches make sense
standalone, as they simplify "setup.c" and clean memory leaks. Going
forward I'd like to investigate the idea of introducing a `struct
fsroot` infrastructure that uses the platform-equivalent of openat
et al.
- Improve a couple of commit messages.
- Link to v1: https://patch.msgid.link/20260610-b4-pks-refs-avoid-chdir-notify-reparent-v1-0-56c864b01c43@pks.im
Thanks!
Patrick
[1]: <aifAVpxanV31KUpC@pks.im>
---
Patrick Steinhardt (10):
setup: inline `check_and_apply_repository_format()`
setup: stop applying repository format twice
setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
refs: unregister reference stores from "chdir_notify"
chdir-notify: drop unused `chdir_notify_reparent()`
repository: free main reference database
refs: move parsing of "core.logAllRefUpdates" back into ref stores
refs/reftable-backend: manually parse "core.sharedRepository"
refs: fix recursing `get_main_ref_store()` with "onbranch" config
refs: drop local buffer in `refs_compute_filesystem_location()`
builtin/checkout.c | 7 ++-
chdir-notify.c | 26 ------------
chdir-notify.h | 6 +--
config.c | 4 +-
config.h | 1 +
path.c | 11 ++---
path.h | 2 +-
refs.c | 25 ++++++++---
refs.h | 9 ++++
refs/files-backend.c | 48 ++++++++++++++++++---
refs/packed-backend.c | 16 ++++++-
refs/refs-internal.h | 6 ---
refs/reftable-backend.c | 50 +++++++++++++++++-----
repo-settings.c | 16 -------
repo-settings.h | 9 ----
repository.c | 5 +++
setup.c | 110 +++++++++++++++++++++---------------------------
17 files changed, 192 insertions(+), 159 deletions(-)
Range-diff versus v3:
1: 3ac83ba983 = 1: 3ae112f84b setup: inline `check_and_apply_repository_format()`
2: b6b15770eb = 2: d03fb25a01 setup: stop applying repository format twice
3: 5850f0602d = 3: f437af7ce6 setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
4: e4b12483b4 = 4: 7704b7e5db refs: unregister reference stores from "chdir_notify"
5: 4a78c5080a = 5: 545fe82dda chdir-notify: drop unused `chdir_notify_reparent()`
6: 3f8ae36acc = 6: 5ac9f8c2b3 repository: free main reference database
7: 2a22f9a2e0 < -: ---------- refs: fix recursing `get_main_ref_store()` with "onbranch" config
-: ---------- > 7: 0482470af1 refs: move parsing of "core.logAllRefUpdates" back into ref stores
-: ---------- > 8: 1b2f9d4ff9 refs/reftable-backend: manually parse "core.sharedRepository"
-: ---------- > 9: c7ec7d887f refs: fix recursing `get_main_ref_store()` with "onbranch" config
8: 6bc943659d = 10: 5fb782268b refs: drop local buffer in `refs_compute_filesystem_location()`
---
base-commit: 255322df35357168daefec8523a3cdc849edd6c1
change-id: 20260609-b4-pks-refs-avoid-chdir-notify-reparent-a4eaf1edbcab
^ permalink raw reply
* Re: [PATCH] sequencer: Skip copying notes for commits that disappear during rebase
From: Phillip Wood @ 2026-06-19 10:13 UTC (permalink / raw)
To: Uwe Kleine-König, Junio C Hamano; +Cc: git, Phillip Wood
In-Reply-To: <ajKimV1TDCgE-GzK@monoceros>
Hi Uwe and Junio
On 17/06/2026 14:58, Uwe Kleine-König wrote:
>
>> It is not yet clear to me if we want to _always_ discard a note from
>> a commit that would become "empty" during a rebase session (in other
>> words, a commit that becomes empty during a rebase is _always_ a
>> sign that the change it brings in is _already_ in the new base of
>> the rebase
>
> Yeah, or in a patch that was picked before.
>
>> and the necessary information the note wanted to carry to
>> the target branch is there without need to _duplicate_ it by copying
>> the note). But assuming that we want the behaviour, the code change
>> to sequencer.c looks very reasonable to me, except for one thing that
>> I am not clear about.
>
> I think given the commit goes away, it's natural that the note goes
> away, too. And to come back to your question above: I think it doesn't
> need documentation, that if a commit disappears its notes go away, too.
> But that might be subjective?!
I tend to agree with this - if we're throwing away the commit message
without asking the user I think it makes sense to do the same for the
notes. We have "--empty=ask" if the user does not want commits that
become empty to be automatically discarded.
>>> diff --git a/sequencer.c b/sequencer.c
>>> index 57855b0066ac..da2185a37c5d 100644
>>> --- a/sequencer.c
>>> +++ b/sequencer.c
>>> ...
>>> @@ -4965,7 +4965,7 @@ static int pick_one_commit(struct repository *r,
>>> return error_with_patch(r, commit,
>>> arg, item->arg_len, opts, res, !res);
>>> }
>>> - if (is_rebase_i(opts) && !res)
>>> + if (is_rebase_i(opts) && !res && !dropped_commit)
>>> record_in_rewritten(&item->commit->object.oid,
>>> peek_command(todo_list, 1));
>>
>> If we have a sequence of commits where a commit that was *not*
>> dropped is followed by a fixup commit that *is* dropped (e.g.,
>> because it became empty/redundant), wouldn't it prevent the
>> previously pending commit from being flushed to skip
>> `record_in_rewritten` entirely for the dropped fixup commit?
That's a good point - we should call flush_rewritten_pending() in that
case. Looking at the code there are some other bugs related to dropping
commits either because they become empty or the user runs "git rebase
--skip":
- If we drop the final fixup we don't clean up the commit message
- If we drop an "edit" command then "git rebase --continue" records it
as being rewritten HEAD so we'll copy the notes to the wrong commit
- Running "git rebase --skip" causes the commit that had conflicts
to also be recorded as being rewritten to HEAD leading to the
same issue.
> Huh, sounds possible. I wonder if that makes the change so complicated
> that my time isn't well spent working on that given that I'm not used to
> git's source code and it's better addressed by someone with deeper
> knowledge. Sounds as if we need a state signaling "Current commit is
> done".
I'm happy to take this forward and try and fix at least some of the
other bugs I've listed above. Uwe - if I don't cc you on some patches
within the next couple of weeks please feel free to send a reminder.
Thanks
Phillip
>> Wouldn't it map the note for `X` to rewritten `C`?
>>
>>> diff --git a/t/t3322-notes-rebase.sh b/t/t3322-notes-rebase.sh
>>> new file mode 100755
>>> index 000000000000..0eddde7f9961
>>> --- /dev/null
>>> +++ b/t/t3322-notes-rebase.sh
>>> @@ -0,0 +1,37 @@
>>> +#!/bin/sh
>>> +
>>> +test_description='Test notes on rebase'
>>> +
>>> +. ./test-lib.sh
>>> +
>>> +test_expect_success setup '
>>> + git init &&
>>> + git config notes.rewriteRef refs/notes/commits &&
>>> + git version > version &&
>>> + echo A > A &&
>>
>> Style. In our codebase, redirection operator sticks to the
>> redirection target without SP in between, i.e.
>>
>> git version >version &&
>> echo A >A &&
>>
>>> + git notes add -m "This is B" @ &&
>>
>> '@' is hard to read; when you refer to HEAD, please write HEAD.
>>
>>
>>> +test_expect_success 'rebase B + C on top of BD' '
>>> + git rebase @ master
>>> +'
>>> +
>>> +test_expect_success 'assert there is no note on BD' '
>>> + if git notes list branch >/tmp/lalaa; then return 1; fi
>>> +'
>>
>> Do not step outside of $TRASH_DIRECTORY without a good reason.
>
> Oh, that is a debug thing that shouldn't have made it into the patch.
>
>> Style. In our codebase, shell scripts do not use ';' and are written
>> more like
>>
>> if git notes list branch >notes-list
>> then
>> return 1
>> fi
>>
>> But more importantly, if you want to make sure the command makes a
>> controlled exit (not crash), use
>>
>> test_must_fail git notes list branch
>
> Ah, I really wondered if I'm missing something because it should be
> easier to say "this command should fail".
>
> Best regards
> Uwe
^ permalink raw reply