* Re: [PATCH v3 0/4] Add support for an external command for fetching notes
From: Siddh Raman Pant @ 2026-06-22 4:45 UTC (permalink / raw)
To: git@vger.kernel.org
Cc: oswald.buddenhagen@gmx.de, gitster@pobox.com,
code@khaugsbakk.name, j6t@kdbg.org, peff@peff.net, ps@pks.im,
sandals@crustytoothpaste.net, newren@gmail.com
In-Reply-To: <d266c22f90d7140d14fe5dd84d91601d8fad7d73.camel@oracle.com>
[-- Attachment #1: Type: text/plain, Size: 238 bytes --]
Pinging again...
Thanks,
Siddh
On Tue, Jun 16 2026 at 16:29:36 +0530, Siddh Raman Pant wrote:
> Ping...
>
> Thread link:
> https://lore.kernel.org/git/cover.1779532562.git.siddh.raman.pant@oracle.com/
>
> Thanks,
> Siddh
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [RFH] Why do osx CI jobs so unreliable?
From: Patrick Steinhardt @ 2026-06-22 4:42 UTC (permalink / raw)
To: Jeff King; +Cc: Michael Montalbo, git, Junio C Hamano
In-Reply-To: <20260621213407.GC2297179@coredump.intra.peff.net>
On Sun, Jun 21, 2026 at 05:34:07PM -0400, Jeff King wrote:
> On Sat, Jun 20, 2026 at 08:33:13AM -0700, Michael Montalbo wrote:
>
> > Patrick Steinhardt <ps@pks.im> writes:
> > > So I strongly suspect that it most be one of the t555* tests.
> > > [...]
> > > Maybe this is something that's specific to GitHub's environment...
> >
> > I think you're right it's t5551/t5559. The runs Junio linked:
> >
> > osx-clang cancelled 360min
> > osx-gcc cancelled 360min
> > osx-reftable success 35min
> > osx-meson success 61min
> >
> > All four run the same t5551/t5559 under EXPENSIVE. The two that
> > finished differ in just two ways, which look like the levers:
> > osx-reftable generates the 100k-ref advertisement in ~24ms vs ~1.2s
> > for loose refs on macOS (so much less time mid-response), and
> > osx-meson runs tests at nproc while the prove jobs hardcode --jobs=10
> > on a 3-core runner (over recent master/next the prove jobs hang ~40%,
> > meson ~10%).
>
> If the problem is a racy deadlock, there is a reasonable chance that
> some jobs may simply be lucky. Even if things like packing refs help, I
> suspect the problem may still be lurking. Maybe I'm just a pessimist,
> though. ;)
I had the same thought.
> > When it is wedged the whole chain sits at 0% CPU. upload-pack is
> > blocked in write() on the ls-refs advertisement, curl blocked in
> > select(). So it looks like an HTTP/2 flow-control stall on the
> > response side. The same stall resets itself after ~60-85s on my Linux
> > box and on a bare-metal Mac, but not on the GitHub runner; I haven't
> > pinned down why yet.
>
> We had some HTTP/2 stalls/deadlocks in the past, and they were dependent
> on libcurl and apache (actually h2_mod) versions. IIRC some of the
> non-TLS code paths for HTTP/2 were not well tested, which led to
> 8f2146dbf1 (t5559: make SSL/TLS the default, 2023-02-23). Of course
> after that commit those cleartext code paths should not be a problem, so
> that is probably not exactly the issue now.
>
> But it might be worth checking the versions you're running locally
> versus what's in the GitHub runner.
I didn't observe any similar hangs in GitLab's CI systems, so I wonder
whether this is because of different versions of curl. And indeed we use
different versions:
- On GitHub we use 8.6.0.
- On GitLab we use 8.7.1.
Now this of course doesn't mean that updating the curl version is the
fix to this whole issue, as there's a ton of other factors that could
play a role in whether or not the test hangs. So while we could just
upgrade parts of the stack and cross our fingers, but that feels rather
unsatisfactory. Still, one place to start could be to update our build
images to macOS 15.
But the big question to me is whether the hang is because of a bug in
Git with how we drive curl, a bug in curl itself, or a bug in Apache.
Patrick
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Junio C Hamano @ 2026-06-21 23:17 UTC (permalink / raw)
To: Jeff King; +Cc: D. Ben Knoble, Git
In-Reply-To: <20260621212805.GB2297179@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> But at the point that we are comparing nanoseconds, I don't think we
> even need to bother with the delay. It takes maybe 5 seconds to write
> out all of the linux.git files and then the final index. So ~20% of
> those files will have the same timestamp as the index. With nanosecond
> resolution, we'd expect that to drop by an order of a billion. Even if
> we get unlucky and have a single file with the same timestamp, that is
> not so bad.
>
> The code to do the nanosecond compare is already there! But it's gated
> on USE_NSEC. So this (plus a bonus debugging trace ;) ):
> ...
> if (!changed && is_racy_timestamp(istate, ce)) {
> + warning("%s is racy", ce->name);
> if (assume_racy_is_modified)
> changed |= DATA_CHANGED;
> else
>...
> makes the problem go away. I'm not sure if I'm missing some case where
> we could be bitten by the problem that led to making USE_NSEC
> conditional, though.
That's cute.
^ permalink raw reply
* [PATCH v3 4/4] pack-objects: support `--delta-islands` with `--path-walk`
From: Taylor Blau @ 2026-06-21 23:03 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <cover.1782082975.git.me@ttaylorr.com>
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island that
reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on commits,
exactly as `show_commit()` does.
We have to be careful about the order in which we call this function,
and we must see a commit before its parents in order to have
island marks to propagate.
The path-walk batch preserves that order. Path-walk appends commits
to its `OBJ_COMMIT` batch as they come back from the same
`get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip island propagation for excluded commits to match the regular
traversal, whose `show_commit()` callback is only invoked for
interesting commits. Boundary commits may still be present in
path-walk's callback so they can serve as thin-pack bases, but they
should not contribute island marks.
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the max-depth-wins behavior
when a tree is reached via more than one path. The helper accepts
both the `show_object()` path shape ("foo", "foo/bar") and the
path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` in
increasing-depth order before propagating marks down, so that a
parent tree's marks are finalized before its children inherit them.
Without recording the depth at path-walk time, every
path-walk-discovered tree would land at depth 0 in `to_pack`, the
sort would lose its ordering, and children could inherit marks from
parents whose own contributions had not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.
Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 14 +++++++-------
builtin/pack-objects.c | 22 ++++++++++++++++++----
t/t5320-delta-islands.sh | 29 +++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 0adce8961a3..65cd00c152f 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,13 +402,13 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. When `--use-bitmap-index` is
-specified with `--path-walk`, a successful bitmap traversal is used for
-object enumeration, with path-walk remaining as the fallback traversal
-when the bitmap cannot satisfy the request. The `--path-walk` option
-supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
-types can be combined with the `combine:<spec>+<spec>` form.
+When `--use-bitmap-index` is specified with `--path-walk`, a successful
+bitmap traversal is used for object enumeration, with path-walk
+remaining as the fallback traversal when the bitmap cannot satisfy the
+request. The `--path-walk` option supports the `--filter=<spec>` forms
+`blob:none`, `blob:limit=<n>`, `tree:0`, `object:type=<type>`, and
+`sparse:<oid>`. These supported filter types can be combined with the
+`combine:<spec>+<spec>` form.
DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec02e2b21d2..f48ea7a888b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4737,13 +4737,29 @@ static int add_objects_by_path(const char *path,
add_object_entry(oid, type, path, exclude);
- if (type == OBJ_COMMIT && write_bitmap_index) {
+ if (type == OBJ_COMMIT) {
struct commit *commit;
+ if (!write_bitmap_index && !use_delta_islands)
+ continue;
+
commit = lookup_commit(the_repository, oid);
if (!commit)
die(_("could not find commit %s"), oid_to_hex(oid));
- index_commit_for_bitmap(commit);
+ if (write_bitmap_index)
+ index_commit_for_bitmap(commit);
+ /*
+ * Skip island propagation for boundary commits.
+ * The regular traversal's show_commit() is only
+ * called for interesting commits; matching that
+ * here keeps path-walk from doing extra work that
+ * would only be a no-op anyway (boundary commits
+ * are not in island_marks).
+ */
+ if (use_delta_islands && !exclude)
+ propagate_island_marks(the_repository, commit);
+ } else if (type == OBJ_TREE && use_delta_islands) {
+ record_tree_depth(oid, path);
}
}
@@ -5205,8 +5221,6 @@ int cmd_pack_objects(int argc,
const char *option = NULL;
if (!path_walk_filter_compatible(&filter_options))
option = "--filter";
- else if (use_delta_islands)
- option = "--delta-islands";
if (option) {
warning(_("cannot use %s with %s"),
diff --git a/t/t5320-delta-islands.sh b/t/t5320-delta-islands.sh
index 2c961c70963..9b28344a0a3 100755
--- a/t/t5320-delta-islands.sh
+++ b/t/t5320-delta-islands.sh
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
! is_delta_base $two $one
'
+test_expect_success 'path-walk island repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
+ git -c "pack.island=refs/heads/(.*)" repack -adfi \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-islands &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk island bitmap repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
+ git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-island-bitmap &&
+ test_path_is_file .git/objects/pack/*.bitmap &&
+ git rev-list --test-bitmap --use-bitmap-index one &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk same island allows delta' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
+ git -c "pack.island=refs/heads" repack -adfi --path-walk &&
+ test_region pack-objects path-walk trace.path-walk-same-island &&
+ is_delta_base $one $two
+'
+
test_expect_success 'same island allows delta' '
git -c "pack.island=refs/heads" repack -adfi &&
is_delta_base $one $two
--
2.54.0.23.g371fc4317ad
^ permalink raw reply related
* [PATCH v3 3/4] pack-objects: extract `record_tree_depth()` helper
From: Taylor Blau @ 2026-06-21 23:03 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <cover.1782082975.git.me@ttaylorr.com>
Prepare for a subsequent change that needs to record tree depths from a
second call site by factoring the delta-islands tree-depth bookkeeping
out of `show_object()` and into a helper, `record_tree_depth()`.
The helper looks up the object in `to_pack`, returns early when the
object was not added there, computes the depth from the slash count in
the supplied name, and preserves the existing max-depth-wins behavior
when a tree is reached by more than one path.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e4dcb563b7d..ec02e2b21d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2722,6 +2722,22 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
pack->tree_depth[e - pack->objects] = tree_depth;
}
+static void record_tree_depth(const struct object_id *oid, const char *name)
+{
+ const char *p;
+ unsigned depth;
+ struct object_entry *ent;
+
+ /* the empty string is a root tree, which is depth 0 */
+ depth = *name ? 1 : 0;
+ for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+ depth++;
+
+ ent = packlist_find(&to_pack, oid);
+ if (ent && depth > oe_tree_depth(&to_pack, ent))
+ oe_set_tree_depth(&to_pack, ent, depth);
+}
+
/*
* Return the size of the object without doing any delta
* reconstruction (so non-deltas are true object sizes, but deltas
@@ -4375,20 +4391,8 @@ static void show_object(struct object *obj, const char *name,
add_preferred_base_object(name);
add_object_entry(&obj->oid, obj->type, name, 0);
- if (use_delta_islands) {
- const char *p;
- unsigned depth;
- struct object_entry *ent;
-
- /* the empty string is a root tree, which is depth 0 */
- depth = *name ? 1 : 0;
- for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
- depth++;
-
- ent = packlist_find(&to_pack, &obj->oid);
- if (ent && depth > oe_tree_depth(&to_pack, ent))
- oe_set_tree_depth(&to_pack, ent, depth);
- }
+ if (use_delta_islands)
+ record_tree_depth(&obj->oid, name);
}
static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
--
2.54.0.23.g371fc4317ad
^ permalink raw reply related
* [PATCH v3 2/4] pack-objects: support reachability bitmaps with `--path-walk`
From: Taylor Blau @ 2026-06-21 23:03 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <cover.1782082975.git.me@ttaylorr.com>
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.
This behavior dates back to 70664d2865c (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
In fb2c309b7d3 (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:
* A path-walk repack that writes bitmaps does not give the bitmap
selector any commits, because path-walk reveals commits through
`add_objects_by_path()` rather than through `show_commit()`, where
`index_commit_for_bitmap()` is normally called.
* An invocation like "git pack-objects --use-bitmap-index --path-walk"
never tries an existing bitmap, even when one is available and could
answer the request.
Fortunately for us, neither restriction is required.
* On the writing side: teach the path-walk object callback to call
`index_commit_for_bitmap()` for commits that it adds to the pack.
That gives the bitmap selector the commit candidates it would have
seen from the regular traversal.
* For bitmap reading, keep passing '--objects' to the internal rev_list
machinery, but stop clearing `use_bitmap_index`. If an existing
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
As a result, we can see significantly reduced pack generation times from
p5311 (with our `GIT_PERF_REPO` set to a recent clone of the fluentui
repository) before this commit:
Test HEAD^ HEAD
----------------------------------------------------------------------------------------
5311.40: server (1 days, --path-walk) 1.43(1.39+0.04) 0.01(0.01+0.00) -99.3%
5311.41: size (1 days, --path-walk) 139.6K 139.7K +0.0%
5311.42: client (1 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
5311.44: server (2 days, --path-walk) 1.43(1.39+0.04) 0.01(0.00+0.00) -99.3%
5311.45: size (2 days, --path-walk) 139.6K 139.7K +0.0%
5311.46: client (2 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
5311.48: server (4 days, --path-walk) 1.44(1.39+0.04) 0.01(0.01+0.00) -99.3%
5311.49: size (4 days, --path-walk) 238.1K 238.1K +0.0%
5311.50: client (4 days, --path-walk) 0.03(0.03+0.00) 0.03(0.03+0.00) +0.0%
5311.52: server (8 days, --path-walk) 1.43(1.39+0.03) 0.01(0.00+0.00) -99.3%
5311.53: size (8 days, --path-walk) 344.9K 344.9K +0.0%
5311.54: client (8 days, --path-walk) 0.07(0.07+0.00) 0.07(0.08+0.00) +0.0%
5311.56: server (16 days, --path-walk) 1.47(1.44+0.03) 0.10(0.08+0.01) -93.2%
5311.57: size (16 days, --path-walk) 844.0K 844.0K +0.0%
5311.58: client (16 days, --path-walk) 0.09(0.09+0.00) 0.09(0.09+0.00) +0.0%
5311.60: server (32 days, --path-walk) 1.52(1.50+0.05) 0.14(0.15+0.02) -90.8%
5311.61: size (32 days, --path-walk) 4.2M 4.2M +0.1%
5311.62: client (32 days, --path-walk) 0.34(0.48+0.02) 0.34(0.45+0.05) +0.0%
5311.64: server (64 days, --path-walk) 1.55(1.52+0.06) 0.15(0.15+0.04) -90.3%
5311.65: size (64 days, --path-walk) 6.4M 6.4M -0.0%
5311.66: client (64 days, --path-walk) 0.51(0.79+0.05) 0.51(0.80+0.06) +0.0%
5311.68: server (128 days, --path-walk) 1.59(1.57+0.06) 0.16(0.21+0.01) -89.9%
5311.69: size (128 days, --path-walk) 8.4M 8.4M -0.0%
5311.70: client (128 days, --path-walk) 0.72(1.44+0.08) 0.71(1.47+0.09) -1.4%
We get the same size of output pack, but this commit allows us to do so
in a significantly shorter amount of time. Intuitively, we're generating
the same pack (hence the unchanged 'test_size' output from run to run),
but varying how we get there. Before this commit, pack-objects prefers
'--path-walk' to '--use-bitmap-index', so we generate the output pack by
performing a normal '--path-walk' traversal. With this commit, we are
operating over a *repacked* state (that itself was done with a
'--path-walk' traversal), but are able to perform pack-reuse on that
repacked state via bitmaps.
When comparing the size of the repacked pack with/without '--path-walk'
on the previous commit versus this one, we see that (a) the repacked size
improves significantly with '--path-walk', and that (b) writing bitmaps
during repacking does not regress this improvement:
Test HEAD^ HEAD
----------------------------------------------------------------------------------------
5311.3: size of bitmapped pack 558.4M 558.5M +0.0%
5311.38: size of bitmapped pack (--path-walk) 164.4M 164.4M +0.0%
(Note that to observe an improvement here, we must repack with '-F' in
order to avoid reusing non-'--path-walk' deltas, which would otherwise
skew our results.)
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal does not expect this. Work around this by setting
`revs->boundary` as late as possible within the '--path-walk' traversal,
after any bitmap attempt has either succeeded or declined to answer the
request.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 6 +++--
builtin/pack-objects.c | 18 +++++++++++++--
t/perf/p5311-pack-bitmaps-fetch.sh | 18 +++++++++++----
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++++++++++++++++
4 files changed, 70 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 8a27aa19fd3..0adce8961a3 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,8 +402,10 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. The `--use-bitmap-index` option is
-ignored in the presence of `--path-walk`. The `--path-walk` option
+Incompatible with `--delta-islands`. When `--use-bitmap-index` is
+specified with `--path-walk`, a successful bitmap traversal is used for
+object enumeration, with path-walk remaining as the fallback traversal
+when the bitmap cannot satisfy the request. The `--path-walk` option
supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
types can be combined with the `combine:<spec>+<spec>` form.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b783dc62bc9..e4dcb563b7d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4732,6 +4732,15 @@ static int add_objects_by_path(const char *path,
continue;
add_object_entry(oid, type, path, exclude);
+
+ if (type == OBJ_COMMIT && write_bitmap_index) {
+ struct commit *commit;
+
+ commit = lookup_commit(the_repository, oid);
+ if (!commit)
+ die(_("could not find commit %s"), oid_to_hex(oid));
+ index_commit_for_bitmap(commit);
+ }
}
oe_end = to_pack.nr_objects;
@@ -4764,6 +4773,13 @@ static int get_object_list_path_walk(struct rev_info *revs)
info.path_fn = add_objects_by_path;
info.path_fn_data = &processed;
+ /*
+ * Path-walk needs boundary commits to discover thin-pack bases, but
+ * bitmap traversal does not understand the boundary state. Set it
+ * here so any prior bitmap attempt sees the usual non-boundary walk.
+ */
+ revs->boundary = 1;
+
/*
* Allow the --[no-]sparse option to be interesting here, if only
* for testing purposes. Paths with no interesting objects will not
@@ -5195,9 +5211,7 @@ int cmd_pack_objects(int argc,
}
}
if (path_walk) {
- strvec_push(&rp, "--boundary");
strvec_push(&rp, "--objects");
- use_bitmap_index = 0;
} else if (thin) {
use_internal_rev_list = 1;
strvec_push(&rp, shallow
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 5bea5c64e7b..50506216227 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -4,15 +4,22 @@ test_description='performance of fetches from bitmapped packs'
. ./perf-lib.sh
test_fetch_bitmaps () {
+ argv=$1
+ export argv
+
test_expect_success 'setup test directory' '
rm -fr * .git
'
test_perf_default_repo
- test_expect_success 'create bitmapped server repo' '
+ test_expect_success "create bitmapped server repo ${argv:+($argv)}" '
git config pack.writebitmaps true &&
- git repack -ad
+ git repack -adF $argv
+ '
+
+ test_size "size of bitmapped pack ${argv:+($argv)}" '
+ test_file_size .git/objects/pack/pack-*.pack
'
# simulate a fetch from a repository that last fetched N days ago, for
@@ -20,7 +27,7 @@ test_fetch_bitmaps () {
# and assume the first entry in the chain that is N days older than the current
# HEAD is where the HEAD would have been then.
for days in 1 2 4 8 16 32 64 128; do
- title=$(printf '%10s' "($days days)")
+ title=$(printf '%10s' "($days days${argv:+, $argv})")
test_expect_success "setup revs from $days days ago" '
now=$(git log -1 --format=%ct HEAD) &&
then=$(($now - ($days * 86400))) &&
@@ -47,6 +54,9 @@ test_fetch_bitmaps () {
done
}
-test_fetch_bitmaps
+for argv in '' --path-walk
+do
+ test_fetch_bitmaps $argv || return 1
+done
test_done
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..7924208d99c 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -577,6 +577,42 @@ test_bitmap_cases
sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
+test_expect_success 'path-walk repack can write and use bitmap indexes' '
+ test_when_finished "rm -rf path-walk-bitmap" &&
+ git init path-walk-bitmap &&
+ (
+ cd path-walk-bitmap &&
+ test_commit first &&
+ test_commit second &&
+ test_commit third &&
+
+ git repack -a -d -b --path-walk &&
+ git rev-list --test-bitmap --use-bitmap-index HEAD &&
+
+ git rev-parse HEAD >in &&
+
+ git rev-list --objects --no-object-names HEAD >expect.raw &&
+ sort expect.raw >expect &&
+
+ for reuse in true false
+ do
+ : >trace.txt &&
+
+ GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+ git -c pack.allowPackReuse=$reuse pack-objects \
+ --stdout --revs --path-walk --use-bitmap-index \
+ <in >out.pack &&
+ test_grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+ git index-pack out.pack &&
+
+ list_packed_objects out.idx >actual.raw &&
+ sort actual.raw >actual &&
+ test_cmp expect actual || return 1
+ done
+ )
+'
+
test_expect_success 'incremental repack fails when bitmaps are requested' '
test_commit more-1 &&
test_must_fail git repack -d 2>err &&
--
2.54.0.23.g371fc4317ad
^ permalink raw reply related
* [PATCH v3 1/4] t/perf: drop p5311's lookup-table permutation
From: Taylor Blau @ 2026-06-21 23:02 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <cover.1782082975.git.me@ttaylorr.com>
p5311 measures the cost of serving a fetch from a bitmapped pack and
indexing the resulting pack on the client. Since 761416ef91d
(bitmap-lookup-table: add performance tests for lookup table,
2022-08-14), p5311 effectively runs itself twice: once with the bitmap's
lookup table extension enabled, and again with it disabled.
This comparison has served its useful purpose, as the lookup table is
almost four years old, and the de-facto default in server-side Git
deployments.
A following commit will want to test a different combination (repacking
with and without '--path-walk' instead of the lookup table). Instead of
multiplying the current test count by two again to produce four
variations of `test_fetch_bitmaps()`, drop the lookup table option to
reduce the number of perf tests we run. Retain `test_fetch_bitmaps()`
itself, since we will use this in the future for the new
parameterization.
(As an aside, a future commit outside of this series will adjust the
default value of 'pack.writeBitmapLookupTable' to "true", matching the
de-facto norm for deployments where the existence of bitmap lookup
tables is meaningful. Punt on that to a later series and instead make
the minimal change for now.)
Suggested-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
t/perf/p5311-pack-bitmaps-fetch.sh | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 047efb995d6..5bea5c64e7b 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -12,7 +12,6 @@ test_fetch_bitmaps () {
test_expect_success 'create bitmapped server repo' '
git config pack.writebitmaps true &&
- git config pack.writeBitmapLookupTable '"$1"' &&
git repack -ad
'
@@ -32,7 +31,7 @@ test_fetch_bitmaps () {
} >revs
'
- test_perf "server $title (lookup=$1)" '
+ test_perf "server $title" '
git pack-objects --stdout --revs \
--thin --delta-base-offset \
<revs >tmp.pack
@@ -42,13 +41,12 @@ test_fetch_bitmaps () {
test_file_size tmp.pack
'
- test_perf "client $title (lookup=$1)" '
+ test_perf "client $title" '
git index-pack --stdin --fix-thin <tmp.pack
'
done
}
-test_fetch_bitmaps true
-test_fetch_bitmaps false
+test_fetch_bitmaps
test_done
--
2.54.0.23.g371fc4317ad
^ permalink raw reply related
* [PATCH v3 0/4] pack-objects: support bitmaps and delta-islands with `--path-walk`
From: Taylor Blau @ 2026-06-21 23:02 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
In-Reply-To: <cover.1779923907.git.me@ttaylorr.com>
Note to the maintainer:
* This series is still based on 'ds/path-walk-filters' with Patrick's
'ps/clang-w-glibc-2.43-and-_Generic' merged in.
Here is another small reroll of my series to make `--path-walk` work
with reachability bitmaps and delta-islands.
This round addresses Stolee's request to demonstrate the repack-size
side of the integration between `--path-walk` and bitmap writing, and
fixes an errant "grep" in the test suite.
Changes since v2 include:
* p5311 now forces a fresh repack with '-F' when building its bitmapped
test repository. This avoids reusing deltas from a non-'--path-walk'
pack when we are trying to measure a pack produced by `--path-walk`.
* p5311 now records the size of the bitmapped pack, both with and
without `--path-walk`, to show that writing bitmaps during a
`--path-walk` repack does not lose the pack-size improvement that
`--path-walk` provides in repositories where it helps.
* The second patch's commit message has updated p5311 numbers from a
recent fluentui clone, fixing the "pack sizes" typo and documenting
the new bitmapped-pack-size comparison.
* The t5310 grep assertion now uses `test_grep`, as suggested by Junio.
Outside of the above, the series is functionally unchanged.
Thanks in advance for another look.
Taylor Blau (4):
t/perf: drop p5311's lookup-table permutation
pack-objects: support reachability bitmaps with `--path-walk`
pack-objects: extract `record_tree_depth()` helper
pack-objects: support `--delta-islands` with `--path-walk`
Documentation/git-pack-objects.adoc | 12 ++---
builtin/pack-objects.c | 68 +++++++++++++++++++++--------
t/perf/p5311-pack-bitmaps-fetch.sh | 24 ++++++----
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++
t/t5320-delta-islands.sh | 29 ++++++++++++
5 files changed, 138 insertions(+), 31 deletions(-)
Range-diff against v2:
1: 52d63e8910e = 1: b1dbf30ddbe t/perf: drop p5311's lookup-table permutation
2: ffad584a43e ! 2: 1884f495809 pack-objects: support reachability bitmaps with `--path-walk`
@@ Commit message
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
- As a result, we can see significantly reduced pack sizes from p5311
- before this commit:
+ As a result, we can see significantly reduced pack generation times from
+ p5311 (with our `GIT_PERF_REPO` set to a recent clone of the fluentui
+ repository) before this commit:
- Test HEAD^ HEAD
- ----------------------------------------------------------------------------------
- 5311.38: server (1 days, --path-walk) 2.56(2.52+0.03) 0.01(0.01+0.00) -99.6%
- 5311.39: size (1 days, --path-walk) 123.9K 123.9K +0.0%
- 5311.40: client (1 days, --path-walk) 0.00(0.01+0.00) 0.00(0.00+0.00) =
- 5311.42: server (2 days, --path-walk) 2.57(2.52+0.05) 0.01(0.01+0.00) -99.6%
- 5311.43: size (2 days, --path-walk) 123.9K 123.9K +0.0%
- 5311.44: client (2 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
- 5311.46: server (4 days, --path-walk) 2.58(2.51+0.07) 0.01(0.01+0.00) -99.6%
- 5311.47: size (4 days, --path-walk) 123.9K 123.9K +0.0%
- 5311.48: client (4 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
- 5311.50: server (8 days, --path-walk) 2.58(2.53+0.04) 0.02(0.02+0.00) -99.2%
- 5311.51: size (8 days, --path-walk) 152.4K 152.4K +0.0%
- 5311.52: client (8 days, --path-walk) 0.00(0.01+0.00) 0.00(0.01+0.00) =
- 5311.54: server (16 days, --path-walk) 2.58(2.52+0.05) 0.03(0.02+0.00) -98.8%
- 5311.55: size (16 days, --path-walk) 205.3K 205.3K +0.0%
- 5311.56: client (16 days, --path-walk) 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0%
- 5311.58: server (32 days, --path-walk) 2.59(2.53+0.06) 0.03(0.03+0.00) -98.8%
- 5311.59: size (32 days, --path-walk) 209.3K 209.3K +0.0%
- 5311.60: client (32 days, --path-walk) 0.01(0.02+0.00) 0.01(0.02+0.00) +0.0%
- 5311.62: server (64 days, --path-walk) 2.70(2.76+0.06) 0.16(0.24+0.04) -94.1%
- 5311.63: size (64 days, --path-walk) 4.1M 4.1M +0.0%
- 5311.64: client (64 days, --path-walk) 0.44(0.50+0.02) 0.44(0.51+0.02) +0.0%
- 5311.66: server (128 days, --path-walk) 2.88(3.20+0.05) 0.34(0.65+0.05) -88.2%
- 5311.67: size (128 days, --path-walk) 9.0M 9.0M -0.0%
- 5311.68: client (128 days, --path-walk) 0.93(1.22+0.07) 0.93(1.20+0.08) +0.0%
+ Test HEAD^ HEAD
+ ----------------------------------------------------------------------------------------
+ 5311.40: server (1 days, --path-walk) 1.43(1.39+0.04) 0.01(0.01+0.00) -99.3%
+ 5311.41: size (1 days, --path-walk) 139.6K 139.7K +0.0%
+ 5311.42: client (1 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
+ 5311.44: server (2 days, --path-walk) 1.43(1.39+0.04) 0.01(0.00+0.00) -99.3%
+ 5311.45: size (2 days, --path-walk) 139.6K 139.7K +0.0%
+ 5311.46: client (2 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
+ 5311.48: server (4 days, --path-walk) 1.44(1.39+0.04) 0.01(0.01+0.00) -99.3%
+ 5311.49: size (4 days, --path-walk) 238.1K 238.1K +0.0%
+ 5311.50: client (4 days, --path-walk) 0.03(0.03+0.00) 0.03(0.03+0.00) +0.0%
+ 5311.52: server (8 days, --path-walk) 1.43(1.39+0.03) 0.01(0.00+0.00) -99.3%
+ 5311.53: size (8 days, --path-walk) 344.9K 344.9K +0.0%
+ 5311.54: client (8 days, --path-walk) 0.07(0.07+0.00) 0.07(0.08+0.00) +0.0%
+ 5311.56: server (16 days, --path-walk) 1.47(1.44+0.03) 0.10(0.08+0.01) -93.2%
+ 5311.57: size (16 days, --path-walk) 844.0K 844.0K +0.0%
+ 5311.58: client (16 days, --path-walk) 0.09(0.09+0.00) 0.09(0.09+0.00) +0.0%
+ 5311.60: server (32 days, --path-walk) 1.52(1.50+0.05) 0.14(0.15+0.02) -90.8%
+ 5311.61: size (32 days, --path-walk) 4.2M 4.2M +0.1%
+ 5311.62: client (32 days, --path-walk) 0.34(0.48+0.02) 0.34(0.45+0.05) +0.0%
+ 5311.64: server (64 days, --path-walk) 1.55(1.52+0.06) 0.15(0.15+0.04) -90.3%
+ 5311.65: size (64 days, --path-walk) 6.4M 6.4M -0.0%
+ 5311.66: client (64 days, --path-walk) 0.51(0.79+0.05) 0.51(0.80+0.06) +0.0%
+ 5311.68: server (128 days, --path-walk) 1.59(1.57+0.06) 0.16(0.21+0.01) -89.9%
+ 5311.69: size (128 days, --path-walk) 8.4M 8.4M -0.0%
+ 5311.70: client (128 days, --path-walk) 0.72(1.44+0.08) 0.71(1.47+0.09) -1.4%
We get the same size of output pack, but this commit allows us to do so
in a significantly shorter amount of time. Intuitively, we're generating
@@ Commit message
'--path-walk' traversal), but are able to perform pack-reuse on that
repacked state via bitmaps.
+ When comparing the size of the repacked pack with/without '--path-walk'
+ on the previous commit versus this one, we see that (a) the repacked size
+ improves significantly with '--path-walk', and that (b) writing bitmaps
+ during repacking does not regress this improvement:
+
+ Test HEAD^ HEAD
+ ----------------------------------------------------------------------------------------
+ 5311.3: size of bitmapped pack 558.4M 558.5M +0.0%
+ 5311.38: size of bitmapped pack (--path-walk) 164.4M 164.4M +0.0%
+
+ (Note that to observe an improvement here, we must repack with '-F' in
+ order to avoid reusing non-'--path-walk' deltas, which would otherwise
+ skew our results.)
+
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
@@ t/perf/p5311-pack-bitmaps-fetch.sh: test_description='performance of fetches fro
+ test_expect_success "create bitmapped server repo ${argv:+($argv)}" '
git config pack.writebitmaps true &&
- git repack -ad
-+ git repack -ad $argv
++ git repack -adF $argv
++ '
++
++ test_size "size of bitmapped pack ${argv:+($argv)}" '
++ test_file_size .git/objects/pack/pack-*.pack
'
# simulate a fetch from a repository that last fetched N days ago, for
@@ t/t5310-pack-bitmaps.sh: test_bitmap_cases
+ git -c pack.allowPackReuse=$reuse pack-objects \
+ --stdout --revs --path-walk --use-bitmap-index \
+ <in >out.pack &&
-+ grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
++ test_grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+ git index-pack out.pack &&
+
3: 069c50d3370 = 3: 315ee0b1988 pack-objects: extract `record_tree_depth()` helper
4: ae57607b57f = 4: 371fc4317ad pack-objects: support `--delta-islands` with `--path-walk`
base-commit: 45a9ecee26839cc880fdd5e704339dd3cf4ffc26
--
2.54.0.23.g371fc4317ad
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-21 22:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: D. Ben Knoble, Git
In-Reply-To: <xmqqechz60ah.fsf@gitster.g>
On Sun, Jun 21, 2026 at 02:39:18PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > BTW, I don't think diffcore actually has the information it would need
> > to do so. The racy stuff is handled under the hood in ie_match_stat(),
> > which returns only a set of "changed" flags. So the caller cannot tell
> > the difference between the two cases:
> >
> > 1. We checked ce_match_stat_basic() which said "no change", and then
> > is_racy_timestamp() was false, so that was good enough.
> >
> > 2. is_racy_timestamp() is true, so we further did a content check,
> > found nothing, and returned the same "no change"
> >
> > Obviously we could pass back another flag, but that would disrupt the
> > other callers. Hmm. It looks like we could pass in a flag to say "assume
> > racy entries are modified". And then they come back to the diff code,
> > diffcore_skip_stat_unmatch() sees they're not real diffs and suppresses
> > them, but we _do_ count them as stat-dirty.
>
> Yeah. Because ie_match_stat() does have access to istate, we could
> add a new member to istate, next to "updated_workdir" and friends,
> and smudge the bit when the is_racy_timestamp() goes to the
> compare-data codepath and finds that we are better off auto
> refreshing. Then "were we told to do skip-stat-unmatch and actually
> found some that is worth refreshing?" code can be taught to pay
> attention to that bit as well.
Yeah, that sounds fairly clean. Though if using nanoseconds works out
and makes racy entries extremely unlikely, that is better still. :)
> This is a tangent, but why do we call refresh_index_quietly() in the
> central code path in cmd_diff() in the first place, I have to
> wonder? It should not matter when we are comparing two tree objects
> (or two commits), at least. It of course is not hurting, though.
It seems like it could probably just go into builtin_diff_files(), but
are there other paths that might hit stat-unmatch entries? Maybe the
builtin_diff_b_f() path?
It probably should also support --no-optional-locks, which is currently
only respected by git-status. I don't think it matters that much in
practice because the point is reducing conflict with commands running
frequently in the background, and people don't tend to do that with
git-diff.
Back when we added --no-optional-locks, the idea was that people could
apply it in more spots if they ran into them in practice. So I guess
nobody has with git-diff.
-Peff
^ permalink raw reply
* Re: Pinned references?
From: Jeff King @ 2026-06-21 21:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, Erik Östlund, git
In-Reply-To: <xmqqeci2lcpc.fsf@gitster.g>
On Fri, Jun 19, 2026 at 09:25:19AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > You can already kind of do this:
> >
> > $ git rev-parse v2.54.0
> > 0b13e48a3a30cdfa94e8ef842e24d6045ab3d015
> >
> > $ git rev-parse v2.54.0-0-g0b13e48a3
> > 0b13e48a3a30cdfa94e8ef842e24d6045ab3d015
> >
> > $ git rev-parse v2.54.0-0-g95e20213f
> > 95e20213faefeb95df29277c58ac1980ab68f701
> >
> > This is described under gitrevisions(7), `<describeOutput>`. The only
> > gotcha is that this format will not verify that the tag and the object
> > ID actually match. But other than that it gives you the ability to have
> > both the human-readable name and the machine-readable commit ID in
> > there.
> >
> > As said, we don't verify that those two revisions actually match. So in
> > the case where they don't the result is certainly going to be lots of
> > confusion. It certainly is one of the more surprising syntaxes that we
> > have in Git.
>
> It is very unlikely we would change this, but it is a fun thought
> experiment to imagine what would have happened if we insisted (i.e.,
> verified and then died if it does not hold true) on the presence of
> v2.54.0 tag and when the "hop" count is "-0-", we also insisted that
> the hexadecimal part exactly matched the contents of v2.54.0 tag, or
> when the "hop" count is not zero, we insisted that the hexadecimal
> part names a commit that is descendant of the commit v2.54.0 names.
This is somewhat related to the thread here:
https://lore.kernel.org/git/CAFb48S8LDz4kiWsKSCBn8J=AHyQ5SVPFH4GY=z+8=DntT=PyAw@mail.gmail.com/
The problem there was the opposite. A name "foo-gcc14" was taken as a
describe name (for object "cc14") when it was not. But one thing I noted
there is that you probably can't be too picky about having "foo" when
you see "foo-g1234abcd". Part of the point of putting the hash in the
described name is that the receiver does not necessarily have your same
refs!
So insisting that "v2.54.0-0-g1234abcd" have both "v2.54.0" and
"1234abcd" locally is probably going to cause some regressions. We could
quietly accept it as "1234abcd" if your "v2.54.0" ref is missing
entirely, though that is perhaps missing the point of the original
request.
The discussion around this patch series might also be relevant:
https://lore.kernel.org/git/xmqqed1i4pga.fsf@gitster.g/
-Peff
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Junio C Hamano @ 2026-06-21 21:39 UTC (permalink / raw)
To: Jeff King; +Cc: D. Ben Knoble, Git
In-Reply-To: <20260621174518.GB2206349@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> BTW, I don't think diffcore actually has the information it would need
> to do so. The racy stuff is handled under the hood in ie_match_stat(),
> which returns only a set of "changed" flags. So the caller cannot tell
> the difference between the two cases:
>
> 1. We checked ce_match_stat_basic() which said "no change", and then
> is_racy_timestamp() was false, so that was good enough.
>
> 2. is_racy_timestamp() is true, so we further did a content check,
> found nothing, and returned the same "no change"
>
> Obviously we could pass back another flag, but that would disrupt the
> other callers. Hmm. It looks like we could pass in a flag to say "assume
> racy entries are modified". And then they come back to the diff code,
> diffcore_skip_stat_unmatch() sees they're not real diffs and suppresses
> them, but we _do_ count them as stat-dirty.
Yeah. Because ie_match_stat() does have access to istate, we could
add a new member to istate, next to "updated_workdir" and friends,
and smudge the bit when the is_racy_timestamp() goes to the
compare-data codepath and finds that we are better off auto
refreshing. Then "were we told to do skip-stat-unmatch and actually
found some that is worth refreshing?" code can be taught to pay
attention to that bit as well.
This is a tangent, but why do we call refresh_index_quietly() in the
central code path in cmd_diff() in the first place, I have to
wonder? It should not matter when we are comparing two tree objects
(or two commits), at least. It of course is not hurting, though.
^ permalink raw reply
* Re: [RFH] Why do osx CI jobs so unreliable?
From: Jeff King @ 2026-06-21 21:34 UTC (permalink / raw)
To: Michael Montalbo; +Cc: Patrick Steinhardt, git, Junio C Hamano
In-Reply-To: <CAC2Qwm+9sh=ks1fuux415JGdDJ38Jq6eZrSH7-qzQxYCoy+Aug@mail.gmail.com>
On Sat, Jun 20, 2026 at 08:33:13AM -0700, Michael Montalbo wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > So I strongly suspect that it most be one of the t555* tests.
> > [...]
> > Maybe this is something that's specific to GitHub's environment...
>
> I think you're right it's t5551/t5559. The runs Junio linked:
>
> osx-clang cancelled 360min
> osx-gcc cancelled 360min
> osx-reftable success 35min
> osx-meson success 61min
>
> All four run the same t5551/t5559 under EXPENSIVE. The two that
> finished differ in just two ways, which look like the levers:
> osx-reftable generates the 100k-ref advertisement in ~24ms vs ~1.2s
> for loose refs on macOS (so much less time mid-response), and
> osx-meson runs tests at nproc while the prove jobs hardcode --jobs=10
> on a 3-core runner (over recent master/next the prove jobs hang ~40%,
> meson ~10%).
If the problem is a racy deadlock, there is a reasonable chance that
some jobs may simply be lucky. Even if things like packing refs help, I
suspect the problem may still be lurking. Maybe I'm just a pessimist,
though. ;)
> When it is wedged the whole chain sits at 0% CPU. upload-pack is
> blocked in write() on the ls-refs advertisement, curl blocked in
> select(). So it looks like an HTTP/2 flow-control stall on the
> response side. The same stall resets itself after ~60-85s on my Linux
> box and on a bare-metal Mac, but not on the GitHub runner; I haven't
> pinned down why yet.
We had some HTTP/2 stalls/deadlocks in the past, and they were dependent
on libcurl and apache (actually h2_mod) versions. IIRC some of the
non-TLS code paths for HTTP/2 were not well tested, which led to
8f2146dbf1 (t5559: make SSL/TLS the default, 2023-02-23). Of course
after that commit those cleartext code paths should not be a problem, so
that is probably not exactly the issue now.
But it might be worth checking the versions you're running locally
versus what's in the GitHub runner.
-Peff
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-21 21:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: D. Ben Knoble, Git
In-Reply-To: <xmqqfr2f7iay.fsf@gitster.g>
On Sun, Jun 21, 2026 at 01:24:53PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I don't know if any of this is really worth digging too far. This feels
> > like a case we could do a bit better at, but I wonder how much it
> > matters in practice. As soon as you do any index-refresh (including "git
> > status"), the racy entries are cleared and everything is faster. It
> > just seems kind of lame that we write out the initial working tree with
> > so many racy entries.
>
> Yeah, We didn't want to stall for a full second back when we were
> not using subsecond in anywhere, with nanosecond resolution
> timestamps in place, we could delay writing the index file by 50
> milliseconds, nobody notices the delay, and raciness would go away,
> perhaps?
Yes, though that implies comparing the index and file mtimes with
nanosecond precision. We have that precision stored (at least
when the system supports it) but I'm not sure if that comparison would
run afoul of the reasons USE_NSEC was not the default in the first
place.
I guess not? The problem there is that the nanosecond portion would
sometimes get wiped if the entry was dropped from the kernel's in-memory
cache. And then stat-matching would not work. But if we are talking
about strictly asking "is this mtime later than that mtime", then I
think the worst case is that we fall back to the current behavior.
But at the point that we are comparing nanoseconds, I don't think we
even need to bother with the delay. It takes maybe 5 seconds to write
out all of the linux.git files and then the final index. So ~20% of
those files will have the same timestamp as the index. With nanosecond
resolution, we'd expect that to drop by an order of a billion. Even if
we get unlucky and have a single file with the same timestamp, that is
not so bad.
The code to do the nanosecond compare is already there! But it's gated
on USE_NSEC. So this (plus a bonus debugging trace ;) ):
diff --git a/read-cache.c b/read-cache.c
index 21829102ae..f84159a060 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -356,14 +356,10 @@ static int is_racy_stat(const struct index_state *istate,
const struct stat_data *sd)
{
return (istate->timestamp.sec &&
-#ifdef USE_NSEC
/* nanosecond timestamped files can also be racy! */
(istate->timestamp.sec < sd->sd_mtime.sec ||
(istate->timestamp.sec == sd->sd_mtime.sec &&
istate->timestamp.nsec <= sd->sd_mtime.nsec))
-#else
- istate->timestamp.sec <= sd->sd_mtime.sec
-#endif
);
}
@@ -434,6 +430,7 @@ int ie_match_stat(struct index_state *istate,
* carefully than others.
*/
if (!changed && is_racy_timestamp(istate, ce)) {
+ warning("%s is racy", ce->name);
if (assume_racy_is_modified)
changed |= DATA_CHANGED;
else
makes the problem go away. I'm not sure if I'm missing some case where
we could be bitten by the problem that led to making USE_NSEC
conditional, though.
-Peff
^ permalink raw reply related
* Re: [PATCH v2 7/8] refs: fix recursing `get_main_ref_store()` with "onbranch" config
From: Jeff King @ 2026-06-21 21:12 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Karthik Nayak
In-Reply-To: <ajTggBKIzgSpp99X@pks.im>
On Fri, Jun 19, 2026 at 08:25:42AM +0200, Patrick Steinhardt wrote:
> On Thu, Jun 18, 2026 at 12:40:35PM -0400, Jeff King wrote:
> > On Mon, Jun 15, 2026 at 03:56:53PM +0200, Patrick Steinhardt wrote:
> [snip]
> > I'd expect the ref database config (like the ref format) to be read not
> > through the regular config subsystem, but via read_repository_format()
> > and friends. And while that does build on the regular config code, it
> > should never enable includes at all. So includeIf.onbranch:foo.path is
> > just another uninteresting config key to it.
>
> This feels rather painful though, as we'd now have to do this for every
> single backend that we know about. Also, I think not enabling includes
> is an overly broad fix: there isn't any reason why "includeif.gitdir"
> and all the other conditions shouldn't apply. We really only want to
> disable "onbranch".
Sorry, I should probably gone back and edited my email after finishing
it. I was thinking that you meant not general config, but the specific
extensions.refStorage key. Which is not really config, but repo metadata
we happen to store in the .git/config file. And obviously you cannot
read any refs until you know what's in that key.
And that _is_ read separately while loading the repo config, which I
think is right. Other options, like core.logallrefupdates, are handled
separately. And I realized halfway through my reply that was probably
what you meant.
I agree those are user-facing config options that should generally
respect includes in the normal way. I thinks are a bit funny there,
though. See below.
> I actually tried lazy-loading, but I found it to be quite painful
> overall, as the above setting isn't the only one we use. The reftable
> backend for example has a bunch of additional settings that it reads.
>
> We could of course start lazy-loading all of these. But that may not
> work for future backends that really _need_ to parse some configuration
> at initiation time.
Yes, obviously there's some true chicken-and-egg issues if there are
config keys that are needed to initialize the backend. But I think there
are many that are not needed immediately (e.g., because they relate only
to writes, not reads) but still block loading.
For example, try this:
git init
git config core.logallrefupdates false
git config includeIf.onbranch:main.path alt-config
git config -f .git/alt-config core.logallrefupdates true
git commit --allow-empty -qm foo
echo "git-config => $(git config core.logallrefupdates)"
echo "reflog => $(git reflog show)"
git-config will report the value as true, but git-commit will not
respect it. But this used to work! Back when onbranch was added, we'd
create the reflog. Bisecting turns up eafb126456 (environment: stop
storing "core.logAllRefUpdates" globally, 2024-09-12), which makes
sense. That commit pushed the config read down into the ref
initialization function, which created the chicken-and-egg.
Now the config shown above is a bit silly, and I don't expect anybody to
do it in real life. But what worries me is two-fold:
1. There are some magic variables that just won't work with onbranch
includes, but the user doesn't necessarily know what they are.
2. We try to cache the results of config reads. Is it possible for an
"early" request like this to cache a state that skipped the
onbranch include, and then we use that state to look up other
unrelated variables? Or could we see a partially completed state in
the cache when we lookup a ref variable?
I'm not sure. The actual backend lookups use the uncached
repo_config() interface (and in your series here, explicitly
disables the use of refs during that read). But the
core.logallrefupdates lookup uses the cached version, and I think
there are others (some of which happen deep under the hood
through library calls, like calc_shared_perm()).
I tried to construct a few cases that might tickle this behavior, but
couldn't come up with one. But I have a nagging feeling that we are
mostly getting lucky on some of the ordering, and a seemingly unrelated
change could have bad effects.
Sorry, I know that's kind of vague and hand-wavy.
I'm not sure I have a specific recommendation for a direction. It just
feels like we're piling up hacks to avoid infinite recursion without a
clear model of what config is read when. I guess if I could suggest
anything, it would be that ref backends initialize themselves to do
reads while loading as little config as possible, and then perhaps load
additional config through the non-caching repo_config() path.
-Peff
^ permalink raw reply
* Re: [GSoC Patch v7 1/3] path: extract append_formatted_path() and use in rev-parse
From: Junio C Hamano @ 2026-06-21 21:02 UTC (permalink / raw)
To: K Jayatheerth
Cc: a3205153416, git, jltobler, kumarayushjha123, lucasseikioshiro,
phillip.wood, sandals
In-Reply-To: <20260621055534.46798-2-jayatheerthkulkarni2005@gmail.com>
K Jayatheerth <jayatheerthkulkarni2005@gmail.com> writes:
So, for the existing user of this logic, the preimage ...
> -static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> {
> - char *cwd = NULL;
> - /*
> - * We don't ever produce a relative path if prefix is NULL, so set the
> - * prefix to the current directory so that we can produce a relative
> - * path whenever possible. If we're using RELATIVE_IF_SHARED mode, then
> - * we want an absolute path unless the two share a common prefix, so don't
> - * set it in that case, since doing so causes a relative path to always
> - * be produced if possible.
> - */
> - if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
> - prefix = cwd = xgetcwd();
> - if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
> - puts(path);
> - } else if (format == FORMAT_RELATIVE ||
> - (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE)) {
> - /*
> - * In order for relative_path to work as expected, we need to
> - * make sure that both paths are absolute paths. If we don't,
> - * we can end up with an unexpected absolute path that the user
> - * didn't want.
> - */
> - struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
> - if (!is_absolute_path(path)) {
> - strbuf_realpath_forgiving(&realbuf, path, 1);
> - path = realbuf.buf;
> - }
> - if (!is_absolute_path(prefix)) {
> - strbuf_realpath_forgiving(&prefixbuf, prefix, 1);
> - prefix = prefixbuf.buf;
> }
> - puts(relative_path(path, prefix, &buf));
> - strbuf_release(&buf);
> - strbuf_release(&realbuf);
> - strbuf_release(&prefixbuf);
> - } else if (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED) {
> - struct strbuf buf = STRBUF_INIT;
> - puts(relative_path(path, prefix, &buf));
> - strbuf_release(&buf);
> - } else {
> - struct strbuf buf = STRBUF_INIT;
> - strbuf_realpath_forgiving(&buf, path, 1);
> - puts(buf.buf);
> - strbuf_release(&buf);
> }
> - free(cwd);
> }
... now becomes this postimage.
> +static void print_path(const char *path, const char *prefix,
> + enum format_type format, enum default_type def)
> {
> + struct strbuf sb = STRBUF_INIT;
> + enum path_format fmt;
> +
> + if (format == FORMAT_RELATIVE) {
> + fmt = PATH_FORMAT_RELATIVE;
> + } else if (format == FORMAT_CANONICAL) {
> + fmt = PATH_FORMAT_CANONICAL;
> + } else /* FORMAT_DEFAULT */ {
> + switch (def) {
> + case DEFAULT_RELATIVE:
> + fmt = PATH_FORMAT_RELATIVE;
> + break;
> + case DEFAULT_RELATIVE_IF_SHARED:
> + fmt = PATH_FORMAT_RELATIVE_IF_SHARED;
> + break;
> + case DEFAULT_CANONICAL:
> + fmt = PATH_FORMAT_CANONICAL;
> + break;
> + case DEFAULT_UNMODIFIED:
> + default:
> + fmt = PATH_FORMAT_UNMODIFIED;
> + break;
> }
> }
> +
> + append_formatted_path(&sb, path, prefix, fmt);
> + puts(sb.buf);
> +
> + strbuf_release(&sb);
> }
Mostly, the code translates FORMAT_FOO constants into the new
PATH_FORMAT_FOO constants, and lets append_formatted_path() do the
heavy lifting.
It is a minor point, but wouldn't it make it simpler to handle
format_default first? I.e.,
if (format == FORMAT_DEFAULT)
switch (def) {
case DEFAULT_RELATIVE:
format = DEFAULT_RELATIVE;
break;
...
case DEFAULT_UNMODIFIED:
default:
format = DEFAULT_UNMODIFIED;
break;
}
switch (format) {
case FORMAT_RELATIVE: fmt = PATH_FORMAT_RELATIVE; break;
case FORMAT_CANONICAL: fmt = PATH_FORMAT_CANONICAL; break;
...
}
Perhaps yes, perhaps not. I dunno.
> diff --git a/path.c b/path.c
> index d7e17bf174..6d8e892ada 100644
> --- a/path.c
> +++ b/path.c
> @@ -1579,6 +1579,75 @@ char *xdg_cache_home(const char *filename)
> return NULL;
> }
>
> +void append_formatted_path(struct strbuf *dest, const char *path,
> + const char *prefix, enum path_format format)
> +{
> + switch (format) {
> + case PATH_FORMAT_UNMODIFIED:
> + strbuf_addstr(dest, path);
> + break;
In the orignal "print_path()", DEFAULT/UNMODIFIED did this "show
unmodified". OK.
> + case PATH_FORMAT_RELATIVE: {
> + struct strbuf relative_buf = STRBUF_INIT;
> + struct strbuf real_path = STRBUF_INIT;
> + struct strbuf real_prefix = STRBUF_INIT;
> + char *cwd = NULL;
> +
> + /*
> + * We don't ever produce a relative path if prefix is NULL,
> + * so set the prefix to the current directory so that we can
> + * produce a relative path whenever possible.
> + */
> + if (!prefix)
> + prefix = cwd = xgetcwd();
This is what was done in the original "print_path()" upfront, with
a similar comment to explay why this happens. Looking good. Also
we no longer call xgetcwd() when we do not need to, which is goodd.
> + if (!is_absolute_path(path)) {
> + strbuf_realpath_forgiving(&real_path, path, 1);
> + path = real_path.buf;
> + }
> + if (!is_absolute_path(prefix)) {
> + strbuf_realpath_forgiving(&real_prefix, prefix, 1);
> + prefix = real_prefix.buf;
> + }
There used to be a comment explaining why we make realpath calls,
which is now lost. Perhaps what the comment said was so obvious
that we are better off without it? I offhand do not know.
What is done to make the paths real is the same as before, which is
good.
> + strbuf_addstr(dest, relative_path(path, prefix, &relative_buf));
> +
> + strbuf_release(&relative_buf);
> + strbuf_release(&real_path);
> + strbuf_release(&real_prefix);
> + free(cwd);
> + break;
> + }
OK.
> + case PATH_FORMAT_RELATIVE_IF_SHARED: {
> + struct strbuf relative_buf = STRBUF_INIT;
> +
> + /*
> + * If we're using RELATIVE_IF_SHARED mode, then we want an
> + * absolute path unless the two share a common prefix, so don't
> + * default the prefix to the current working directory. Doing so
> + * would cause a relative path to always be produced if possible.
> + */
> + strbuf_addstr(dest, relative_path(path, prefix, &relative_buf));
> + strbuf_release(&relative_buf);
> + break;
> + }
Identical to the original, which is good.
> +
> + case PATH_FORMAT_CANONICAL: {
> + struct strbuf canonical_buf = STRBUF_INIT;
> +
> + strbuf_realpath_forgiving(&canonical_buf, path, 1);
> + strbuf_addbuf(dest, &canonical_buf);
> +
> + strbuf_release(&canonical_buf);
> + break;
> + }
> +
> + default:
> + BUG("unknown path_format value %d", format);
> + }
> +}
OK.
> +/**
> + * The formatting strategy to apply when writing a path into a buffer.
> + */
> +enum path_format {
> + /* Output the path exactly as-is without any modifications. */
> + PATH_FORMAT_UNMODIFIED,
> +
> + /* Output a path relative to the provided directory prefix. */
> + PATH_FORMAT_RELATIVE,
> +
> + /* Output a relative path only if the path shares a root with the prefix. */
> + PATH_FORMAT_RELATIVE_IF_SHARED,
> +
> + /* Output a fully resolved, absolute canonical path. */
> + PATH_FORMAT_CANONICAL
> +};
> +
> +/**
> + * Format a path according to the specified formatting strategy and append
> + * the result to the given strbuf.
> + *
> + * `dest` : The string buffer to append the formatted path to.
> + * `path` : The path string that needs to be formatted.
> + * `prefix` : The directory prefix to calculate relative offsets against.
> + * Pass NULL to default to the current working directory where applicable.
> + * `format` : The formatting behavior rule to execute.
> + */
> +void append_formatted_path(struct strbuf *dest, const char *path,
> + const char *prefix, enum path_format format);
> +
It is slightly unsatisfying that this function is defined to
"append" to any existing value in the dest strbuf, rather than
storing the result in the dest strbuf. The original caller
print_path() passes an empty strbuf to this helper, so it can let
strbuf_realpath_*() functions to strbuf_reset() it (e.g.,
abspath.c:get_root_part() called by strbuf_realpath_1(), wihch in
turn is called by strbuf_realpath() and strbuf_realpath_forgiving())
it freely, which means that use of temporary strbuf like
canonical_buf only to copy it out to dest is wasteful and unneeded.
But other callers we will have for this helper later may want to
append to what they already have, so perhaps it is OK (on the other
hand, we could say that preserving and appending is what these
callers can do themselves).
Otherwise, looking good as a no-op bug-to-bug compatible rewrite,
with a slight optimization (to skip xgetcwd()).
Thanks.
^ permalink raw reply
* Re: [PATCH v3 0/2] completion: hide dotfiles for selected path completion
From: Junio C Hamano @ 2026-06-21 20:28 UTC (permalink / raw)
To: D. Ben Knoble; +Cc: Zakariyah Ali via GitGitGadget, git, Zakariyah Ali
In-Reply-To: <CALnO6CBuxz_5x808Km0Z4Y4dh-WcZRKpT1fTNMWOF8_7Pjxt1w@mail.gmail.com>
"D. Ben Knoble" <ben.knoble@gmail.com> writes:
> [Small typo correction that may affect how the message is read]
Thanks, I spotted another one.
>> ... I find this range diff very troubling. If we look at patch 2,
>> it seems that it redoes some part of what is done in patch 1 saying
>> "oops that was wrong, so let's do it better this time". Such a
>> drunken-mans' walk that goes in one direction in an earlier step,
>> only to be corrected to move to a different course, is now how we
>
> "is not" :)
True.
>> want a new topic to be presented.
>>
>> The end result may be much easier to read, mostly thanks to updated
>> loop in the awk script, so if we really want to pretend this as two
"pretend" -> "present".
>> patches for "small pieces are easier to digest" value, perhaps have
>> [PATCH 1/2] that updates the awk script (without doing anything
>> related to hide-dotfiles theme) to make it easier to read by not
>> having multiple "print pfx p" in it, and then build on top of that
>> improved base, have [PATCH 2/2] that adds the support to hide
>> dotfiles, perhaps?
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Junio C Hamano @ 2026-06-21 20:24 UTC (permalink / raw)
To: Jeff King; +Cc: D. Ben Knoble, Git
In-Reply-To: <20260621174518.GB2206349@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> I don't know if any of this is really worth digging too far. This feels
> like a case we could do a bit better at, but I wonder how much it
> matters in practice. As soon as you do any index-refresh (including "git
> status"), the racy entries are cleared and everything is faster. It
> just seems kind of lame that we write out the initial working tree with
> so many racy entries.
Yeah, We didn't want to stall for a full second back when we were
not using subsecond in anywhere, with nanosecond resolution
timestamps in place, we could delay writing the index file by 50
milliseconds, nobody notices the delay, and raciness would go away,
perhaps?
^ permalink raw reply
* Re: [PATCH v3 0/2] environment: move ignore_case into repo_config_values
From: Junio C Hamano @ 2026-06-21 20:16 UTC (permalink / raw)
To: Tian Yuchen; +Cc: git, ps, phillip.wood123, johannes.schindelin, stolee
In-Reply-To: <20260619155152.642760-1-cat@malon.dev>
Tian Yuchen <cat@malon.dev> writes:
> This series continues the ongoing libification effort by moving
> this global variable into 'struct repo_config_values', tying it
> to the specific repository instance it was read from. This allows
> us to encapsulate the configuration without altering its
> eager-parsing behavior.
Looks good.
> compat/win32/path-utils.c --- Is it appropriate to include the
> repository.h header file?
As the compat/ layer is not meant as a general purpose POSIX
emulation wrapper that is generally reusable to projects other than
us, if we have a knob settable by end users to affect behaviours of
lower layer in compat/, it is natural to make repo-settings
available to them.
What is the perceived problem you have in mind, and what are your
proposed alternatives?
^ permalink raw reply
* Re: [PATCH GSoC RFC v13 10/12] cat-file: add remote-object-info to batch-command
From: Junio C Hamano @ 2026-06-21 20:02 UTC (permalink / raw)
To: Chandra Pratap
Cc: Pablo Sabater, peff, eric.peijian, chriscool, git, jltobler,
karthik.188, toon, Jonathan Tan, Calvin Wan
In-Reply-To: <CA+J6zkTjgHAWtJwxY8jo0i9zDtxwj9uUsKAtLS3z1=WxZfr8Zw@mail.gmail.com>
Chandra Pratap <chandrapratap3519@gmail.com> writes:
> [snip]
>> +static void parse_cmd_remote_object_info(struct batch_options *opt,
>> + const char *line, struct strbuf *output,
>> + struct expand_data *data)
>> +{
>> + int count;
>> + const char **argv;
>> + char *line_to_split;
>> + static struct object_info *remote_object_info;
>> + static struct oid_array object_info_oids = OID_ARRAY_INIT;
>
> I don't get the point of remote_object_info and object_info_oids
> being static here? These variables are allocated, utilized, and
> completely freed/disconnected within a single command cycle.
Great observation.
> Making them static gives me the false impression that state
> needs to persist between calls.
Yes, and makes it thread-unsafe, even though if is questionable if
this particular function has to be thread safe ;-)
^ permalink raw reply
* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Harald Nordgren @ 2026-06-21 18:46 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Harald Nordgren via GitGitGadget, git,
Kristoffer Haugsbakk, Johannes Sixt
In-Reply-To: <CAHwyqnWt59h2HO5EJbFswYr7QEA7oNZKdBt_vTk5axNbWFZbpA@mail.gmail.com>
Looking into this more and attempting to implement the logic for
re-assigning the upstream, it becomes quite a lot of code.
Maybe an easier way forward now is to avoid deleting these cases. We
can always add the re-assigning logic later on without breaking
backward compatibility.
Harald
^ permalink raw reply
* Re: [PATCH v5 2/2] graph: indent visual root in graph
From: Jeff King @ 2026-06-21 18:05 UTC (permalink / raw)
To: Kristofer Karlsson
Cc: Pablo Sabater, git, ayu.chandekar, chandrapratap3519,
christian.couder, gitster, jltobler, karthik.188, phillip.wood,
siddharthasthana31
In-Reply-To: <CAL71e4MAtD4MqE-22UyYaNFVYcFgYmffngihhovEChVfHLmEdA@mail.gmail.com>
On Fri, Jun 19, 2026 at 09:34:16AM +0200, Kristofer Karlsson wrote:
> On Thu, 18 Jun 2026 at 18:05, Jeff King <peff@peff.net> wrote:
> >
> > Thanks for looking into it. I meant to also cc the Kristofer, the author
> > of dd4bc01c0a, for any thoughts (adding him now).
> >
>
> Thanks for the CC. I took a look at how this interacts with my
> change.
>
> dd4bc01c0a doesn't hurt here I think, but future followup changes
> might. From what I can tell --graph triggers topo_order, so
> the walk mode is either REV_WALK_TOPO or REV_WALK_LIMITED
> and the prio_queue change only applies to REV_WALK_STREAMING.
I'm not so sure. If I merge 53967f242a (graph: indent visual root in
graph, 2026-06-13) into master (so that it has both your commit_queue
changes and Pablo's topic), and then apply this:
diff --git a/graph.c b/graph.c
index e0d1e2a510..8a5f17a089 100644
--- a/graph.c
+++ b/graph.c
@@ -926,6 +926,10 @@ static void graph_peek_next_visible(struct git_graph *graph,
flags->is_next_visual_root = 0;
flags->next_has_column = 0;
+ warning("peeking at visible commits: %d in list, %d in queue",
+ commit_list_count(graph->revs->commits),
+ (int)graph->revs->commit_queue.nr);
+
for (cl = graph->revs->commits; cl; cl = cl->next) {
if (get_commit_action(graph->revs, cl->item) != commit_show)
continue;
and run:
./git log --graph -- Makefile
then we always see exactly one commit in the list, but an
ever-increasing number in the queue (up to ~4000). We do seem to be in
REV_WALK_TOPO mode, so I think we'd never return the commits via
get_revision(), but it is weird that we are sticking them in the queue
at all.
Looks like that happens via rewrite_parents(), which always writes into
commit_queue. I guess it doesn't matter because in topo mode we are
always pulling off of the topo_walk_info queue anyway? It does make me
wonder if there is a lurking bug around history simplification and
--topo-order, though.
> That said, graph_peek_next_visible() reaching directly into
> revs->commits feels fragile -- especially if we drop revs->commits
> in the future. One option would be to add a thin abstraction in
> revision.c that dispatches per walk mode, something like:
>
> int revision_has_more_commits(struct rev_info *revs)
> {
> if (revs->topo_walk_info)
> return revs->topo_walk_info->topo_queue.nr > 0;
> return revs->commits != NULL;
> }
>
> struct commit *revision_peek_next_commit(struct rev_info *revs)
> {
> if (revs->topo_walk_info)
> return prio_queue_peek(&revs->topo_walk_info->topo_queue);
> if (revs->commits)
> return revs->commits->item;
> return NULL;
> }
>
> That way graph.c does not need to know which data structure the
> walker uses, and if the internals change later the API adapts in
> one place.
Yeah, I agree some abstraction would help. I think it would have to be
full iteration, though; the graph code wants to know if there is any
commit that is actually going to be shown, not just a potential single
next one. So we at least need to be able to iterate in arbitrary order.
> As for the multi-element peek question, I think I would either opt
> for draining into a buffer if it's really needed, though when looking
> at the code here I think multi-element peeking is not truly needed.
> It seems like the logic just checks if there is at least another
> element after the peek, but it does not try to read the actual value,
> so we can just check the queue size instead.
We do look at some characteristics of the commit we find by peeking, but
I'm not sure how much it matters if we get the _next_ commit that will
be shown, or if any arbitrary commit is OK.
-Peff
^ permalink raw reply related
* Re: [PATCH] meson: wire up USE_NSEC build knob
From: Jeff King @ 2026-06-21 17:49 UTC (permalink / raw)
To: D. Ben Knoble
Cc: git, brian m . carlson, Junio C Hamano, Patrick Steinhardt,
Ramsay Jones
In-Reply-To: <c4c5ade901ff95b0f95939ea818870e4f3d59da1.1781971201.git.ben.knoble+github@gmail.com>
On Sat, Jun 20, 2026 at 12:00:24PM -0400, D. Ben Knoble wrote:
> Autotools-style builds permit enabling USE_NSEC for cases where that's
> desired; the equivalent knob is missing from meson-based builds.
Seems reasonable. This is not changing the defaults at all, but just
bringing meson's options to parity with the Makefile.
I'm not still not sure if turning on USE_NSEC is a good idea. There's
some discussion in Documentation/technical/racy-git.adoc:
With `USE_NSEC`
compile-time option, `st_mtim.tv_nsec` and `st_ctim.tv_nsec`
members are also compared. On Linux, this is not enabled by default
because in-core timestamps can have finer granularity than
on-disk timestamps, resulting in meaningless changes when an
inode is evicted from the inode cache. See commit 8ce13b0
of git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
([PATCH] Sync in core time granularity with filesystems,
2005-01-04). This patch is included in kernel 2.6.11 and newer, but
only fixes the issue for file systems with exactly 1 ns or 1 s
resolution. Other file systems are still broken in current Linux
kernels (e.g. CEPH, CIFS, NTFS, UDF), see
https://lore.kernel.org/lkml/5577240D.7020309@gmail.com/
That's the most succinct description of the problem I've seen, but I
have no idea how widely it still applies. Kernel 2.6.11 is quite old
now, but I could believe that other filesystems (especially network
ones) still exhibit the issue.
So I guess if we wanted to go further it would take some digging as to
how each platform behaves, and then flipping the config.make.uname knob
for ones where it can be argued that the behavior is always reasonable.
But that's all outside the scope of your patch here.
-Peff
^ permalink raw reply
* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-21 17:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: D. Ben Knoble, Git
In-Reply-To: <20260621172432.GA2206349@coredump.intra.peff.net>
On Sun, Jun 21, 2026 at 01:24:32PM -0400, Jeff King wrote:
> I think this is the core of the issue. These entries are "racy git
> dirty" in the sense that their mtimes are the same as the index mtime,
> and so we double-check the contents. This is the first bullet point
> under the "Racy Git" section of Documentation/technical/racy-git.adoc.
>
> But diffcore_skip_stat_unmatch() doesn't count them as dirty, so we
> don't increment the counter, and thus top-level git-diff won't write out
> the new index. And thus every subsequent diff repeats the same
> expensive double-check.
>
> But I'm not sure where the blame lies. Either:
>
> 1. diffcore_skip_stat_unmatch() should be counting these in its
> "dirty" counter; or
BTW, I don't think diffcore actually has the information it would need
to do so. The racy stuff is handled under the hood in ie_match_stat(),
which returns only a set of "changed" flags. So the caller cannot tell
the difference between the two cases:
1. We checked ce_match_stat_basic() which said "no change", and then
is_racy_timestamp() was false, so that was good enough.
2. is_racy_timestamp() is true, so we further did a content check,
found nothing, and returned the same "no change"
Obviously we could pass back another flag, but that would disrupt the
other callers. Hmm. It looks like we could pass in a flag to say "assume
racy entries are modified". And then they come back to the diff code,
diffcore_skip_stat_unmatch() sees they're not real diffs and suppresses
them, but we _do_ count them as stat-dirty.
Like this:
diff --git a/builtin/diff.c b/builtin/diff.c
index 4b46e394ce..4d36b5c1e0 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -271,6 +271,9 @@ static void builtin_diff_files(struct rev_info *revs, int argc, const char **arg
argv++; argc--;
}
+ if (revs->diffopt.skip_stat_unmatch)
+ options |= DIFF_RACY_IS_MODIFIED;
+
/*
* "diff --base" should not combine merges because it was not
* asked to. "diff -c" should not densify (if the user wants
That seems to work, in the sense that "git diff" does refresh the index
afterwards. But the timings are a bit funny.
In my working tree of linux.git with many racy entries it was ~500ms to
do the first diff (and the second, and so on, because we never updated
the index). After the patch above, it is 1800ms to do the first diff,
and then fast (~30ms) after.
I could believe it takes twice as long when we refresh the index
(because I don't think we use the stat-cleanliness we collected from the
diff, but rather just do a from-scratch index refresh). But that would
imply it should take ~1000ms. Where does the extra 800ms go? I guess
that somehow the content-check done by diffcore_skip_stat_unmatch() is
slower than the one done by ie_match_stat(). I think the individual
functions are respectively diff_filespec_check_stat_unmatch() and
ce_modified_check_fs().
I don't know if any of this is really worth digging too far. This feels
like a case we could do a bit better at, but I wonder how much it
matters in practice. As soon as you do any index-refresh (including "git
status"), the racy entries are cleared and everything is faster. It
just seems kind of lame that we write out the initial working tree with
so many racy entries.
-Peff
^ permalink raw reply related
* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-21 17:24 UTC (permalink / raw)
To: Junio C Hamano; +Cc: D. Ben Knoble, Git
In-Reply-To: <xmqqa4sog1e9.fsf@gitster.g>
On Sat, Jun 20, 2026 at 05:53:02PM -0700, Junio C Hamano wrote:
> > which dates to aecbf914c4 (git-diff: resurrect the traditional empty
> > "diff --git" behaviour, 2007-08-31). On my system that comparison is
> > false because the double-negation produces 1
> > (diff_auto_refresh_index=1 or the result of git_config_bool).
>
> Not quite. It was false because double-negation initializes the
> member to 1, which causes a call to diffcore_skip_stat_unmatch()
> be made, *and* the diffcore_skip_stat_unmatch() function did not
> find any ghost changes, i.e., paths that were only stat-dirty hence
> needed a call to refresh_index_quietly().
I think this is the core of the issue. These entries are "racy git
dirty" in the sense that their mtimes are the same as the index mtime,
and so we double-check the contents. This is the first bullet point
under the "Racy Git" section of Documentation/technical/racy-git.adoc.
But diffcore_skip_stat_unmatch() doesn't count them as dirty, so we
don't increment the counter, and thus top-level git-diff won't write out
the new index. And thus every subsequent diff repeats the same
expensive double-check.
But I'm not sure where the blame lies. Either:
1. diffcore_skip_stat_unmatch() should be counting these in its
"dirty" counter; or
2. the index should be marking these differently. The second bullet
point of that Racy Git section says:
When the index file is updated that contains racily clean
entries, cached `st_size` information is truncated to zero
before writing a new version of the index file.
Should the index be written out with a 0 size field here, so that
we know they are dirty and should be updated? I guess that would be
user-visible, though, because commands that _don't_ update the
index (like plumbing diff-files) would generate a spurious diff
there rather than doing the content-level comparison.
I dunno. You had solved most of the racy git stuff before I came along,
so I never gave it too much thought (and what little thought I did was
many years ago).
> > So… has that conditional been quietly dead all this time? I can't
> > imagine that's right, but…
>
> I initially thought it was an embarrassing thinko, but after seeing
> how .skip_stat_unmatch is used as a 1-based counter (i.e., if the
> member says 42, it means it saw 41 paths that were stat-dirty but
> without actual content change), I do not think so.
>
> Now, it is a different matter if such a "dual" purpose "more than a
> simple boolean" counter is a good idea. Apparently it confused both
> of us in this case ;-).
Make that three of us. ;)
-Peff
^ permalink raw reply
* Re: [PATCH v3 0/2] completion: hide dotfiles for selected path completion
From: D. Ben Knoble @ 2026-06-21 16:46 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Zakariyah Ali via GitGitGadget, git, Zakariyah Ali
In-Reply-To: <xmqq1pe0g08t.fsf@gitster.g>
[Small typo correction that may affect how the message is read]
On Sat, Jun 20, 2026 at 9:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Zakariyah Ali via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > The completion helper for index paths uses git ls-files rather than shell
> > filename completion. As a result, leading-dot paths such as a tracked
> > .gitignore were offered even when the user had not started the path with ..
> >
> > Hide leading-dot path components for git rm, git mv, and git ls-files when
> > completing an empty path component. Explicit dot completion is still
> > preserved, so git rm . can still complete .gitignore.
> >
> > This removes the existing TODO expectations in t/t9902-completion.sh and
> > adds coverage for explicit dot completion.
>
> OK.
>
> > Validation:
> >
> > * git diff --check -- contrib/completion/git-completion.bash
> > t/t9902-completion.sh
> > * bash -n contrib/completion/git-completion.bash
> > * ./t9902-completion.sh
>
> I am not sure what you wanted to say with these lines. If you did
> the above to build confidence that your patch works, that would be
> great. Or are you telling readers to do these things and when they
> do not see any issues consider your patch perfect?
>
> What is missing around here in this cover letter is a description of
> how this iteration is different from the previous one. And ...
>
> > Zakariyah Ali (2):
> > completion: hide dotfiles for selected path completion
> > completion: hide dotfiles by default for path completion
> >
> > contrib/completion/git-completion.bash | 53 +++++++++++++++-----------
> > t/t9902-completion.sh | 19 ++++-----
> > 2 files changed, 40 insertions(+), 32 deletions(-)
> >
> >
> > base-commit: 9b7fa37559a1b95ee32e32858b0d038b4cf583e5
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2311%2Falibaba0010%2Fcompletion-hide-dotfiles-v3
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2311/alibaba0010/completion-hide-dotfiles-v3
> > Pull-Request: https://github.com/git/git/pull/2311
> >
> > Range-diff vs v2:
> >
> > 1: 056e239e06 = 1: 056e239e06 completion: hide dotfiles for selected path completion
> > -: ---------- > 2: 7482ee4645 completion: hide dotfiles by default for path completion
>
> ... I find this range diff very troubling. If we look at patch 2,
> it seems that it redoes some part of what is done in patch 1 saying
> "oops that was wrong, so let's do it better this time". Such a
> drunken-mans' walk that goes in one direction in an earlier step,
> only to be corrected to move to a different course, is now how we
"is not" :)
> want a new topic to be presented.
>
> The end result may be much easier to read, mostly thanks to updated
> loop in the awk script, so if we really want to pretend this as two
> patches for "small pieces are easier to digest" value, perhaps have
> [PATCH 1/2] that updates the awk script (without doing anything
> related to hide-dotfiles theme) to make it easier to read by not
> having multiple "print pfx p" in it, and then build on top of that
> improved base, have [PATCH 2/2] that adds the support to hide
> dotfiles, perhaps?
>
> Since the initial iteration was quite a while ago, I no longer
> remember the details of the review I gave, but I recall having hard
> time telling which callers of the complete-index-file helper hide
> dotfiles from their output and which callers do not hide them, and
> how the patch decided to choose which ones should and should not
> hide. Has it been improved and if so how? That is something we
> expect the cover letter to tell, too.
>
> Thanks.
>
--
D. Ben Knoble
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox