Git development
 help / color / mirror / Atom feed
* Re: [PATCH] ls-files: filter pathspec before lstat
From: Kristoffer Haugsbakk @ 2026-06-07 16:17 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, René Scharfe, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <CAJ-ks9nXybntsa5FCJVWSQ2u+hzxaMdrfCdL3D+vmzjO4e21kQ@mail.gmail.com>

On Sun, Jun 7, 2026, at 18:07, Tamir Duberstein wrote:
> On Sun, Jun 7, 2026 at 12:02 PM Kristoffer Haugsbakk
> <kristofferhaugsbakk@fastmail.com> wrote:
>>[snip]
>>
>> I have done the same thing in our company repo, crediting <LLM> for
>> authoring or co-authoring or helping with a specific thing. Using a
>> “people” trailer. But the intent was just to show how some LLM was
>> involved. So I think I am going to switch to the following trailer for
>> our company repo.
>>
>>     LLM: Yes
>
> This all sounds reasonable to me. The kernel has started asking for
> this trailer
> (https://github.com/torvalds/linux/commit/78d979db6cef557c171d6059cbce06c3db89c7ee)
> and I saw precedent in Git as recently as last month
> (https://github.com/git/git/commit/7a094d68a27e321a99c8ab6b700909e503904bd9)
> so I erred on the side of caution.
>
> I am also OK with this trailer being dropped or replaced on apply.

The most important thing to be aware of is “Use of Artificial
Intelligence (AI)” in `Documentation/SubmittingPatches`. :)

Thanks

^ permalink raw reply

* Re: [PATCH] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-07 16:07 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: git, René Scharfe, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <8f3bab63-3b37-4492-a39e-95e610a15a07@app.fastmail.com>

On Sun, Jun 7, 2026 at 12:02 PM Kristoffer Haugsbakk
<kristofferhaugsbakk@fastmail.com> wrote:
>
> On Sun, Jun 7, 2026, at 17:40, Tamir Duberstein wrote:
> >[snip]
> > Assisted-by: Codex gpt-5.5
>
> This is more of a Git for Windows trailer. The Git project doesn’t
> document its use.
>
> An aside here but these trailers attributing specific LLMs feels like
> etching “Peter was here” under some table. What benefit for the project
> does knowing that it was this version of Codex or Claude or something?
> A link to the prompt/conversation would provide provenance and show how
> the LLM was used. But three years from now, what information beyond the
> fact that an LLM was involved (any of them) does this offer?
>
> I can understand the benefit for the companies behind these LLMs to have
> these attributions in OSS projects.
>
> I have done the same thing in our company repo, crediting <LLM> for
> authoring or co-authoring or helping with a specific thing. Using a
> “people” trailer. But the intent was just to show how some LLM was
> involved. So I think I am going to switch to the following trailer for
> our company repo.
>
>     LLM: Yes

This all sounds reasonable to me. The kernel has started asking for
this trailer (https://github.com/torvalds/linux/commit/78d979db6cef557c171d6059cbce06c3db89c7ee)
and I saw precedent in Git as recently as last month
(https://github.com/git/git/commit/7a094d68a27e321a99c8ab6b700909e503904bd9)
so I erred on the side of caution.

I am also OK with this trailer being dropped or replaced on apply.

^ permalink raw reply

* Re: [PATCH] ls-files: filter pathspec before lstat
From: Kristoffer Haugsbakk @ 2026-06-07 16:02 UTC (permalink / raw)
  To: Tamir Duberstein, git
  Cc: René Scharfe, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com>

On Sun, Jun 7, 2026, at 17:40, Tamir Duberstein wrote:
>[snip]
> Assisted-by: Codex gpt-5.5

This is more of a Git for Windows trailer. The Git project doesn’t
document its use.

An aside here but these trailers attributing specific LLMs feels like
etching “Peter was here” under some table. What benefit for the project
does knowing that it was this version of Codex or Claude or something?
A link to the prompt/conversation would provide provenance and show how
the LLM was used. But three years from now, what information beyond the
fact that an LLM was involved (any of them) does this offer?

I can understand the benefit for the companies behind these LLMs to have
these attributions in OSS projects.

I have done the same thing in our company repo, crediting <LLM> for
authoring or co-authoring or helping with a specific thing. Using a
“people” trailer. But the intent was just to show how some LLM was
involved. So I think I am going to switch to the following trailer for
our company repo.

    LLM: Yes

> Signed-off-by: Tamir Duberstein <tamird@gmail.com>
> ---
>[snip]

^ permalink raw reply

* [PATCH] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-07 15:40 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, Patrick Steinhardt, Junio C Hamano,
	Tamir Duberstein

show_files() checks whether each index entry is deleted or modified
before show_ce() applies the pathspec. prune_index() avoids most of this
work for pathspecs with a common directory prefix, but a top-level name
or leading wildcard leaves every entry to be checked.

Match the pathspec before lstat() for the deleted and modified modes.
Keep the later match in show_ce() so --error-unmatch is satisfied only
by entries that are actually shown.

On a repository with 859,211 index entries, a 19,931,862-byte index, and
25,303,439 packed objects occupying 21.13 GiB, I ran the following
command with the parent and patched binaries:

    hyperfine --warmup 0 --runs 3 \
        'git -c core.fsmonitor=false ls-files --deleted -- README.md'

The results were:

             parent       this commit
  elapsed    60.742 s     1.061 s
  user        1.117 s     0.963 s
  system     10.740 s     0.042 s

Both revisions were built with -O3, -mcpu=native, and ThinLTO using
Apple clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro
(Mac16,6) with a 16-core Apple M4 Max (12 performance and four
efficiency cores) and 128 GB RAM.

Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 builtin/ls-files.c                  |  7 +++++++
 t/meson.build                       |  1 +
 t/perf/p3010-ls-files.sh            | 27 +++++++++++++++++++++++++++
 t/t3010-ls-files-killed-modified.sh | 18 ++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1a22b41b9..702c607183 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -450,6 +450,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		/* Only entries shown by show_ce() satisfy --error-unmatch. */
+		if (pathspec.nr &&
+		    !match_pathspec(repo->index, &pathspec, fullname.buf,
+				    fullname.len, max_prefix_len, NULL,
+				    S_ISDIR(ce->ce_mode) ||
+				    S_ISGITLINK(ce->ce_mode)))
+			continue;
 		stat_err = lstat(fullname.buf, &st);
 		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
 			error_errno("cannot lstat '%s'", fullname.buf);
diff --git a/t/meson.build b/t/meson.build
index 2af8d01279..ee8086e6ef 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -1140,6 +1140,7 @@ benchmarks = [
   'perf/p1500-graph-walks.sh',
   'perf/p1501-rev-parse-oneline.sh',
   'perf/p2000-sparse-operations.sh',
+  'perf/p3010-ls-files.sh',
   'perf/p3400-rebase.sh',
   'perf/p3404-rebase-interactive.sh',
   'perf/p4000-diff-algorithms.sh',
diff --git a/t/perf/p3010-ls-files.sh b/t/perf/p3010-ls-files.sh
new file mode 100755
index 0000000000..bb80768063
--- /dev/null
+++ b/t/perf/p3010-ls-files.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+
+test_description='Tests ls-files worktree performance'
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_expect_success 'select a zero-prefix pathspec' '
+	tracked_file=$(git ls-files | sed -n 1p) &&
+	test -n "$tracked_file" &&
+	pathspec="?${tracked_file#?}" &&
+	test_export pathspec
+'
+
+test_perf 'ls-files --deleted with pathspec' '
+	git -c core.fsmonitor=false ls-files --deleted \
+		-- "$pathspec" >/dev/null
+'
+
+test_perf 'ls-files --modified with pathspec' '
+	git -c core.fsmonitor=false ls-files --modified \
+		-- "$pathspec" >/dev/null
+'
+
+test_done
diff --git a/t/t3010-ls-files-killed-modified.sh b/t/t3010-ls-files-killed-modified.sh
index 7af4532cd1..6e38e10219 100755
--- a/t/t3010-ls-files-killed-modified.sh
+++ b/t/t3010-ls-files-killed-modified.sh
@@ -124,4 +124,22 @@ test_expect_success 'validate git ls-files -m output.' '
 	test_cmp .expected .output
 '
 
+test_expect_success 'worktree modes honor wildcard pathspecs' '
+	cat >.expected <<-\EOF &&
+	path2/file2
+	path3/file3
+	EOF
+	git ls-files --deleted -- "path?/file?" >.output &&
+	test_cmp .expected .output &&
+
+	cat >.expected <<-\EOF &&
+	path7
+	path8
+	EOF
+	git ls-files --modified --error-unmatch -- "path[78]" >.output &&
+	test_cmp .expected .output &&
+
+	test_must_fail git ls-files --modified --error-unmatch -- path10
+'
+
 test_done

---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-ls-files-pathspec-lstat-885125a5d644

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>


^ permalink raw reply related

* Re: [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process
From: Johannes Schindelin @ 2026-06-07 14:36 UTC (permalink / raw)
  To: Michael Montalbo via GitGitGadget; +Cc: git, Michael Montalbo, Michael Montalbo
In-Reply-To: <d044fa0ee5c9cda7dfe4f663f34443103521ef43.1780087700.git.gitgitgadget@gmail.com>

Hi Michael,

I stumbled about this patch when it broke CI in Git for Windows, where we
do _not_ use `NO_PYTHON`, even though Python is unavailable in the
build/test CI jobs. The existing tests handle this situation gracefully,
this here patch does not:

On Sun, 7 Jun 2026, Michael Montalbo via GitGitGadget wrote:

> diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
> new file mode 100755
> index 0000000000..f159cd86d8
> --- /dev/null
> +++ b/t/t4080-diff-process.sh
> @@ -0,0 +1,538 @@
> +#!/bin/sh
> +
> +test_description='diff process via long-running process'
> +
> +. ./test-lib.sh
> +
> +if test_have_prereq PYTHON
> +then
> +	PYTHON_PATH=$(command -v python3) || PYTHON_PATH=$(command -v python)

When neither `python3` nor `python` are available (which is the case in
the minimal Git for Windows SDK used in Git's CI runs), this fails under
`set -e`. Before even running the first test case. Resulting in an
unexpected TAP format error.

Now, we could "fix" this by imitating what `lib-p4` does (see
https://github.com/dscho/git/commit/bd0b5570c744f678911a67a62da63f30655f20d8
which demonstrates that it is indeed a work-around on Windows):

-- snip --
diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
index fdf6da1c341e67..bd22c247ff3856 100755
--- a/t/t4080-diff-process.sh
+++ b/t/t4080-diff-process.sh
@@ -4,9 +4,10 @@ test_description='diff process via long-running process'
 
 . ./test-lib.sh
 
-if test_have_prereq PYTHON
+if ! test_have_prereq PYTHON || ! test -x "$PYTHON_PATH"
 then
-	PYTHON_PATH=$(command -v python3) || PYTHON_PATH=$(command -v python)
+	skip_all='python interpreter not available'
+	test_done
 fi
 
 #
-- snap --

Of course, this uncovers _another_ problem with the Python script: It uses
Python3-only `f"..."` format strings, which cannot be handled by the
Python2 to which the `PYTHON_PATH` variable in `linux-TEST-vars` points.
So this requires _another follow-up (see also
https://github.com/dscho/git/commit/c12a9f4c80e5ce8db0fe370fac46fb45be2b775f):

-- snip --
diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
index bd22c247ff3856..ba14682a9086e4 100755
--- a/t/t4080-diff-process.sh
+++ b/t/t4080-diff-process.sh
@@ -39,7 +39,8 @@ setup_backend () {
 
 	def write_pkt(line):
 	    data = (line + "\n").encode()
-	    sys.stdout.buffer.write(f"{len(data)+4:04x}".encode() + data)
+	    hdr = "{:04x}".format(len(data) + 4).encode()
+	    sys.stdout.buffer.write(hdr + data)
 	    sys.stdout.buffer.flush()
 
 	def write_flush():
@@ -98,7 +99,8 @@ setup_backend () {
 	    new = read_content()
 	    old_first = old.split(b"\n")[0].decode(errors="replace") if old else ""
 	    new_first = new.split(b"\n")[0].decode(errors="replace") if new else ""
-	    log(f"command={cmd} pathname={pathname} old={old_first} new={new_first}")
+	    log("command={} pathname={} old={} new={}".format(
+	        cmd, pathname, old_first, new_first))
 
 	    if mode == "error":
 	        write_flush()
@@ -130,7 +132,7 @@ setup_backend () {
 	        else:
 	            ol = old.count(b"\n")
 	            nl = new.count(b"\n")
-	            write_pkt(f"hunk 1 {ol} 1 {nl}")
+	            write_pkt("hunk 1 {} 1 {}".format(ol, nl))
 	        write_flush()
 	        write_pkt("status=success")
 	        write_flush()
-- snap --

And this is still not enough to make it work with Python2, see
https://github.com/dscho/git/actions/runs/27091523842/job/79955895737:

-- snip --
[...]
+ git -c diff.cdiff.process=./diff-process-backend --mode=fixed-hunk diff boundary.c
  Traceback (most recent call last):
    File "/__w/git/git/t/trash directory.t4080-diff-process/diff-process-backend.py", line 45, in <module>
      assert read_pkt() == "git-diff-client"
    File "/__w/git/git/t/trash directory.t4080-diff-process/diff-process-backend.py", line 4, in read_pkt
      hdr = sys.stdin.buffer.read(4)
  AttributeError: 'file' object has no attribute 'buffer'
-- snap --

I have experienced similar patterns in my career, where a single decision
required multiple follow-up fixes _just_ to avoid having to revert that
decision. This kind of doubling down has never ended well.

Therefore I would like to take a step back, and ask: Is it _really_ a good
idea to use Python here? Are we certain that we want to _require_ Python
to run this test and skip it if Python isn't available (as is the case in
the Windows-related parts of Git's very own CI) even if Python has nothing
at all to do with the feature that is being tested?

I don't want to be doomed to repeat history, and we can very well learn
e.g. from prior art in this very project, where the tests for the
clean/smudge filters (which _also_ want to speak pkt-line over stdio)
needlessly incurred Perl as a requirement to run the tests. It was
Matheus's heroic work in 52917a998ef3a (t0021: implementation the
rot13-filter.pl script in C, 2022-08-14) and 4d1d843be7a15 (tests: use the
new C rot13-filter helper to avoid PERL prereq, 2022-08-14) that avoided
that unnecessary prerequisite.

Likewise, there is `test-tool pkt-line` intended for driving the pkt-line
protocol via simple shell scripts.

So the conscious project direction has been: fold pkt-line test backends
into `test-tool` and drop the scripting-language prereq. Reintroducing the
same shape in Python would walk this back.

Patrick's careful effort in 27bd8ee311719 (Merge branch 'ps/fewer-perl',
2025-04-29) has been another clear sign that the Git project is actively
_removing_ scripting-language dependencies from the build and test
infrastructure, not adding new ones.

The clear prior art in Git's own tests for what t4080 wants to do, as of
today, is `t/helper/test-rot13-filter.c`, which could be imitated here
instead of (re-)introducing a dependency on a scripting language other
than Unix shell in Git's test suite.

The `PYTHON` prereq exists in exactly five files today, all `git p4`
related (where Python is an intrinsic prerequisite given that `git-p4.py`
_is_ written in Python): `t/lib-git-p4.sh`, `t/t9802-git-p4-filetype.sh`,
`t/t9810-git-p4-rcs.sh`, `t/t9835-git-p4-metadata-encoding-python2.sh`,
and `t/t9836-git-p4-metadata-encoding-python3.sh`.

After 7cdbff14d482 (remove merge-recursive-old, 2006-11-20), this here
patch would be the first one, after almost 20 years, to re-introduce
Python as a dependency outside `git p4`.

And it would also be the first ever to embed a Python script as a heredoc:

> +fi
> +
> +#
> +# A single parametric diff process.
> +# Usage: diff-process-backend --mode=<mode> [--log=<path>]
> +#
> +# Modes:
> +#   whole-file  - report all lines as changed (default)
> +#   fixed-hunk  - always report hunk 5 2 5 2
> +#   bad-hunk    - report out-of-bounds hunk 999 1 999 1
> +#   bad-sync    - report hunk with mismatched unchanged totals
> +#   overlap     - report two overlapping hunks
> +#   no-hunks   - return no hunks (files considered equivalent)
> +#   error       - return status=error for every request
> +#   abort       - return status=abort for every request
> +#   crash       - read one request then exit without responding
> +#
> +setup_backend () {
> +	cat >"$TRASH_DIRECTORY/diff-process-backend.py" <<-\PYEOF
> +	import sys, os
> +
> +	def read_pkt():
> +	    hdr = sys.stdin.buffer.read(4)
> +	    if len(hdr) < 4: return None
> +	    length = int(hdr, 16)
> +	    if length == 0: return ""
> +	    data = sys.stdin.buffer.read(length - 4)
> +	    return data.decode().rstrip("\n")
> +
> +	def write_pkt(line):
> +	    data = (line + "\n").encode()
> +	    sys.stdout.buffer.write(f"{len(data)+4:04x}".encode() + data)
> +	    sys.stdout.buffer.flush()
> +
> +	def write_flush():
> +	    sys.stdout.buffer.write(b"0000")
> +	    sys.stdout.buffer.flush()
> +
> +	def read_content():
> +	    chunks = []
> +	    while True:
> +	        hdr = sys.stdin.buffer.read(4)
> +	        if len(hdr) < 4: break
> +	        length = int(hdr, 16)
> +	        if length == 0: break
> +	        chunks.append(sys.stdin.buffer.read(length - 4))
> +	    return b"".join(chunks)
> +
> +	mode = "whole-file"
> +	logfile = None
> +	for arg in sys.argv[1:]:
> +	    if arg.startswith("--mode="):
> +	        mode = arg[7:]
> +	    elif arg.startswith("--log="):
> +	        logfile = open(arg[6:], "a")
> +
> +	def log(msg):
> +	    if logfile:
> +	        logfile.write(msg + "\n")
> +	        logfile.flush()
> +
> +	# Handshake
> +	assert read_pkt() == "git-diff-client"
> +	assert read_pkt() == "version=1"
> +	read_pkt()
> +	write_pkt("git-diff-server")
> +	write_pkt("version=1")
> +	write_flush()
> +	while True:
> +	    p = read_pkt()
> +	    if p == "": break
> +	write_pkt("capability=hunks")
> +	write_flush()
> +
> +	log("ready")
> +
> +	while True:
> +	    cmd = None
> +	    pathname = None
> +	    while True:
> +	        p = read_pkt()
> +	        if p is None: sys.exit(0)
> +	        if p == "": break
> +	        if p.startswith("command="): cmd = p.split("=",1)[1]
> +	        if p.startswith("pathname="): pathname = p.split("=",1)[1]
> +	    if cmd is None: sys.exit(0)
> +	    old = read_content()
> +	    new = read_content()
> +	    old_first = old.split(b"\n")[0].decode(errors="replace") if old else ""
> +	    new_first = new.split(b"\n")[0].decode(errors="replace") if new else ""
> +	    log(f"command={cmd} pathname={pathname} old={old_first} new={new_first}")
> +
> +	    if mode == "error":
> +	        write_flush()
> +	        write_pkt("status=error")
> +	        write_flush()
> +	        continue
> +
> +	    if mode == "abort":
> +	        write_flush()
> +	        write_pkt("status=abort")
> +	        write_flush()
> +	        continue
> +
> +	    if mode == "crash":
> +	        sys.exit(1)
> +
> +	    if cmd == "hunks":
> +	        if mode == "fixed-hunk":
> +	            write_pkt("hunk 5 2 5 2")
> +	        elif mode == "bad-hunk":
> +	            write_pkt("hunk 999 1 999 1")
> +	        elif mode == "bad-sync":
> +	            write_pkt("hunk 1 2 1 1")
> +	        elif mode == "overlap":
> +	            write_pkt("hunk 1 5 1 5")
> +	            write_pkt("hunk 3 2 3 2")
> +	        elif mode == "no-hunks":
> +	            pass
> +	        else:
> +	            ol = old.count(b"\n")
> +	            nl = new.count(b"\n")
> +	            write_pkt(f"hunk 1 {ol} 1 {nl}")
> +	        write_flush()
> +	        write_pkt("status=success")
> +	        write_flush()
> +	    else:
> +	        write_flush()
> +	        write_pkt("status=error")
> +	        write_flush()
> +	PYEOF

The existing pattern is to provide larger scripts as fixtures in
associated `t/tNNNN/` directories, not as heredoc, see e.g.
`t/t1509/prepare-chroot.sh`. Writing scripts, especially lengthy ones, in
heredoc strings makes it virtually impossible to use static code analysis
or syntax highlighting to fend off banal errors.

Given the complexity of what t4080 tries to test (error, abort, crash,
bad-sync, no-hunks, multiple files in one session, capability
negotiation), it would unfortunately be infeasible to use `test-tool
pkt-line` from a shell script implementing that `diff.*.process` protocol.

So I've spiked a demo how the `test-tool diff-process-backend` could look
like (letting Opus do the menial typing, so that I can enjoy at least part
of a sunny Sunday outside), which also passes the CI build and test:
https://github.com/dscho/git/commit/b6e3c93381b00929476c3a00155f7cf7334a22e6

That commit is of course not intended to be used as-is; Feel free to pick
code parts of it and integrate them into your topic branch. Or write your
own test-tool helper from scratch if that's more your jam.

Ciao,
Johannes

> +	write_script diff-process-backend <<-SHEOF
> +	exec "$PYTHON_PATH" "$TRASH_DIRECTORY/diff-process-backend.py" "\$@"
> +	SHEOF
> +}
> +
> +BACKEND="./diff-process-backend"
> +
> +test_expect_success PYTHON 'setup' '
> +	setup_backend &&
> +	echo "*.c diff=cdiff" >.gitattributes &&
> +	git add .gitattributes &&
> +
> +	# boundary.c: 10 lines, changes at 5-6 and 9-10.
> +	# Used by: hunk boundaries, error fallback, crash, bad hunks, overlap.
> +	cat >boundary.c <<-\EOF &&
> +	line1
> +	line2
> +	line3
> +	line4
> +	OLD5
> +	OLD6
> +	line7
> +	line8
> +	OLD9
> +	OLD10
> +	EOF
> +	git add boundary.c &&
> +
> +	# worddiff.c: single-line function, value changes 1 -> 999.
> +	# Used by: word-diff, --diff-algorithm, --no-ext-diff, --stat.
> +	cat >worddiff.c <<-\EOF &&
> +	int value(void) { return 1; }
> +	EOF
> +	git add worddiff.c &&
> +
> +	# newfile.c: single-line function, value changes 42 -> 99.
> +	# Used by: new file, --exit-code, multiple drivers.
> +	cat >newfile.c <<-\EOF &&
> +	int new_func(void) { return 42; }
> +	EOF
> +	git add newfile.c &&
> +
> +	# logtest.c: single-line function for log/format-patch tests.
> +	# Needs two commits so log -1 has a diff.
> +	cat >logtest.c <<-\EOF &&
> +	int logfunc(void) { return 1; }
> +	EOF
> +	git add logtest.c &&
> +
> +	# two.c/one.c: two-file pair for error/abort/startup-failure tests.
> +	cat >one.c <<-\EOF &&
> +	int first(void) { return 1; }
> +	EOF
> +	cat >two.c <<-\EOF &&
> +	int second(void) { return 2; }
> +	EOF
> +	git add one.c two.c &&
> +
> +	git commit -m "initial" &&
> +
> +	# Second commit for logtest.c (so log -1 has something to show).
> +	cat >logtest.c <<-\EOF &&
> +	int logfunc(void) { return 2; }
> +	EOF
> +	git add logtest.c &&
> +	git commit -m "change logtest.c" &&
> +
> +	# Working tree modifications (not committed).
> +	cat >boundary.c <<-\EOF &&
> +	line1
> +	line2
> +	line3
> +	line4
> +	NEW5
> +	NEW6
> +	line7
> +	line8
> +	NEW9
> +	NEW10
> +	EOF
> +
> +	cat >worddiff.c <<-\EOF &&
> +	int value(void) { return 999; }
> +	EOF
> +
> +	cat >newfile.c <<-\EOF &&
> +	int new_func(void) { return 99; }
> +	EOF
> +
> +	cat >one.c <<-\EOF &&
> +	int first(void) { return 10; }
> +	EOF
> +
> +	cat >two.c <<-\EOF
> +	int second(void) { return 20; }
> +	EOF
> +'
> +
> +#
> +# Core behavior: the tool controls which lines are marked as changed.
> +#
> +
> +test_expect_success PYTHON 'diff process hunk boundaries affect output' '
> +	# The file has changes at lines 5-6 and 9-10, but fixed-hunk
> +	# only reports lines 5-6 as changed.  Lines 9-10 should not
> +	# appear as changed in the output.
> +	git -c diff.cdiff.process="$BACKEND --mode=fixed-hunk" \
> +		diff boundary.c >actual &&
> +	test_grep "^-OLD5" actual &&
> +	test_grep "^-OLD6" actual &&
> +	test_grep "^+NEW5" actual &&
> +	test_grep "^+NEW6" actual &&
> +	test_grep ! "^-OLD9" actual &&
> +	test_grep ! "^-OLD10" actual &&
> +	test_grep ! "^+NEW9" actual &&
> +	test_grep ! "^+NEW10" actual
> +'
> +
> +test_expect_success PYTHON 'diff process works with new file' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff -- newfile.c >actual 2>stderr &&
> +	test_grep "return 99" actual &&
> +	test_grep "pathname=newfile.c" backend.log &&
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process works with added file (empty old side)' '
> +	cat >added.c <<-\EOF &&
> +	int added(void) { return 1; }
> +	EOF
> +	git add added.c &&
> +
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff --cached -- added.c >actual 2>stderr &&
> +	test_grep "added" actual &&
> +	test_grep "pathname=added.c" backend.log &&
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process skipped for binary files' '
> +	printf "\\0binary" >binary.c &&
> +	git add binary.c &&
> +	git commit -m "add binary" &&
> +	printf "\\0changed" >binary.c &&
> +
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff -- binary.c >actual &&
> +	test_grep "Binary files" actual &&
> +	test_path_is_missing backend.log
> +'
> +
> +test_expect_success PYTHON 'diff process not consulted for unmatched driver' '
> +	echo "not tracked by cdiff" >unmatched.txt &&
> +	git add unmatched.txt &&
> +	git commit -m "add unmatched.txt" &&
> +
> +	echo "modified" >unmatched.txt &&
> +
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff -- unmatched.txt >actual &&
> +	test_grep "modified" actual &&
> +	test_path_is_missing backend.log
> +'
> +
> +test_expect_success PYTHON 'multiple drivers use separate processes' '
> +	echo "*.h diff=hdiff" >>.gitattributes &&
> +	git add .gitattributes &&
> +
> +	cat >multi.h <<-\EOF &&
> +	int header(void) { return 1; }
> +	EOF
> +	git add multi.h &&
> +	git commit -m "add multi.h" &&
> +
> +	cat >multi.h <<-\EOF &&
> +	int header(void) { return 2; }
> +	EOF
> +
> +	rm -f backend-c.log backend-h.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend-c.log" \
> +	    -c diff.hdiff.process="$BACKEND --log=backend-h.log" \
> +		diff -- newfile.c multi.h >actual 2>stderr &&
> +	test_grep "pathname=newfile.c" backend-c.log &&
> +	test_grep "pathname=multi.h" backend-h.log &&
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process works alongside textconv' '
> +	write_script uppercase-filter <<-\EOF &&
> +	tr "a-z" "A-Z" <"$1"
> +	EOF
> +
> +	cat >textconv.c <<-\EOF &&
> +	hello world
> +	EOF
> +	git add textconv.c &&
> +	git commit -m "add textconv.c" &&
> +
> +	cat >textconv.c <<-\EOF &&
> +	goodbye world
> +	EOF
> +
> +	rm -f backend.log &&
> +	git -c diff.cdiff.textconv="./uppercase-filter" \
> +	    -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff -- textconv.c >actual 2>stderr &&
> +	# The diff process receives textconv-transformed (uppercase) content.
> +	test_grep "pathname=textconv.c" backend.log &&
> +	test_grep "old=HELLO WORLD" backend.log &&
> +	test_grep "new=GOODBYE WORLD" backend.log &&
> +	test_must_be_empty stderr
> +'
> +
> +#
> +# Downstream features: word diff, log, equivalent files, exit code.
> +#
> +
> +test_expect_success PYTHON 'diff process with --word-diff' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff --word-diff worddiff.c >actual 2>stderr &&
> +	test_grep "\[-1;-\]" actual &&
> +	test_grep "{+999;+}" actual &&
> +	test_grep "pathname=worddiff.c" backend.log &&
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process works with git log -p' '
> +	# With no-hunks mode, the tool says the files are equivalent,
> +	# so log -p should show the commit but no diff content.
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --mode=no-hunks --log=backend.log" \
> +		log -1 -p -- logtest.c >actual 2>stderr &&
> +	test_grep "change logtest.c" actual &&
> +	test_grep ! "return 2" actual &&
> +	test_grep "command=hunks pathname=logtest.c" backend.log &&
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process no hunks suppresses diff output' '
> +	cat >nohunks.c <<-\EOF &&
> +	int zero(void) { return 0; }
> +	EOF
> +	git add nohunks.c &&
> +	git commit -m "add nohunks.c" &&
> +
> +	cat >nohunks.c <<-\EOF &&
> +	int zero(void) { return 999; }
> +	EOF
> +
> +	git -c diff.cdiff.process="$BACKEND --mode=no-hunks" \
> +		diff nohunks.c >actual &&
> +	test_must_be_empty actual
> +'
> +
> +test_expect_success PYTHON 'diff process no hunks with --exit-code returns success' '
> +	git -c diff.cdiff.process="$BACKEND --mode=no-hunks" \
> +		diff --exit-code nohunks.c
> +'
> +
> +test_expect_success PYTHON 'diff process with --exit-code and hunks returns failure' '
> +	test_expect_code 1 git -c diff.cdiff.process="$BACKEND" \
> +		diff --exit-code newfile.c
> +'
> +
> +#
> +# Bypass mechanisms: flags and commands that skip the diff process.
> +#
> +
> +test_expect_success PYTHON 'diff process bypassed by --diff-algorithm' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff --diff-algorithm=patience worddiff.c >actual &&
> +	test_grep "return 999" actual &&
> +	test_path_is_missing backend.log
> +'
> +
> +test_expect_success PYTHON 'diff process not used by --stat' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --log=backend.log" \
> +		diff --stat worddiff.c >actual &&
> +	test_grep "worddiff.c" actual &&
> +	test_path_is_missing backend.log
> +'
> +
> +#
> +# Error handling and fallback.
> +#
> +
> +test_expect_success PYTHON 'diff process fallback on tool error status' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --mode=error --log=backend.log" \
> +		diff boundary.c >actual 2>stderr &&
> +	# Fallback produces the full builtin diff (both change regions).
> +	test_grep "^-OLD5" actual &&
> +	test_grep "^+NEW5" actual &&
> +	test_grep "^-OLD9" actual &&
> +	test_grep "^+NEW9" actual &&
> +	# Tool was contacted (it replied with error, not crash).
> +	test_grep "command=hunks pathname=boundary.c" backend.log &&
> +	test_grep "diff process.*failed" stderr
> +'
> +
> +test_expect_success PYTHON 'diff process error keeps tool available for next file' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --mode=error --log=backend.log" \
> +		diff -- one.c two.c >actual 2>stderr &&
> +	# Unlike abort, error keeps the tool available: both files
> +	# are sent to the tool (and both fall back).
> +	test_grep "pathname=one.c" backend.log &&
> +	test_grep "pathname=two.c" backend.log &&
> +	test_grep "return 10" actual &&
> +	test_grep "return 20" actual
> +'
> +
> +test_expect_success PYTHON 'diff process abort disables for session' '
> +	rm -f backend.log &&
> +	git -c diff.cdiff.process="$BACKEND --mode=abort --log=backend.log" \
> +		diff -- one.c two.c >actual &&
> +	# Both files should still produce diff output via fallback.
> +	test_grep "return 10" actual &&
> +	test_grep "return 20" actual &&
> +	# The tool aborts on the first file and git clears its
> +	# capability.  The second file never contacts the tool.
> +	test_grep "pathname=one.c" backend.log &&
> +	test_grep ! "pathname=two.c" backend.log
> +'
> +
> +test_expect_success PYTHON 'diff process fallback on tool crash' '
> +	git -c diff.cdiff.process="$BACKEND --mode=crash" \
> +		diff boundary.c >actual 2>stderr &&
> +	test_grep "^-OLD5" actual &&
> +	test_grep "^+NEW5" actual &&
> +	test_grep "^-OLD9" actual &&
> +	test_grep "^+NEW9" actual &&
> +	# Crash is a communication failure, so a warning is emitted.
> +	test_grep "diff process.*failed" stderr
> +'
> +
> +test_expect_success PYTHON 'diff process startup failure only warns once' '
> +	git -c diff.cdiff.process="/nonexistent/tool" \
> +		diff -- one.c two.c >actual 2>stderr &&
> +	# Both files produce diff output via fallback.
> +	test_grep "return 10" actual &&
> +	test_grep "return 20" actual &&
> +	# Sentinel prevents repeated warnings: only one, not one per file.
> +	test_grep "diff process.*failed" stderr >warnings &&
> +	test_line_count = 1 warnings
> +'
> +
> +test_expect_success PYTHON 'diff process fallback on bad hunks' '
> +	git -c diff.cdiff.process="$BACKEND --mode=bad-hunk" \
> +		diff boundary.c >actual 2>stderr &&
> +	test_grep "^-OLD5" actual &&
> +	test_grep "^+NEW5" actual &&
> +	test_grep "^-OLD9" actual &&
> +	test_grep "^+NEW9" actual &&
> +	# Invalid hunks are caught by xdiff validation, not the
> +	# protocol layer, so no warning is emitted.
> +	test_must_be_empty stderr
> +'
> +
> +test_expect_success PYTHON 'diff process fallback on mismatched unchanged totals' '
> +	cat >synctest.c <<-\EOF &&
> +	line1
> +	line2
> +	line3
> +	EOF
> +	git add synctest.c &&
> +	git commit -m "add synctest.c" &&
> +
> +	cat >synctest.c <<-\EOF &&
> +	line1
> +	changed
> +	line3
> +	EOF
> +
> +	# bad-sync reports hunk 1 2 1 1: marks 2 old lines and 1 new
> +	# line as changed, leaving 1 unchanged old vs 2 unchanged new.
> +	# The synchronization invariant fails and git falls back.
> +	git -c diff.cdiff.process="$BACKEND --mode=bad-sync" \
> +		diff synctest.c >actual 2>stderr &&
> +	test_grep "changed" actual
> +'
> +
> +test_expect_success PYTHON 'diff process fallback on overlapping hunks' '
> +	# boundary.c has 10 lines, so both hunks are in bounds
> +	# but they overlap at lines 3-5, triggering the ordering check.
> +	git -c diff.cdiff.process="$BACKEND --mode=overlap" \
> +		diff boundary.c >actual 2>stderr &&
> +	test_grep "NEW5" actual
> +'
> +
> +test_done
> diff --git a/userdiff.h b/userdiff.h
> index 51c26e0d41..a98eabe377 100644
> --- a/userdiff.h
> +++ b/userdiff.h
> @@ -3,6 +3,7 @@
>  
>  #include "notes-cache.h"
>  
> +struct diff_subprocess;
>  struct index_state;
>  struct repository;
>  
> @@ -33,6 +34,8 @@ struct userdiff_driver {
>  	int textconv_want_cache;
>  	const char *process;
>  	char *process_owned;
> +	struct diff_subprocess *diff_subprocess;
> +	unsigned diff_process_failed : 1;
>  };
>  enum userdiff_driver_type {
>  	USERDIFF_DRIVER_TYPE_BUILTIN = 1<<0,
> -- 
> gitgitgadget
> 
> 
> 

^ permalink raw reply related

* Re: [PATCH v2] prio-queue: use cascade-down for faster extract-min
From: Kristofer Karlsson @ 2026-06-07 12:07 UTC (permalink / raw)
  To: René Scharfe; +Cc: Kristofer Karlsson via GitGitGadget, git
In-Reply-To: <1aa5b755-0f74-46d5-bd6e-a9cb7f3fbb12@web.de>

On Sat, 7 Jun 2026 at 09:30, Rene Scharfe <l.s.r@web.de> wrote:
>
> Right.  I was wondering, though: Why is sift-down so much faster than
> cascade in the describe benchmark from 30598ccc4d (describe: use oidset
> in finish_depth_computation(), 2025-09-02)?
>
> I think I mostly understand it now: cascade is better in prio_queue_get()
> because the sift-down item is from the bottom and will likely end up back
> at the bottom, just of a different branch of the heap.  Thus a sift-down
> costs 3 compares times the number of levels, while a cascade costs just
> 2 compares times the number of levels and there is likely little to no
> need to sift it back up.
>
> For prio_queue_replace() we sift down a random item, though; we don't
> know where it will end up.  If it belongs at the very top then sift-down
> just needs 3 compares, while cascade needs 2 compares times the number
> of levels to bring the hole down and the same to bring the item up.

Yes, I think that reasoning is correct. It depends on where the item
will land.

For get() the last array element came from the bottom of the heap
and will almost certainly end up back near the bottom, so
cascade's blind descent is a good default.

For replace() the new element is arbitrary -- in describe's pattern
it often belongs near the root, so sift-down's early exit after
2-3 compares dominates.

Your benchmark confirms this: v2 (cascade only in get) matches the
baseline exactly for describe, while the hybrid is 4% slower.

> So I guess we keep the full sift-down for prio_queue_replace(), knowing
> that sometimes we have a lot of items that end up at or close to the
> root of the heap.

Agreed. And with the lazy-fold series (v3 just sent), replace() is
removed as a public API entirely. But the same principle applies to
the fused replace path inside prio_queue_put(): when get_pending is
set and a put arrives, we write the new element at the root and
sift it down -- that path should keep sift-down for the same reason
your analysis shows.

If (that's a big if) both my patches eventually land,
the split would be:
 - unfused get-flush (in get() and peek()): use cascade
 - fused replace (in put()): keep sift-down

Which is exactly the split your analysis predicts is optimal.

Now I am thinking it would be easier to reason about this if the other
patch lands first, since the cascade change becomes simpler to evaluate
when replace is already gone and only the unfused paths remain.

- Kristofer

^ permalink raw reply

* [PATCH v3 2/2] prio-queue: rename .nr to .nr_internal to prevent direct access
From: Kristofer Karlsson via GitGitGadget @ 2026-06-07 11:43 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2140.v3.git.1780832592.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

Rename the .nr member to .nr_internal so that callers outside
prio-queue.c that directly reference .nr get a compilation error.
This catches both existing misuse and future in-flight topics.

Add prio_queue_for_each() macro for callers that need to walk all
elements in the queue, accounting for the get_pending offset.

Convert all external .nr users:
 - Loop conditions: use prio_queue_size(), prio_queue_get(), or
   prio_queue_peek() as the loop condition
 - Array iterations: use prio_queue_for_each()

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 builtin/describe.c      |  7 +++---
 builtin/last-modified.c |  5 ++---
 builtin/show-branch.c   |  9 ++++----
 commit-reach.c          | 19 +++++++++-------
 fetch-pack.c            |  4 ++--
 negotiator/default.c    |  4 +++-
 negotiator/skipping.c   | 12 ++++++-----
 object-name.c           |  2 +-
 pack-bitmap-write.c     |  6 +++---
 path-walk.c             |  8 +++----
 prio-queue.c            | 48 +++++++++++++++++++++++------------------
 prio-queue.h            |  9 ++++++--
 revision.c              | 11 +++++-----
 13 files changed, 79 insertions(+), 65 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 85564f3487..64424543ef 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -258,10 +258,9 @@ static unsigned long finish_depth_computation(struct prio_queue *queue,
 	struct oidset unflagged = OIDSET_INIT;
 	struct commit *c;
 
-	for (size_t i = queue->get_pending; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
-		if (!(commit->object.flags & best->flag_within))
-			oidset_insert(&unflagged, &commit->object.oid);
+	prio_queue_for_each(queue, c) {
+		if (!(c->object.flags & best->flag_within))
+			oidset_insert(&unflagged, &c->object.oid);
 	}
 
 	while ((c = prio_queue_get(queue))) {
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index df2a508244..5478182f2e 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -344,7 +344,7 @@ static void process_parent(struct last_modified *lm,
 static int last_modified_run(struct last_modified *lm)
 {
 	int max_count, queue_popped = 0;
-	struct commit *c;
+	struct commit *c, *n;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue not_queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *list;
@@ -416,9 +416,8 @@ static int last_modified_run(struct last_modified *lm)
 		 */
 		repo_parse_commit(lm->rev.repo, c);
 
-		while (not_queue.nr) {
+		while ((n = prio_queue_get(&not_queue))) {
 			struct commit_list *np;
-			struct commit *n = prio_queue_get(&not_queue);
 
 			repo_parse_commit(lm->rev.repo, n);
 
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 9f7f28f339..2435e8aeda 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -62,11 +62,10 @@ static const char *get_color_reset_code(void)
 
 static struct commit *interesting(struct prio_queue *queue)
 {
-	for (size_t i = queue->get_pending; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
-		if (commit->object.flags & UNINTERESTING)
-			continue;
-		return commit;
+	struct commit *commit;
+	prio_queue_for_each(queue, commit) {
+		if (!(commit->object.flags & UNINTERESTING))
+			return commit;
 	}
 	return NULL;
 }
diff --git a/commit-reach.c b/commit-reach.c
index 0fec2f00be..dfe6016cb2 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -41,8 +41,8 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
 
 static int queue_has_nonstale(struct prio_queue *queue)
 {
-	for (size_t i = 0; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
+	struct commit *commit;
+	prio_queue_for_each(queue, commit) {
 		if (!(commit->object.flags & STALE))
 			return 1;
 	}
@@ -1069,6 +1069,7 @@ void ahead_behind(struct repository *r,
 		  struct commit **commits, size_t commits_nr,
 		  struct ahead_behind_count *counts, size_t counts_nr)
 {
+	struct commit *c;
 	struct prio_queue queue = { .compare = compare_commits_by_gen_then_commit_date };
 	size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
 
@@ -1085,17 +1086,19 @@ void ahead_behind(struct repository *r,
 	init_bit_arrays(&bit_arrays);
 
 	for (size_t i = 0; i < commits_nr; i++) {
-		struct commit *c = commits[i];
-		struct bitmap *bitmap = get_bit_array(c, width);
+		struct bitmap *bitmap;
+		c = commits[i];
+		bitmap = get_bit_array(c, width);
 
 		bitmap_set(bitmap, i);
 		insert_no_dup(&queue, c);
 	}
 
 	while (queue_has_nonstale(&queue)) {
-		struct commit *c = prio_queue_get(&queue);
 		struct commit_list *p;
-		struct bitmap *bitmap_c = get_bit_array(c, width);
+		struct bitmap *bitmap_c;
+		c = prio_queue_get(&queue);
+		bitmap_c = get_bit_array(c, width);
 
 		for (size_t i = 0; i < counts_nr; i++) {
 			int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
@@ -1135,8 +1138,8 @@ void ahead_behind(struct repository *r,
 
 	/* STALE is used here, PARENT2 is used by insert_no_dup(). */
 	repo_clear_commit_marks(r, PARENT2 | STALE);
-	for (size_t i = 0; i < queue.nr; i++)
-		free_bit_array(queue.array[i].data);
+	prio_queue_for_each(&queue, c)
+		free_bit_array(c);
 	clear_bit_arrays(&bit_arrays);
 	clear_prio_queue(&queue);
 }
diff --git a/fetch-pack.c b/fetch-pack.c
index 120e01f3cf..29c41132ee 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -662,8 +662,8 @@ static int mark_complete_oid(const struct reference *ref, void *cb_data UNUSED)
 static void mark_recent_complete_commits(struct fetch_pack_args *args,
 					 timestamp_t cutoff)
 {
-	while (complete.nr) {
-		struct commit *item = prio_queue_peek(&complete);
+	struct commit *item;
+	while ((item = prio_queue_peek(&complete))) {
 		if (item->date < cutoff)
 			break;
 		print_verbose(args, _("Marking %s as complete"),
diff --git a/negotiator/default.c b/negotiator/default.c
index 78d58d57ce..19cdf3808c 100644
--- a/negotiator/default.c
+++ b/negotiator/default.c
@@ -113,10 +113,12 @@ static const struct object_id *get_rev(struct negotiation_state *ns)
 		unsigned int mark;
 		struct commit_list *parents;
 
-		if (ns->rev_list.nr == 0 || ns->non_common_revs == 0)
+		if (ns->non_common_revs == 0)
 			return NULL;
 
 		commit = prio_queue_get(&ns->rev_list);
+		if (!commit)
+			return NULL;
 		repo_parse_commit(the_repository, commit);
 		parents = commit->parents;
 
diff --git a/negotiator/skipping.c b/negotiator/skipping.c
index 68c9b3b997..db90fa77b5 100644
--- a/negotiator/skipping.c
+++ b/negotiator/skipping.c
@@ -143,8 +143,7 @@ static int push_parent(struct data *data, struct entry *entry,
 		/*
 		 * Find the existing entry and use it.
 		 */
-		for (size_t i = 0; i < data->rev_list.nr; i++) {
-			parent_entry = data->rev_list.array[i].data;
+		prio_queue_for_each(&data->rev_list, parent_entry) {
 			if (parent_entry->commit == to_push)
 				goto parent_found;
 		}
@@ -181,10 +180,12 @@ static const struct object_id *get_rev(struct data *data)
 		struct commit_list *p;
 		int parent_pushed = 0;
 
-		if (data->rev_list.nr == 0 || data->non_common_revs == 0)
+		if (data->non_common_revs == 0)
 			return NULL;
 
 		entry = prio_queue_get(&data->rev_list);
+		if (!entry)
+			return NULL;
 		commit = entry->commit;
 		commit->object.flags |= POPPED;
 		if (!(commit->object.flags & COMMON))
@@ -253,8 +254,9 @@ static void have_sent(struct fetch_negotiator *n, struct commit *c)
 static void release(struct fetch_negotiator *n)
 {
 	struct data *data = n->data;
-	for (size_t i = 0; i < data->rev_list.nr; i++)
-		free(data->rev_list.array[i].data);
+	void *entry;
+	prio_queue_for_each(&data->rev_list, entry)
+		free(entry);
 	clear_prio_queue(&data->rev_list);
 	FREE_AND_NULL(data);
 }
diff --git a/object-name.c b/object-name.c
index 9ac86f19c7..2fedfe1761 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1208,7 +1208,7 @@ static int get_oid_oneline(struct repository *r,
 		l->item->object.flags |= ONELINE_SEEN;
 		prio_queue_put(&copy, l->item);
 	}
-	while (copy.nr) {
+	while (prio_queue_size(&copy)) {
 		const char *p, *buf;
 		struct commit *commit;
 		int matches;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index f7c63e3027..ed9714b135 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -514,6 +514,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      const uint32_t *mapping)
 {
 	struct commit *c;
+	struct tree *tree;
 	int found;
 	uint32_t pos;
 	if (!ent->bitmap)
@@ -574,9 +575,8 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 		}
 	}
 
-	while (tree_queue->nr) {
-		if (fill_bitmap_tree(writer, ent->bitmap,
-				     prio_queue_get(tree_queue)) < 0)
+	while ((tree = prio_queue_get(tree_queue))) {
+		if (fill_bitmap_tree(writer, ent->bitmap, tree) < 0)
 			return -1;
 	}
 	return 0;
diff --git a/path-walk.c b/path-walk.c
index 94ff90bd15..cf3b2d0765 100644
--- a/path-walk.c
+++ b/path-walk.c
@@ -699,6 +699,7 @@ int walk_objects_by_path(struct path_walk_info *info)
 	int ret;
 	size_t commits_nr = 0, paths_nr = 0;
 	struct commit *c;
+	char *path;
 	struct type_and_oid_list *root_tree_list;
 	struct type_and_oid_list *commit_list;
 	struct path_walk_context ctx = {
@@ -808,8 +809,7 @@ int walk_objects_by_path(struct path_walk_info *info)
 	free(commit_list);
 
 	trace2_region_enter("path-walk", "path-walk", info->revs->repo);
-	while (!ret && ctx.path_stack.nr) {
-		char *path = prio_queue_get(&ctx.path_stack);
+	while (!ret && (path = prio_queue_get(&ctx.path_stack))) {
 		paths_nr++;
 
 		ret = walk_path(&ctx, path);
@@ -821,12 +821,12 @@ int walk_objects_by_path(struct path_walk_info *info)
 	if (!strmap_empty(&ctx.paths_to_lists)) {
 		struct hashmap_iter iter;
 		struct strmap_entry *entry;
+		char *path;
 
 		strmap_for_each_entry(&ctx.paths_to_lists, &iter, entry)
 			push_to_stack(&ctx, entry->key);
 
-		while (!ret && ctx.path_stack.nr) {
-			char *path = prio_queue_get(&ctx.path_stack);
+		while (!ret && (path = prio_queue_get(&ctx.path_stack))) {
 			paths_nr++;
 
 			ret = walk_path(&ctx, path);
diff --git a/prio-queue.c b/prio-queue.c
index a03c617470..f96b810c15 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -22,16 +22,16 @@ void prio_queue_reverse(struct prio_queue *queue)
 
 	if (queue->compare)
 		BUG("prio_queue_reverse() on non-LIFO queue");
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return;
-	for (i = 0; i < (j = (queue->nr - 1) - i); i++)
+	for (i = 0; i < (j = (queue->nr_internal - 1) - i); i++)
 		swap(queue, i, j);
 }
 
 void clear_prio_queue(struct prio_queue *queue)
 {
 	FREE_AND_NULL(queue->array);
-	queue->nr = 0;
+	queue->nr_internal = 0;
 	queue->alloc = 0;
 	queue->insertion_ctr = 0;
 	queue->get_pending = 0;
@@ -41,13 +41,16 @@ static void sift_down_root(struct prio_queue *queue)
 {
 	size_t ix, child;
 
-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
-		child = ix * 2 + 1;
-		if (child + 1 < queue->nr &&
+	/* Push down the one at the root */
+	for (ix = 0; ix * 2 + 1 < queue->nr_internal; ix = child) {
+		child = ix * 2 + 1; /* left */
+		if (child + 1 < queue->nr_internal &&
 		    compare(queue, child, child + 1) >= 0)
-			child++;
+			child++; /* use right child */
+
 		if (compare(queue, ix, child) <= 0)
 			break;
+
 		swap(queue, child, ix);
 	}
 }
@@ -64,34 +67,37 @@ void prio_queue_put(struct prio_queue *queue, void *thing)
 		return;
 	}
 
-	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
-	queue->array[queue->nr].ctr = queue->insertion_ctr++;
-	queue->array[queue->nr].data = thing;
-	queue->nr++;
+	/* Append at the end */
+	ALLOC_GROW(queue->array, queue->nr_internal + 1, queue->alloc);
+	queue->array[queue->nr_internal].ctr = queue->insertion_ctr++;
+	queue->array[queue->nr_internal].data = thing;
+	queue->nr_internal++;
 	if (!queue->compare)
-		return;
+		return; /* LIFO */
 
-	for (ix = queue->nr - 1; ix; ix = parent) {
+	/* Bubble up the new one */
+	for (ix = queue->nr_internal - 1; ix; ix = parent) {
 		parent = (ix - 1) / 2;
 		if (compare(queue, parent, ix) <= 0)
 			break;
+
 		swap(queue, parent, ix);
 	}
 }
 
 void *prio_queue_get(struct prio_queue *queue)
 {
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
-		return queue->array[--queue->nr].data;
+		return queue->array[--queue->nr_internal].data; /* LIFO */
 
 	if (queue->get_pending) {
-		if (!--queue->nr) {
+		if (!--queue->nr_internal) {
 			queue->get_pending = 0;
 			return NULL;
 		}
-		queue->array[0] = queue->array[queue->nr];
+		queue->array[0] = queue->array[queue->nr_internal];
 		sift_down_root(queue);
 	}
 
@@ -101,16 +107,16 @@ void *prio_queue_get(struct prio_queue *queue)
 
 void *prio_queue_peek(struct prio_queue *queue)
 {
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
-		return queue->array[queue->nr - 1].data;
+		return queue->array[queue->nr_internal - 1].data;
 
 	if (queue->get_pending) {
 		queue->get_pending = 0;
-		if (!--queue->nr)
+		if (!--queue->nr_internal)
 			return NULL;
-		queue->array[0] = queue->array[queue->nr];
+		queue->array[0] = queue->array[queue->nr_internal];
 		sift_down_root(queue);
 	}
 
diff --git a/prio-queue.h b/prio-queue.h
index 482ab5e71d..f08ab87691 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -30,7 +30,7 @@ struct prio_queue {
 	prio_queue_compare_fn compare;
 	size_t insertion_ctr;
 	void *cb_data;
-	size_t alloc, nr;
+	size_t alloc, nr_internal; /* use prio_queue_size() for logical count */
 	struct prio_queue_entry *array;
 	unsigned get_pending;
 };
@@ -55,9 +55,14 @@ void *prio_queue_peek(struct prio_queue *);
 
 static inline size_t prio_queue_size(struct prio_queue *queue)
 {
-	return queue->nr - queue->get_pending;
+	return queue->nr_internal - queue->get_pending;
 }
 
+#define prio_queue_for_each(queue, it) \
+	for (size_t pq_ix_ = (queue)->get_pending; \
+	     pq_ix_ < (queue)->nr_internal && ((it) = (queue)->array[pq_ix_].data, 1); \
+	     pq_ix_++)
+
 void clear_prio_queue(struct prio_queue *);
 
 /* Reverse the LIFO elements */
diff --git a/revision.c b/revision.c
index 8ce8ffa43d..34e2d146f4 100644
--- a/revision.c
+++ b/revision.c
@@ -476,16 +476,15 @@ static struct commit *handle_commit(struct rev_info *revs,
 static int everybody_uninteresting(struct prio_queue *orig,
 				   struct commit **interesting_cache)
 {
-	size_t i;
+	struct commit *commit;
 
 	if (*interesting_cache) {
-		struct commit *commit = *interesting_cache;
+		commit = *interesting_cache;
 		if (!(commit->object.flags & UNINTERESTING))
 			return 0;
 	}
 
-	for (i = 0; i < orig->nr; i++) {
-		struct commit *commit = orig->array[i].data;
+	prio_queue_for_each(orig, commit) {
 		if (commit->object.flags & UNINTERESTING)
 			continue;
 
@@ -4027,8 +4026,8 @@ static enum rewrite_result rewrite_one_1(struct rev_info *revs,
 
 static void merge_queue_into_list(struct prio_queue *q, struct commit_list **list)
 {
-	while (q->nr) {
-		struct commit *item = prio_queue_peek(q);
+	struct commit *item;
+	while ((item = prio_queue_peek(q))) {
 		struct commit_list *p = *list;
 
 		if (p && p->item->date >= item->date)
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v3 1/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson via GitGitGadget @ 2026-06-07 11:43 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2140.v3.git.1780832592.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

Defer the actual removal in prio_queue_get() until the next
operation.  If that next operation is a prio_queue_put(), the
removal and insertion are fused into a single replace, writing
the new element at the root and sifting it down which avoids
a full remove-rebalance-insert cycle.

This matches the dominant usage pattern in git's commit traversal:
get a commit, then put its parents. The first parent insertion
after each get is now a replace operation automatically.

This generalizes the lazy_queue pattern from builtin/describe.c
(introduced in 08bb69d70f) into prio_queue itself. Three callers
independently implemented the same get+put fusion:

  - builtin/describe.c had a full lazy_queue wrapper
  - commit.c:pop_most_recent_commit() used peek+replace
  - builtin/show-branch.c:join_revs() used peek+replace

All three now collapse to plain _get() and _put(), with the data
structure handling the fusion internally. This simplifies callers
and means every prio_queue user gets the optimization for free
without needing to implement it manually.

Remove prio_queue_replace() since no external callers remain.
Add prio_queue_size() for callers that need the logical element
count, since the physical nr may temporarily include a
pending-removal element.

Benchmarked on a large monorepo (30 interleaved runs,
paired t-test, Xeon @ 2.20GHz):

Code paths that previously did eager get+put (new optimization):

  Command                       base    patched  change  p
  merge-base --all A A~1000     3828ms  3725ms   -2.69%  0.0001
  rev-list --count A~1000..A    3055ms  2986ms   -2.27%  0.0601
  log --oneline A~1000..A       3408ms  3350ms   -1.71%  0.0482

Code paths that already had manual get+put fusion (expect
neutral - the optimization moves into prio_queue but the number
of heap operations stays the same):

  Command                base   patched  change  p
  show-branch A A~1000   9156ms  9127ms  -0.32%  0.3470
  describe (git.git)     1983ms  1963ms  -1.02%  <0.001

No regressions in any scenario.

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 builtin/describe.c          | 67 +++++++---------------------
 builtin/last-modified.c     |  4 +-
 builtin/show-branch.c       | 17 +++----
 commit-reach.c              |  5 +--
 commit.c                    | 11 +----
 pack-bitmap-write.c         |  4 +-
 prio-queue.c                | 88 ++++++++++++++++++-------------------
 prio-queue.h                | 12 +++--
 revision.c                  |  5 +--
 t/unit-tests/u-prio-queue.c |  6 +--
 walker.c                    |  4 +-
 11 files changed, 85 insertions(+), 138 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 1c47d7c0b7..85564f3487 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -251,56 +251,20 @@ static int compare_pt(const void *a_, const void *b_)
 	return 0;
 }
 
-struct lazy_queue {
-	struct prio_queue queue;
-	bool get_pending;
-};
-
-#define LAZY_QUEUE_INIT { { compare_commits_by_commit_date }, false }
-
-static void *lazy_queue_get(struct lazy_queue *queue)
-{
-	if (queue->get_pending)
-		prio_queue_get(&queue->queue);
-	else
-		queue->get_pending = true;
-	return prio_queue_peek(&queue->queue);
-}
-
-static void lazy_queue_put(struct lazy_queue *queue, void *thing)
-{
-	if (queue->get_pending)
-		prio_queue_replace(&queue->queue, thing);
-	else
-		prio_queue_put(&queue->queue, thing);
-	queue->get_pending = false;
-}
-
-static bool lazy_queue_empty(const struct lazy_queue *queue)
-{
-	return queue->queue.nr == (queue->get_pending ? 1 : 0);
-}
-
-static void lazy_queue_clear(struct lazy_queue *queue)
-{
-	clear_prio_queue(&queue->queue);
-	queue->get_pending = false;
-}
-
-static unsigned long finish_depth_computation(struct lazy_queue *queue,
+static unsigned long finish_depth_computation(struct prio_queue *queue,
 					      struct possible_tag *best)
 {
 	unsigned long seen_commits = 0;
 	struct oidset unflagged = OIDSET_INIT;
+	struct commit *c;
 
-	for (size_t i = queue->get_pending ? 1 : 0; i < queue->queue.nr; i++) {
-		struct commit *commit = queue->queue.array[i].data;
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i].data;
 		if (!(commit->object.flags & best->flag_within))
 			oidset_insert(&unflagged, &commit->object.oid);
 	}
 
-	while (!lazy_queue_empty(queue)) {
-		struct commit *c = lazy_queue_get(queue);
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *parents = c->parents;
 		seen_commits++;
 		if (c->object.flags & best->flag_within) {
@@ -316,7 +280,7 @@ static unsigned long finish_depth_computation(struct lazy_queue *queue,
 			repo_parse_commit(the_repository, p);
 			seen = p->object.flags & SEEN;
 			if (!seen)
-				lazy_queue_put(queue, p);
+				prio_queue_put(queue, p);
 			flag_before = p->object.flags & best->flag_within;
 			p->object.flags |= c->object.flags;
 			flag_after = p->object.flags & best->flag_within;
@@ -364,8 +328,8 @@ static void append_suffix(int depth, const struct object_id *oid, struct strbuf
 
 static void describe_commit(struct commit *cmit, struct strbuf *dst)
 {
-	struct commit *gave_up_on = NULL;
-	struct lazy_queue queue = LAZY_QUEUE_INIT;
+	struct commit *c, *gave_up_on = NULL;
+	struct prio_queue queue = { compare_commits_by_commit_date };
 	struct commit_name *n;
 	struct possible_tag all_matches[MAX_TAGS];
 	unsigned int match_cnt = 0, annotated_cnt = 0, cur_match;
@@ -407,9 +371,8 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	}
 
 	cmit->object.flags = SEEN;
-	lazy_queue_put(&queue, cmit);
-	while (!lazy_queue_empty(&queue)) {
-		struct commit *c = lazy_queue_get(&queue);
+	prio_queue_put(&queue, cmit);
+	while ((c = prio_queue_get(&queue))) {
 		struct commit_list *parents = c->parents;
 		struct commit_name **slot;
 
@@ -443,7 +406,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 				t->depth++;
 		}
 		/* Stop if last remaining path already covered by best candidate(s) */
-		if (annotated_cnt && lazy_queue_empty(&queue)) {
+		if (annotated_cnt && !prio_queue_size(&queue)) {
 			int best_depth = INT_MAX;
 			unsigned best_within = 0;
 			for (cur_match = 0; cur_match < match_cnt; cur_match++) {
@@ -466,7 +429,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			struct commit *p = parents->item;
 			repo_parse_commit(the_repository, p);
 			if (!(p->object.flags & SEEN))
-				lazy_queue_put(&queue, p);
+				prio_queue_put(&queue, p);
 			p->object.flags |= c->object.flags;
 			parents = parents->next;
 
@@ -481,7 +444,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
 			if (suffix)
 				strbuf_addstr(dst, suffix);
-			lazy_queue_clear(&queue);
+			clear_prio_queue(&queue);
 			return;
 		}
 		if (unannotated_cnt)
@@ -497,11 +460,11 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	QSORT(all_matches, match_cnt, compare_pt);
 
 	if (gave_up_on) {
-		lazy_queue_put(&queue, gave_up_on);
+		prio_queue_put(&queue, gave_up_on);
 		seen_commits--;
 	}
 	seen_commits += finish_depth_computation(&queue, &all_matches[0]);
-	lazy_queue_clear(&queue);
+	clear_prio_queue(&queue);
 
 	if (debug) {
 		static int label_width = -1;
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index 8900ceece1..df2a508244 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -344,6 +344,7 @@ static void process_parent(struct last_modified *lm,
 static int last_modified_run(struct last_modified *lm)
 {
 	int max_count, queue_popped = 0;
+	struct commit *c;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue not_queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *list;
@@ -389,10 +390,9 @@ static int last_modified_run(struct last_modified *lm)
 		}
 	}
 
-	while (queue.nr) {
+	while ((c = prio_queue_get(&queue))) {
 		int parent_i;
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(&queue);
 		struct bitmap *active_c = active_paths_for(lm, c);
 
 		if ((0 <= max_count && max_count < ++queue_popped) ||
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index f02831b085..9f7f28f339 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -62,7 +62,7 @@ static const char *get_color_reset_code(void)
 
 static struct commit *interesting(struct prio_queue *queue)
 {
-	for (size_t i = 0; i < queue->nr; i++) {
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
 		struct commit *commit = queue->array[i].data;
 		if (commit->object.flags & UNINTERESTING)
 			continue;
@@ -228,17 +228,18 @@ static void join_revs(struct prio_queue *queue,
 {
 	int all_mask = ((1u << (REV_SHIFT + num_rev)) - 1);
 	int all_revs = all_mask & ~((1u << REV_SHIFT) - 1);
+	struct commit *commit;
 
-	while (queue->nr) {
+	while ((commit = prio_queue_peek(queue))) {
 		struct commit_list *parents;
 		int still_interesting = !!interesting(queue);
-		struct commit *commit = prio_queue_peek(queue);
-		bool get_pending = true;
 		int flags = commit->object.flags & all_mask;
 
 		if (!still_interesting && extra <= 0)
 			break;
 
+		prio_queue_get(queue);
+
 		mark_seen(commit, seen_p);
 		if ((flags & all_revs) == all_revs)
 			flags |= UNINTERESTING;
@@ -254,14 +255,8 @@ static void join_revs(struct prio_queue *queue,
 			if (mark_seen(p, seen_p) && !still_interesting)
 				extra--;
 			p->object.flags |= flags;
-			if (get_pending)
-				prio_queue_replace(queue, p);
-			else
-				prio_queue_put(queue, p);
-			get_pending = false;
+			prio_queue_put(queue, p);
 		}
-		if (get_pending)
-			prio_queue_get(queue);
 	}
 
 	/*
diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..0fec2f00be 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1269,7 +1269,7 @@ int get_branch_base_for_tip(struct repository *r,
 			    size_t bases_nr)
 {
 	int best_index = -1;
-	struct commit *branch_point = NULL;
+	struct commit *c, *branch_point = NULL;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	int found_missing_gen = 0;
 
@@ -1322,8 +1322,7 @@ int get_branch_base_for_tip(struct repository *r,
 		prio_queue_put(&queue, c);
 	}
 
-	while (queue.nr) {
-		struct commit *c = prio_queue_get(&queue);
+	while ((c = prio_queue_get(&queue))) {
 		int best_for_c = get_best(c);
 		int best_for_p, positive;
 		struct commit *parent;
diff --git a/commit.c b/commit.c
index fd8723502e..976bfc4618 100644
--- a/commit.c
+++ b/commit.c
@@ -795,24 +795,17 @@ void commit_list_sort_by_date(struct commit_list **list)
 struct commit *pop_most_recent_commit(struct prio_queue *queue,
 				      unsigned int mark)
 {
-	struct commit *ret = prio_queue_peek(queue);
-	int get_pending = 1;
+	struct commit *ret = prio_queue_get(queue);
 	struct commit_list *parents = ret->parents;
 
 	while (parents) {
 		struct commit *commit = parents->item;
 		if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) {
 			commit->object.flags |= mark;
-			if (get_pending)
-				prio_queue_replace(queue, commit);
-			else
-				prio_queue_put(queue, commit);
-			get_pending = 0;
+			prio_queue_put(queue, commit);
 		}
 		parents = parents->next;
 	}
-	if (get_pending)
-		prio_queue_get(queue);
 	return ret;
 }
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 1c8070f99c..f7c63e3027 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -513,6 +513,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      struct bitmap_index *old_bitmap,
 			      const uint32_t *mapping)
 {
+	struct commit *c;
 	int found;
 	uint32_t pos;
 	if (!ent->bitmap)
@@ -520,9 +521,8 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 
 	prio_queue_put(queue, commit);
 
-	while (queue->nr) {
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(queue);
 
 		if (old_bitmap && mapping) {
 			struct ewah_bitmap *old;
diff --git a/prio-queue.c b/prio-queue.c
index 9748528ce6..a03c617470 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -34,64 +34,69 @@ void clear_prio_queue(struct prio_queue *queue)
 	queue->nr = 0;
 	queue->alloc = 0;
 	queue->insertion_ctr = 0;
+	queue->get_pending = 0;
+}
+
+static void sift_down_root(struct prio_queue *queue)
+{
+	size_t ix, child;
+
+	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
+		child = ix * 2 + 1;
+		if (child + 1 < queue->nr &&
+		    compare(queue, child, child + 1) >= 0)
+			child++;
+		if (compare(queue, ix, child) <= 0)
+			break;
+		swap(queue, child, ix);
+	}
 }
 
 void prio_queue_put(struct prio_queue *queue, void *thing)
 {
 	size_t ix, parent;
 
-	/* Append at the end */
+	if (queue->get_pending) {
+		queue->get_pending = 0;
+		queue->array[0].ctr = queue->insertion_ctr++;
+		queue->array[0].data = thing;
+		sift_down_root(queue);
+		return;
+	}
+
 	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
 	queue->array[queue->nr].ctr = queue->insertion_ctr++;
 	queue->array[queue->nr].data = thing;
 	queue->nr++;
 	if (!queue->compare)
-		return; /* LIFO */
+		return;
 
-	/* Bubble up the new one */
 	for (ix = queue->nr - 1; ix; ix = parent) {
 		parent = (ix - 1) / 2;
 		if (compare(queue, parent, ix) <= 0)
 			break;
-
 		swap(queue, parent, ix);
 	}
 }
 
-static void sift_down_root(struct prio_queue *queue)
-{
-	size_t ix, child;
-
-	/* Push down the one at the root */
-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
-		child = ix * 2 + 1; /* left */
-		if (child + 1 < queue->nr &&
-		    compare(queue, child, child + 1) >= 0)
-			child++; /* use right child */
-
-		if (compare(queue, ix, child) <= 0)
-			break;
-
-		swap(queue, child, ix);
-	}
-}
-
 void *prio_queue_get(struct prio_queue *queue)
 {
-	void *result;
-
 	if (!queue->nr)
 		return NULL;
 	if (!queue->compare)
-		return queue->array[--queue->nr].data; /* LIFO */
-
-	result = queue->array[0].data;
-	if (!--queue->nr)
-		return result;
+		return queue->array[--queue->nr].data;
+
+	if (queue->get_pending) {
+		if (!--queue->nr) {
+			queue->get_pending = 0;
+			return NULL;
+		}
+		queue->array[0] = queue->array[queue->nr];
+		sift_down_root(queue);
+	}
 
-	queue->array[0] = queue->array[queue->nr];
-	sift_down_root(queue);
-	return result;
+	queue->get_pending = 1;
+	return queue->array[0].data;
 }
 
 void *prio_queue_peek(struct prio_queue *queue)
@@ -100,19 +105,14 @@ void *prio_queue_peek(struct prio_queue *queue)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[queue->nr - 1].data;
-	return queue->array[0].data;
-}
 
-void prio_queue_replace(struct prio_queue *queue, void *thing)
-{
-	if (!queue->nr) {
-		prio_queue_put(queue, thing);
-	} else if (!queue->compare) {
-		queue->array[queue->nr - 1].ctr = queue->insertion_ctr++;
-		queue->array[queue->nr - 1].data = thing;
-	} else {
-		queue->array[0].ctr = queue->insertion_ctr++;
-		queue->array[0].data = thing;
+	if (queue->get_pending) {
+		queue->get_pending = 0;
+		if (!--queue->nr)
+			return NULL;
+		queue->array[0] = queue->array[queue->nr];
 		sift_down_root(queue);
 	}
+
+	return queue->array[0].data;
 }
diff --git a/prio-queue.h b/prio-queue.h
index da7fad2f1f..482ab5e71d 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -32,6 +32,7 @@ struct prio_queue {
 	void *cb_data;
 	size_t alloc, nr;
 	struct prio_queue_entry *array;
+	unsigned get_pending;
 };
 
 /*
@@ -52,13 +53,10 @@ void *prio_queue_get(struct prio_queue *);
  */
 void *prio_queue_peek(struct prio_queue *);
 
-/*
- * Replace the "thing" that compares the smallest with a new "thing",
- * like prio_queue_get()+prio_queue_put() would do, but in a more
- * efficient way.  Does the same as prio_queue_put() if the queue is
- * empty.
- */
-void prio_queue_replace(struct prio_queue *queue, void *thing);
+static inline size_t prio_queue_size(struct prio_queue *queue)
+{
+	return queue->nr - queue->get_pending;
+}
 
 void clear_prio_queue(struct prio_queue *);
 
diff --git a/revision.c b/revision.c
index 5693618be4..8ce8ffa43d 100644
--- a/revision.c
+++ b/revision.c
@@ -1446,7 +1446,7 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit *interesting_cache = NULL;
+	struct commit *commit, *interesting_cache = NULL;
 	struct prio_queue queue = { .compare = compare_commits_by_commit_date };
 
 	if (revs->ancestry_path_implicit_bottoms) {
@@ -1461,8 +1461,7 @@ static int limit_list(struct rev_info *revs)
 		prio_queue_put(&queue, commit);
 	}
 
-	while (queue.nr) {
-		struct commit *commit = prio_queue_get(&queue);
+	while ((commit = prio_queue_get(&queue))) {
 		struct object *obj = &commit->object;
 
 		if (commit == interesting_cache)
diff --git a/t/unit-tests/u-prio-queue.c b/t/unit-tests/u-prio-queue.c
index 63e58114ae..af3e0b8598 100644
--- a/t/unit-tests/u-prio-queue.c
+++ b/t/unit-tests/u-prio-queue.c
@@ -53,13 +53,13 @@ static void test_prio_queue(int *input, size_t input_size,
 			prio_queue_reverse(&pq);
 			break;
 		case REPLACE:
-			peek = prio_queue_peek(&pq);
+			get = prio_queue_get(&pq);
 			cl_assert(i + 1 < input_size);
 			cl_assert(input[i + 1] >= 0);
 			cl_assert(j < result_size);
-			cl_assert_equal_i(result[j], show(peek));
+			cl_assert_equal_i(result[j], show(get));
 			j++;
-			prio_queue_replace(&pq, &input[++i]);
+			prio_queue_put(&pq, &input[++i]);
 			break;
 		default:
 			prio_queue_put(&pq, &input[i]);
diff --git a/walker.c b/walker.c
index e98eb6da53..e3de77f092 100644
--- a/walker.c
+++ b/walker.c
@@ -84,12 +84,12 @@ static struct prio_queue complete = { compare_commits_by_commit_date };
 static int process_commit(struct walker *walker, struct commit *commit)
 {
 	struct commit_list *parents;
+	struct commit *item;
 
 	if (repo_parse_commit(the_repository, commit))
 		return -1;
 
-	while (complete.nr) {
-		struct commit *item = prio_queue_peek(&complete);
+	while ((item = prio_queue_peek(&complete))) {
 		if (item->date < commit->date)
 			break;
 		pop_most_recent_commit(&complete, COMPLETE);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson via GitGitGadget @ 2026-06-07 11:43 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson
In-Reply-To: <pull.2140.v2.git.1780772477.gitgitgadget@gmail.com>

Rene's lazy_queue wrapper in describe.c was a clever optimization -- by
deferring the get, a following put becomes a simple replace, avoiding a full
remove-rebalance-insert cycle.

It turns out this pattern is so common in git's traversal code that it makes
sense to fold it into prio_queue itself. Gets and puts are interleaved in
virtually every commit walk, so the fusion is essentially always a win.

This is mostly a code simplification -- three callers had independently
reimplemented the same optimization, and they all collapse to plain get+put
now. The 1.7-2.7% speedup on traversal-heavy workloads is a nice bonus.

More details and benchmark numbers in the commit message.

Related to but independent of the cascade sift-down work in
kk/prio-queue-cascade-sift -- the two can land in either order.

Changes in v3:

 * Adopted Rene's suggestion to move the flush logic below the LIFO
   early-return (LIFO mode never sets get_pending, so flushing there is a
   no-op).

 * Went a step further and inlined the flush logic directly into get() and
   peek(), eliminating the flush_get() helper and its forward declaration of
   sift_down_root().

 * Updated benchmark numbers with more rigorous methodology: 30 interleaved
   runs with paired t-test on an idle server. Split results into code paths
   that already had manual fusion (neutral) vs code paths that benefit from
   the new automatic fusion (1.7-2.7% improvement).

Changes in v2:

 * Added a second commit that renames .nr to .nr_internal so that direct
   access from outside prio-queue.c is a compile error. Verified that after
   the rename, only prio-queue.c references nr_internal.

 * Added prio_queue_for_each() macro for callers that need to walk all
   elements (describe.c, show-branch.c, commit-reach.c, revision.c,
   negotiator/skipping.c).

 * Converted remaining .nr loop conditions to use
   prio_queue_get()/prio_queue_peek() as the loop condition, or
   prio_queue_size() where get/peek isn't suitable.

 * Fixed several callers missed in v1 (object-name.c, fetch-pack.c,
   path-walk.c, pack-bitmap-write.c, negotiator/default.c,
   negotiator/skipping.c, revision.c, builtin/last-modified.c).

Kristofer Karlsson (2):
  prio-queue: fold lazy_queue into prio_queue for automatic get+put
    fusion
  prio-queue: rename .nr to .nr_internal to prevent direct access

 builtin/describe.c          |  70 ++++++------------------
 builtin/last-modified.c     |   7 +--
 builtin/show-branch.c       |  24 +++-----
 commit-reach.c              |  24 ++++----
 commit.c                    |  11 +---
 fetch-pack.c                |   4 +-
 negotiator/default.c        |   4 +-
 negotiator/skipping.c       |  12 ++--
 object-name.c               |   2 +-
 pack-bitmap-write.c         |  10 ++--
 path-walk.c                 |   8 +--
 prio-queue.c                | 106 +++++++++++++++++++-----------------
 prio-queue.h                |  19 ++++---
 revision.c                  |  16 +++---
 t/unit-tests/u-prio-queue.c |   6 +-
 walker.c                    |   4 +-
 16 files changed, 144 insertions(+), 183 deletions(-)


base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2140%2Fspkrka%2Flazy-prio-queue-pr-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2140/spkrka/lazy-prio-queue-pr-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/2140

Range-diff vs v2:

 1:  29af24445e ! 1:  e882206d29 prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
     @@ Commit message
      
          Defer the actual removal in prio_queue_get() until the next
          operation.  If that next operation is a prio_queue_put(), the
     -    removal and insertion are fused into a single replace — writing
     -    the new element at the root and sifting it down — which avoids
     +    removal and insertion are fused into a single replace, writing
     +    the new element at the root and sifting it down which avoids
          a full remove-rebalance-insert cycle.
      
          This matches the dominant usage pattern in git's commit traversal:
     -    get a commit, then put its parents.  The first parent insertion
     +    get a commit, then put its parents. The first parent insertion
          after each get is now a replace operation automatically.
      
          This generalizes the lazy_queue pattern from builtin/describe.c
     -    (introduced in 08bb69d70f) into prio_queue itself.  Three callers
     +    (introduced in 08bb69d70f) into prio_queue itself. Three callers
          independently implemented the same get+put fusion:
      
            - builtin/describe.c had a full lazy_queue wrapper
     -      - commit.c:pop_most_recent_commit() reimplements the same
     -        get_pending flag with peek+replace
     -      - builtin/show-branch.c:join_revs() used the same peek+replace
     -        pattern
     +      - commit.c:pop_most_recent_commit() used peek+replace
     +      - builtin/show-branch.c:join_revs() used peek+replace
      
     -    All three now collapse to plain _get() and _put(),
     -    with the data structure handling the fusion internally.
     +    All three now collapse to plain _get() and _put(), with the data
     +    structure handling the fusion internally. This simplifies callers
     +    and means every prio_queue user gets the optimization for free
     +    without needing to implement it manually.
      
          Remove prio_queue_replace() since no external callers remain.
          Add prio_queue_size() for callers that need the logical element
          count, since the physical nr may temporarily include a
          pending-removal element.
      
     -    Benchmarked on a large monorepo (10-15 interleaved runs, 1 warmup):
     +    Benchmarked on a large monorepo (30 interleaved runs,
     +    paired t-test, Xeon @ 2.20GHz):
      
     -      Command                       base    patched  speedup
     -      merge-base --all A A~1000     3.88s   3.77s    1.03x
     -      rev-list --count A~1000..A    3.57s   3.43s    1.04x
     -      log --oneline A~1000..A       3.70s   3.49s    1.06x
     -      rev-parse :/pattern           365ms   364ms    1.00x
     -      describe HEAD (linux.git)     184ms   190ms    1.00x
     +    Code paths that previously did eager get+put (new optimization):
     +
     +      Command                       base    patched  change  p
     +      merge-base --all A A~1000     3828ms  3725ms   -2.69%  0.0001
     +      rev-list --count A~1000..A    3055ms  2986ms   -2.27%  0.0601
     +      log --oneline A~1000..A       3408ms  3350ms   -1.71%  0.0482
     +
     +    Code paths that already had manual get+put fusion (expect
     +    neutral - the optimization moves into prio_queue but the number
     +    of heap operations stays the same):
     +
     +      Command                base   patched  change  p
     +      show-branch A A~1000   9156ms  9127ms  -0.32%  0.3470
     +      describe (git.git)     1983ms  1963ms  -1.02%  <0.001
      
          No regressions in any scenario.
      
     +    Suggested-by: René Scharfe <l.s.r@web.de>
          Signed-off-by: Kristofer Karlsson <krka@spotify.com>
      
       ## builtin/describe.c ##
     @@ prio-queue.c: void clear_prio_queue(struct prio_queue *queue)
      +	queue->get_pending = 0;
      +}
      +
     -+static void sift_down_root(struct prio_queue *queue);
     -+
     -+static inline void flush_get(struct prio_queue *queue)
     ++static void sift_down_root(struct prio_queue *queue)
      +{
     -+	if (!queue->get_pending)
     -+		return;
     -+	queue->get_pending = 0;
     -+	if (!--queue->nr)
     -+		return;
     -+	queue->array[0] = queue->array[queue->nr];
     -+	sift_down_root(queue);
     ++	size_t ix, child;
     ++
     ++	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
     ++		child = ix * 2 + 1;
     ++		if (child + 1 < queue->nr &&
     ++		    compare(queue, child, child + 1) >= 0)
     ++			child++;
     ++		if (compare(queue, ix, child) <= 0)
     ++			break;
     ++		swap(queue, child, ix);
     ++	}
       }
       
       void prio_queue_put(struct prio_queue *queue, void *thing)
       {
       	size_t ix, parent;
       
     +-	/* Append at the end */
      +	if (queue->get_pending) {
      +		queue->get_pending = 0;
      +		queue->array[0].ctr = queue->insertion_ctr++;
     @@ prio-queue.c: void clear_prio_queue(struct prio_queue *queue)
      +		return;
      +	}
      +
     - 	/* Append at the end */
       	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
       	queue->array[queue->nr].ctr = queue->insertion_ctr++;
     -@@ prio-queue.c: static void sift_down_root(struct prio_queue *queue)
     + 	queue->array[queue->nr].data = thing;
     + 	queue->nr++;
     + 	if (!queue->compare)
     +-		return; /* LIFO */
     ++		return;
     + 
     +-	/* Bubble up the new one */
     + 	for (ix = queue->nr - 1; ix; ix = parent) {
     + 		parent = (ix - 1) / 2;
     + 		if (compare(queue, parent, ix) <= 0)
     + 			break;
     +-
     + 		swap(queue, parent, ix);
     + 	}
     + }
       
     +-static void sift_down_root(struct prio_queue *queue)
     +-{
     +-	size_t ix, child;
     +-
     +-	/* Push down the one at the root */
     +-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
     +-		child = ix * 2 + 1; /* left */
     +-		if (child + 1 < queue->nr &&
     +-		    compare(queue, child, child + 1) >= 0)
     +-			child++; /* use right child */
     +-
     +-		if (compare(queue, ix, child) <= 0)
     +-			break;
     +-
     +-		swap(queue, child, ix);
     +-	}
     +-}
     +-
       void *prio_queue_get(struct prio_queue *queue)
       {
      -	void *result;
     -+	flush_get(queue);
     - 
     +-
       	if (!queue->nr)
       		return NULL;
       	if (!queue->compare)
     - 		return queue->array[--queue->nr].data; /* LIFO */
     - 
     +-		return queue->array[--queue->nr].data; /* LIFO */
     +-
      -	result = queue->array[0].data;
      -	if (!--queue->nr)
      -		return result;
     --
     ++		return queue->array[--queue->nr].data;
     ++
     ++	if (queue->get_pending) {
     ++		if (!--queue->nr) {
     ++			queue->get_pending = 0;
     ++			return NULL;
     ++		}
     ++		queue->array[0] = queue->array[queue->nr];
     ++		sift_down_root(queue);
     ++	}
     + 
      -	queue->array[0] = queue->array[queue->nr];
      -	sift_down_root(queue);
      -	return result;
     @@ prio-queue.c: static void sift_down_root(struct prio_queue *queue)
       }
       
       void *prio_queue_peek(struct prio_queue *queue)
     - {
     -+	flush_get(queue);
     -+
     - 	if (!queue->nr)
     +@@ prio-queue.c: void *prio_queue_peek(struct prio_queue *queue)
       		return NULL;
       	if (!queue->compare)
       		return queue->array[queue->nr - 1].data;
     - 	return queue->array[0].data;
     - }
     --
     +-	return queue->array[0].data;
     +-}
     + 
      -void prio_queue_replace(struct prio_queue *queue, void *thing)
      -{
      -	if (!queue->nr) {
     @@ prio-queue.c: static void sift_down_root(struct prio_queue *queue)
      -	} else {
      -		queue->array[0].ctr = queue->insertion_ctr++;
      -		queue->array[0].data = thing;
     --		sift_down_root(queue);
     --	}
     --}
     ++	if (queue->get_pending) {
     ++		queue->get_pending = 0;
     ++		if (!--queue->nr)
     ++			return NULL;
     ++		queue->array[0] = queue->array[queue->nr];
     + 		sift_down_root(queue);
     + 	}
     ++
     ++	return queue->array[0].data;
     + }
      
       ## prio-queue.h ##
      @@ prio-queue.h: struct prio_queue {
 2:  bb8b0f78f1 ! 2:  033215e304 prio-queue: rename .nr to .nr_internal to prevent direct access
     @@ prio-queue.c: void prio_queue_reverse(struct prio_queue *queue)
       	queue->alloc = 0;
       	queue->insertion_ctr = 0;
       	queue->get_pending = 0;
     -@@ prio-queue.c: static inline void flush_get(struct prio_queue *queue)
     - 	if (!queue->get_pending)
     - 		return;
     - 	queue->get_pending = 0;
     --	if (!--queue->nr)
     -+	if (!--queue->nr_internal)
     - 		return;
     --	queue->array[0] = queue->array[queue->nr];
     -+	queue->array[0] = queue->array[queue->nr_internal];
     - 	sift_down_root(queue);
     - }
     +@@ prio-queue.c: static void sift_down_root(struct prio_queue *queue)
     + {
     + 	size_t ix, child;
       
     +-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
     +-		child = ix * 2 + 1;
     +-		if (child + 1 < queue->nr &&
     ++	/* Push down the one at the root */
     ++	for (ix = 0; ix * 2 + 1 < queue->nr_internal; ix = child) {
     ++		child = ix * 2 + 1; /* left */
     ++		if (child + 1 < queue->nr_internal &&
     + 		    compare(queue, child, child + 1) >= 0)
     +-			child++;
     ++			child++; /* use right child */
     ++
     + 		if (compare(queue, ix, child) <= 0)
     + 			break;
     ++
     + 		swap(queue, child, ix);
     + 	}
     + }
      @@ prio-queue.c: void prio_queue_put(struct prio_queue *queue, void *thing)
     + 		return;
       	}
       
     - 	/* Append at the end */
      -	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
      -	queue->array[queue->nr].ctr = queue->insertion_ctr++;
      -	queue->array[queue->nr].data = thing;
      -	queue->nr++;
     ++	/* Append at the end */
      +	ALLOC_GROW(queue->array, queue->nr_internal + 1, queue->alloc);
      +	queue->array[queue->nr_internal].ctr = queue->insertion_ctr++;
      +	queue->array[queue->nr_internal].data = thing;
      +	queue->nr_internal++;
       	if (!queue->compare)
     - 		return; /* LIFO */
     +-		return;
     ++		return; /* LIFO */
       
     - 	/* Bubble up the new one */
      -	for (ix = queue->nr - 1; ix; ix = parent) {
     ++	/* Bubble up the new one */
      +	for (ix = queue->nr_internal - 1; ix; ix = parent) {
       		parent = (ix - 1) / 2;
       		if (compare(queue, parent, ix) <= 0)
       			break;
     -@@ prio-queue.c: static void sift_down_root(struct prio_queue *queue)
     - 	size_t ix, child;
     - 
     - 	/* Push down the one at the root */
     --	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
     -+	for (ix = 0; ix * 2 + 1 < queue->nr_internal; ix = child) {
     - 		child = ix * 2 + 1; /* left */
     --		if (child + 1 < queue->nr &&
     -+		if (child + 1 < queue->nr_internal &&
     - 		    compare(queue, child, child + 1) >= 0)
     - 			child++; /* use right child */
     ++
     + 		swap(queue, parent, ix);
     + 	}
     + }
       
     -@@ prio-queue.c: void *prio_queue_get(struct prio_queue *queue)
     + void *prio_queue_get(struct prio_queue *queue)
       {
     - 	flush_get(queue);
     - 
      -	if (!queue->nr)
      +	if (!queue->nr_internal)
       		return NULL;
       	if (!queue->compare)
     --		return queue->array[--queue->nr].data; /* LIFO */
     +-		return queue->array[--queue->nr].data;
      +		return queue->array[--queue->nr_internal].data; /* LIFO */
       
     - 	queue->get_pending = 1;
     - 	return queue->array[0].data;
     -@@ prio-queue.c: void *prio_queue_peek(struct prio_queue *queue)
     - {
     - 	flush_get(queue);
     + 	if (queue->get_pending) {
     +-		if (!--queue->nr) {
     ++		if (!--queue->nr_internal) {
     + 			queue->get_pending = 0;
     + 			return NULL;
     + 		}
     +-		queue->array[0] = queue->array[queue->nr];
     ++		queue->array[0] = queue->array[queue->nr_internal];
     + 		sift_down_root(queue);
     + 	}
       
     +@@ prio-queue.c: void *prio_queue_get(struct prio_queue *queue)
     + 
     + void *prio_queue_peek(struct prio_queue *queue)
     + {
      -	if (!queue->nr)
      +	if (!queue->nr_internal)
       		return NULL;
       	if (!queue->compare)
      -		return queue->array[queue->nr - 1].data;
      +		return queue->array[queue->nr_internal - 1].data;
     - 	return queue->array[0].data;
     - }
     + 
     + 	if (queue->get_pending) {
     + 		queue->get_pending = 0;
     +-		if (!--queue->nr)
     ++		if (!--queue->nr_internal)
     + 			return NULL;
     +-		queue->array[0] = queue->array[queue->nr];
     ++		queue->array[0] = queue->array[queue->nr_internal];
     + 		sift_down_root(queue);
     + 	}
     + 
      
       ## prio-queue.h ##
      @@ prio-queue.h: struct prio_queue {

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH v2 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson @ 2026-06-07  9:30 UTC (permalink / raw)
  To: René Scharfe; +Cc: Kristofer Karlsson via GitGitGadget, git
In-Reply-To: <fe20bde6-9e86-4162-9bbd-af4d058e499e@web.de>

On Sun, 7 Jun 2026 at 09:30, René Scharfe <l.s.r@web.de> wrote:
>
> Calling flush_get() later, when we know that we have items and a
> compare function, is cleaner, as we never need it in LIFO mode, and
> is also slightly faster (patch below).

Thanks for the benchmark and the suggestion to move flush_get()
below the LIFO check - that's cleaner since LIFO never sets
get_pending.

One edge case to note: without a second !nr_internal check after
flush_get(), two consecutive get() calls on a single-element queue
will return stale data instead of NULL. I went a step further and
inlined the flush logic directly into get()/peek(), which also
removes the forward declaration.

> Still there's this 1% performance gap to the current code that I
> don't understand.  Do you see it as well?

Yes, I saw a similar trend on my laptop (Core Ultra 7 155U),
but with very high variance - the results were too noisy to be
conclusive even with 20+ runs.
On an idle server (Xeon @ 2.20GHz) with much lower variance, all
three variants (v2 as posted, your patch, and the inlined version)
came out ~1.3% faster than the baseline across 30 interleaved
runs (p < 0.01). So it seems CPU-dependent - possibly branch
prediction or code alignment differences between microarchitectures.

Results from the idle server (30 interleaved runs, paired t-test):

  Variant               Avg       SE  vs baseline           95% CI          p
  -------------- ---------- -------- ------------ ---------------- ----------
  baseline          2002.5ms     9.2ms   (baseline)
  v2-posted         1976.6ms     3.2ms      -1.29%    -41 to -11ms     0.0019
  v2-rene           1977.7ms     3.1ms      -1.24%    -42 to  -8ms     0.0071
  v2-latest         1975.3ms     1.8ms      -1.36%    -46 to  -9ms     0.0069

  baseline:  9ac3f193c0 (The 11th batch)
  v2-posted: v2 as sent to the list
  v2-rene:   v2 + your flush_get patch
  v2-latest: v2 + inlined flush (for v3)

Will send a v3 with the inlined flush shortly.

- Kristofer

^ permalink raw reply

* Re: [PATCH v2 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: René Scharfe @ 2026-06-07  7:30 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git; +Cc: Kristofer Karlsson
In-Reply-To: <pull.2140.v2.git.1780772477.gitgitgadget@gmail.com>

On 6/6/26 9:01 PM, Kristofer Karlsson via GitGitGadget wrote:
> Rene's lazy_queue wrapper in describe.c was a clever optimization -- by
> deferring the get, a following put becomes a simple replace, avoiding a full
> remove-rebalance-insert cycle.
> 
> It turns out this pattern is so common in git's traversal code that it makes
> sense to fold it into prio_queue itself. Gets and puts are interleaved in
> virtually every commit walk, so the fusion is essentially always a win.
> 
> This is mostly a code simplification -- three callers had independently
> reimplemented the same optimization, and they all collapse to plain get+put
> now. The 3-6% speedup on traversal-heavy workloads is a nice bonus.
> 
> More details and benchmark numbers in the commit message. Benchmarks were
> run on next which includes kk/commit-reach-optim -- those results represent
> the more realistic end state.
> 
> Related to but independent of the cascade sift-down work in
> kk/prio-queue-cascade-sift -- the two can land in either order.
> 
> Changes since v1:
> 
>  * Added a second commit that renames .nr to .nr_internal so that direct
>    access from outside prio-queue.c is a compile error. Verified that after
>    the rename, only prio-queue.c references nr_internal.
> 
>  * Added prio_queue_for_each() macro for callers that need to walk all
>    elements (describe.c, show-branch.c, commit-reach.c, revision.c,
>    negotiator/skipping.c).
> 
>  * Converted remaining .nr loop conditions to use
>    prio_queue_get()/prio_queue_peek() as the loop condition, or
>    prio_queue_size() where get/peek isn't suitable.
> 
>  * Fixed several callers missed in v1 (object-name.c, fetch-pack.c,
>    path-walk.c, pack-bitmap-write.c, negotiator/default.c,
>    negotiator/skipping.c, revision.c, builtin/last-modified.c).
> 
> Kristofer Karlsson (2):
>   prio-queue: fold lazy_queue into prio_queue for automatic get+put
>     fusion
>   prio-queue: rename .nr to .nr_internal to prevent direct access
> 
>  builtin/describe.c          | 70 ++++++++-------------------------
>  builtin/last-modified.c     |  7 ++--
>  builtin/show-branch.c       | 24 +++++-------
>  commit-reach.c              | 24 ++++++------
>  commit.c                    | 11 +-----
>  fetch-pack.c                |  4 +-
>  negotiator/default.c        |  4 +-
>  negotiator/skipping.c       | 12 +++---
>  object-name.c               |  2 +-
>  pack-bitmap-write.c         | 10 ++---
>  path-walk.c                 |  8 ++--
>  prio-queue.c                | 77 ++++++++++++++++++++-----------------
>  prio-queue.h                | 19 +++++----
>  revision.c                  | 16 ++++----
>  t/unit-tests/u-prio-queue.c |  6 +--
>  walker.c                    |  4 +-
>  16 files changed, 129 insertions(+), 169 deletions(-)
> 
> 
> base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2140%2Fspkrka%2Flazy-prio-queue-pr-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2140/spkrka/lazy-prio-queue-pr-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/2140
> 
> Range-diff vs v1:
> 
>  1:  29af24445e = 1:  29af24445e prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
>  -:  ---------- > 2:  bb8b0f78f1 prio-queue: rename .nr to .nr_internal to prevent direct access
> 

My earlier attempt in <90270818-c52b-4611-8da2-6cee20628fc2@web.de>
copied the last item to the root and decreased .nr, to allow callers to
scan items and get their count directly.

Checking emptiness by doing the existing calls of prio_queue_peek() and
prio_queue_get() a bit earlier and scanning using a foreach macro are
fine as well and arguably cleaner, at the low cost of having to change
all the callers.

The result is faster than my attempt, but still slower than the current
code in the describe benchmark from 30598ccc4d (describe: use oidset in
finish_depth_computation(), 2025-09-02):

Benchmark 1: ./git_main describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     601.7 ms ±   1.9 ms    [User: 538.6 ms, System: 47.3 ms]
  Range (min … max):   599.3 ms … 606.5 ms    10 runs

Benchmark 2: ./git_auto_replace describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     618.0 ms ±   1.1 ms    [User: 554.5 ms, System: 47.6 ms]
  Range (min … max):   616.7 ms … 620.2 ms    10 runs

Benchmark 3: ./git_fold describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     609.9 ms ±   0.8 ms    [User: 546.7 ms, System: 47.4 ms]
  Range (min … max):   608.8 ms … 611.2 ms    10 runs

Benchmark 4: ./git describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     606.1 ms ±   1.2 ms    [User: 543.7 ms, System: 46.7 ms]
  Range (min … max):   604.7 ms … 609.1 ms    10 runs

Summary
  ./git_main describe $(git rev-list v2.41.0..v2.47.0) ran
    1.01 ± 0.00 times faster than ./git describe $(git rev-list v2.41.0..v2.47.0)
    1.01 ± 0.00 times faster than ./git_fold describe $(git rev-list v2.41.0..v2.47.0)
    1.03 ± 0.00 times faster than ./git_auto_replace describe $(git rev-list v2.41.0..v2.47.0)

git_auto_replace: <90270818-c52b-4611-8da2-6cee20628fc2@web.de> and
  revert of 08bb69d70f (describe: use prio_queue_replace(), 2025-08-03)
git_fold: this series
git: this series and the patch below

My attempt leaves performance on the table by using a bool.  Using
an unsigned for the flag is measurably faster -- but still slower
than your series here.

Calling flush_get() later, when we know that we have items and a
compare function, is cleaner, as we never need it in LIFO mode, and
is also slightly faster (patch below).

Still there's this 1% performance gap to the current code that I
don't understand.  Do you see it as well?

René

---
 prio-queue.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/prio-queue.c b/prio-queue.c
index d11ca6ac36..45709187d3 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -100,24 +100,23 @@ static void sift_down_root(struct prio_queue *queue)
 
 void *prio_queue_get(struct prio_queue *queue)
 {
-	flush_get(queue);
-
 	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[--queue->nr_internal].data; /* LIFO */
 
+	flush_get(queue);
 	queue->get_pending = 1;
 	return queue->array[0].data;
 }
 
 void *prio_queue_peek(struct prio_queue *queue)
 {
-	flush_get(queue);
-
 	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[queue->nr_internal - 1].data;
+
+	flush_get(queue);
 	return queue->array[0].data;
 }


^ permalink raw reply related

* Re: [PATCH v2] prio-queue: use cascade-down for faster extract-min
From: René Scharfe @ 2026-06-07  7:30 UTC (permalink / raw)
  To: Kristofer Karlsson; +Cc: Kristofer Karlsson via GitGitGadget, git
In-Reply-To: <CAL71e4PV-1aDvn1JnweMa3OR1xxB75fWjzJOBvM54KOWqC0stw@mail.gmail.com>

On 6/5/26 10:39 PM, Kristofer Karlsson wrote:
> I did some more benchmarking to understand how these approaches
> interact, with four variants based on origin/next on my large monorepo:
> 
>   1. base: next as-is
>   2. cascade: base + sift_up_rebalance from this patch (v2)
>   3. lazy-fold: base + lazy get fusion folded into prio_queue
>   4. cascade+lazy: both combined
> 
> Note that alt 3 is not yet shared with the mailing list so it's hard for you
> to reason about it, though it's quite straightforward. I will submit a new
> patch for that one soon, not necessarily with the primary goal to merge it,
> but rather show how it is implemented.
> 
>   merge-base --all master master~1000:
>     base            4.27s
>     cascade         4.07s  (1.05x)
>     lazy-fold       4.12s  (1.03x)
>     cascade+lazy    4.01s  (1.06x)
> 
>   rev-list --count master~1000..master:
>     base            3.60s
>     cascade         3.35s  (1.08x)
>     lazy-fold       3.37s  (1.07x)
>     cascade+lazy    3.30s  (1.09x)
> 
> So both optimizations are valuable both on their own, and when combined,
> which I think helps to reason about it. This cascading sift seems to have a
> larger effect, but folding lazy_queue into prio_queue also speeds up other
> use cases and simplifies the code a bit.
Right.  I was wondering, though: Why is sift-down so much faster than
cascade in the describe benchmark from 30598ccc4d (describe: use oidset
in finish_depth_computation(), 2025-09-02)?

I think I mostly understand it now: cascade is better in prio_queue_get()
because the sift-down item is from the bottom and will likely end up back
at the bottom, just of a different branch of the heap.  Thus a sift-down
costs 3 compares times the number of levels, while a cascade costs just
2 compares times the number of levels and there is likely little to no
need to sift it back up.

For prio_queue_replace() we sift down a random item, though; we don't
know where it will end up.  If it belongs at the very top then sift-down
just needs 3 compares, while cascade needs 2 compares times the number
of levels to bring the hole down and the same to bring the item up.

Below is a diff on top of your second cascade patch to use sift-down
only for the root and cascade otherwise.  It comes remarkably close to
the performance of a full sift-down.  I don't know how to find the
optimal number of levels to try sift-down before switching to cascade
for a given random item, though.

So I guess we keep the full sift-down for prio_queue_replace(), knowing
that sometimes we have a lot of items that end up at or close to the
root of the heap.

Benchmark 1: ./git_main describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     602.4 ms ±   1.2 ms    [User: 539.2 ms, System: 47.7 ms]
  Range (min … max):   600.5 ms … 604.7 ms    10 runs

Benchmark 2: ./git_cascade1 describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     993.9 ms ±   1.7 ms    [User: 930.2 ms, System: 48.2 ms]
  Range (min … max):   991.1 ms … 996.6 ms    10 runs

Benchmark 3: ./git_cascade2 describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     602.4 ms ±   1.7 ms    [User: 539.1 ms, System: 47.6 ms]
  Range (min … max):   599.9 ms … 606.2 ms    10 runs

Benchmark 4: ./git describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     625.4 ms ±   1.7 ms    [User: 561.8 ms, System: 48.0 ms]
  Range (min … max):   623.4 ms … 627.9 ms    10 runs

Summary
  ./git_main describe $(git rev-list v2.41.0..v2.47.0) ran
    1.00 ± 0.00 times faster than ./git_cascade2 describe $(git rev-list v2.41.0..v2.47.0)
    1.04 ± 0.00 times faster than ./git describe $(git rev-list v2.41.0..v2.47.0)
    1.65 ± 0.00 times faster than ./git_cascade1 describe $(git rev-list v2.41.0..v2.47.0)

git_main and git_cascade2 (your v2): sift-down
git_cascade1 (your v1): cascade
git (your v2 and the patch below): sift-down for root then cascade

René


diff --git a/prio-queue.c b/prio-queue.c
index 66d445b078..4d7debc2ba 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -58,30 +58,12 @@ void prio_queue_put(struct prio_queue *queue, void *thing)
 	}
 }
 
-static void sift_down_root(struct prio_queue *queue)
+static void sift_up_rebalance(struct prio_queue *queue, size_t ix)
 {
-	size_t ix, child;
-
-	/* Push down the one at the root */
-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
-		child = ix * 2 + 1; /* left */
-		if (child + 1 < queue->nr &&
-		    compare(queue, child, child + 1) >= 0)
-			child++; /* use right child */
-
-		if (compare(queue, ix, child) <= 0)
-			break;
-
-		swap(queue, child, ix);
-	}
-}
-
-static void sift_up_rebalance(struct prio_queue *queue)
-{
-	size_t ix, child;
+	size_t child;
 
 	/* Cascade: promote smaller child at each level. */
-	for (ix = 0; (child = ix * 2 + 1) < queue->nr; ix = child) {
+	for (; (child = ix * 2 + 1) < queue->nr; ix = child) {
 		if (child + 1 < queue->nr &&
 		    compare(queue, child, child + 1) >= 0)
 			child++;
@@ -112,7 +94,7 @@ void *prio_queue_get(struct prio_queue *queue)
 	if (!--queue->nr)
 		return result;
 
-	sift_up_rebalance(queue);
+	sift_up_rebalance(queue, 0);
 	return result;
 }
 
@@ -132,9 +114,20 @@ void prio_queue_replace(struct prio_queue *queue, void *thing)
 	} else if (!queue->compare) {
 		queue->array[queue->nr - 1].ctr = queue->insertion_ctr++;
 		queue->array[queue->nr - 1].data = thing;
+	} else if (queue->nr < 3) {
+		prio_queue_get(queue);
+		prio_queue_put(queue, thing);
 	} else {
-		queue->array[0].ctr = queue->insertion_ctr++;
+		size_t child = compare(queue, 1, 2) <= 0 ? 1 : 2;
+		queue->array[0].ctr = queue->insertion_ctr;
 		queue->array[0].data = thing;
-		sift_down_root(queue);
+		if (compare(queue, 0, child) <= 0) {
+			queue->insertion_ctr++;
+		} else {
+			queue->array[0] = queue->array[child];
+			queue->nr--;
+			sift_up_rebalance(queue, child);
+			prio_queue_put(queue, thing);
+		}
 	}
 }


^ permalink raw reply related

* Re: [PATCH v3] doc: fix typos via codespell
From: Johannes Schindelin @ 2026-06-06 20:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andrew Kreimer, git
In-Reply-To: <xmqqzf1dujtf.fsf@gitster.g>

Hi Junio,

On Sat, 6 Jun 2026, Junio C Hamano wrote:

> Andrew Kreimer <algonell@gmail.com> writes:
> 
> > There are some typos in the documentation, comments, etc.
> > Fix them via codespell.
>
> [...]
> 
> I'll squash the fix-up I already had into [v2] that I have queued,
> which should be sufficient to get to the state this [v3] should have
> been, I think.

The mechanical nature of these fixes explains another issue: One typo fix
touched two test fixtures which might seem harmless at first, but those
fixtures are littered with checksums that relied on the original
(misspelled) form.

Please adopt this follow-up into ak/typofixes:

-- snipsnap --
From 54aa4f7f7adf0c0e02b5463b5f7f64547e80cbce Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 6 Jun 2026 22:09:04 +0200
Subject: [PATCH] svn-test-dumps: restore checksums after the `hapenning` typo
 fix

b8b38eee85 (doc: fix typos via codespell, 2026-05-31) ran codespell
against the entire tree and rewrote `hapenning` to `happening`
inside the body of `t/t9150/svk-merge.dump` and
`t/t9151/svn-mergeinfo.dump`. Both files are Subversion dump
files: each `Node-path:` block embeds `Text-content-md5` /
`Text-content-sha1` for the new content and, on copy operations,
`Text-copy-source-md5` / `Text-copy-source-sha1` for the source
content as observed at the cited revision. None of those
checksums were updated, so loading the dumps with svnadmin 1.14.5
(present in `ubuntu:rolling`'s CI image) fails immediately with
`E200014: Checksum mismatch for '/trunk/Makefile'` and the two
tests stop before any of the assertions they actually exercise can
run. The CI failure has been visible on every `seen`-based
linux-sha256 / linux-reftable build since 2026-06-02 (the first
run that picked up b8b38eee85).

Because `happening` and `hapenning` have the same length, no
header byte counts need updating; only the embedded checksums do.
Recompute the MD5 and SHA1 of every text body in the two dumps,
and for every `Node-copyfrom-path` consult the path's most
recently defined content to refresh the corresponding
`Text-copy-source-md5` / `Text-copy-source-sha1`. After this,
`svnadmin load -q` accepts both dumps cleanly and t9150 and t9151
get past their setup steps.

This commit only touches the two dump files; the typo correction
in their surrounding human-readable comment is preserved.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t9150/svk-merge.dump     | 10 ++++----
 t/t9151/svn-mergeinfo.dump | 48 +++++++++++++++++++-------------------
 2 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/t/t9150/svk-merge.dump b/t/t9150/svk-merge.dump
index 6a8ac81b11e6..3c46afc18a65 100644
--- a/t/t9150/svk-merge.dump
+++ b/t/t9150/svk-merge.dump
@@ -71,7 +71,7 @@ Node-kind: file
 Node-action: add
 Prop-content-length: 10
 Text-content-length: 2401
-Text-content-md5: bfd8ff778d1492dc6758567373176a89
+Text-content-md5: d6a3917748b0c09ad85c2783f1d4dac1
 Content-length: 2411
 
 PROPS-END
@@ -201,7 +201,7 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2465
-Text-content-md5: 16e38d9753b061731650561ce01b1195
+Text-content-md5: 3f413450a7a26596d9e512ee385a9b19
 Content-length: 2465
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -305,7 +305,7 @@ Node-path: trunk/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2521
-Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
+Text-content-md5: 89788781014278d76ff23648b8b08b2d
 Content-length: 2521
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -412,7 +412,7 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2593
-Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
+Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
 Content-length: 2593
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -529,7 +529,7 @@ Node-path: trunk/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2713
-Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
+Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
 Content-length: 2713
 
 # -DCOLLISION_CHECK if you believe that SHA1's
diff --git a/t/t9151/svn-mergeinfo.dump b/t/t9151/svn-mergeinfo.dump
index d5e169563745..ad741400104e 100644
--- a/t/t9151/svn-mergeinfo.dump
+++ b/t/t9151/svn-mergeinfo.dump
@@ -80,8 +80,8 @@ Node-kind: file
 Node-action: add
 Prop-content-length: 10
 Text-content-length: 2401
-Text-content-md5: bfd8ff778d1492dc6758567373176a89
-Text-content-sha1: 103205ce331f7d64086dba497574734f78439590
+Text-content-md5: d6a3917748b0c09ad85c2783f1d4dac1
+Text-content-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
 Content-length: 2411
 
 PROPS-END
@@ -194,8 +194,8 @@ Node-kind: file
 Node-action: add
 Node-copyfrom-rev: 2
 Node-copyfrom-path: trunk/Makefile
-Text-copy-source-md5: bfd8ff778d1492dc6758567373176a89
-Text-copy-source-sha1: 103205ce331f7d64086dba497574734f78439590
+Text-copy-source-md5: d6a3917748b0c09ad85c2783f1d4dac1
+Text-copy-source-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
 
 
 Revision-number: 4
@@ -228,8 +228,8 @@ Node-kind: file
 Node-action: add
 Node-copyfrom-rev: 2
 Node-copyfrom-path: trunk/Makefile
-Text-copy-source-md5: bfd8ff778d1492dc6758567373176a89
-Text-copy-source-sha1: 103205ce331f7d64086dba497574734f78439590
+Text-copy-source-md5: d6a3917748b0c09ad85c2783f1d4dac1
+Text-copy-source-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
 
 
 Revision-number: 5
@@ -254,8 +254,8 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2465
-Text-content-md5: 16e38d9753b061731650561ce01b1195
-Text-content-sha1: 36da4b84ea9b64218ab48171dfc5c48ae025f38b
+Text-content-md5: 3f413450a7a26596d9e512ee385a9b19
+Text-content-sha1: b3cd389d63c5e3af4fe22b7464cf97968662ad1a
 Content-length: 2465
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -359,8 +359,8 @@ Node-path: branches/right/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2521
-Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
-Text-content-sha1: 4f29afd038e52f45acb5ef8c41acfc70062a741a
+Text-content-md5: 89788781014278d76ff23648b8b08b2d
+Text-content-sha1: f52afb2d6230e5a418416b77c3c9ad610edfd202
 Content-length: 2521
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -467,8 +467,8 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2529
-Text-content-md5: f6b197cc3f2e89a83e545d4bb003de73
-Text-content-sha1: 2f656677cfec0bceec85e53036ffb63e25126f8e
+Text-content-md5: abcac8d04eb061b0a3053e359e44a2a0
+Text-content-sha1: 866caf95e04809a5ed897aea41075b24833612ea
 Content-length: 2529
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -572,8 +572,8 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2593
-Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
-Text-content-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
+Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
+Text-content-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
 Content-length: 2593
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -689,8 +689,8 @@ Node-kind: file
 Node-action: add
 Node-copyfrom-rev: 8
 Node-copyfrom-path: branches/left/Makefile
-Text-copy-source-md5: 5ccff689fb290e00b85fe18ee50c54ba
-Text-copy-source-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
+Text-copy-source-md5: 706d73919e6f319a0e624aa50c8b8b38
+Text-copy-source-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
 
 
 
@@ -761,8 +761,8 @@ Node-path: trunk/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2593
-Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
-Text-content-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
+Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
+Text-content-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
 Content-length: 2593
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -942,8 +942,8 @@ Node-path: trunk/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2713
-Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
-Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
+Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
+Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
 Content-length: 2713
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -1166,8 +1166,8 @@ Node-path: branches/left-sub/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2713
-Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
-Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
+Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
+Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
 Content-length: 2713
 
 # -DCOLLISION_CHECK if you believe that SHA1's
@@ -1408,8 +1408,8 @@ Node-path: branches/left/Makefile
 Node-kind: file
 Node-action: change
 Text-content-length: 2713
-Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
-Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
+Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
+Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
 Content-length: 2713
 
 # -DCOLLISION_CHECK if you believe that SHA1's
-- 
2.54.0.windows.1.10.gd5b8d9bb7af0


^ permalink raw reply related

* [PATCH v2 2/2] prio-queue: rename .nr to .nr_internal to prevent direct access
From: Kristofer Karlsson via GitGitGadget @ 2026-06-06 19:01 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2140.v2.git.1780772477.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

Rename the .nr member to .nr_internal so that callers outside
prio-queue.c that directly reference .nr get a compilation error.
This catches both existing misuse and future in-flight topics.

Add prio_queue_for_each() macro for callers that need to walk all
elements in the queue, accounting for the get_pending offset.

Convert all external .nr users:
 - Loop conditions: use prio_queue_size(), prio_queue_get(), or
   prio_queue_peek() as the loop condition
 - Array iterations: use prio_queue_for_each()

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 builtin/describe.c      |  7 +++----
 builtin/last-modified.c |  5 ++---
 builtin/show-branch.c   |  9 ++++-----
 commit-reach.c          | 19 +++++++++++--------
 fetch-pack.c            |  4 ++--
 negotiator/default.c    |  4 +++-
 negotiator/skipping.c   | 12 +++++++-----
 object-name.c           |  2 +-
 pack-bitmap-write.c     |  6 +++---
 path-walk.c             |  8 ++++----
 prio-queue.c            | 32 ++++++++++++++++----------------
 prio-queue.h            |  9 +++++++--
 revision.c              | 11 +++++------
 13 files changed, 68 insertions(+), 60 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 85564f3487..64424543ef 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -258,10 +258,9 @@ static unsigned long finish_depth_computation(struct prio_queue *queue,
 	struct oidset unflagged = OIDSET_INIT;
 	struct commit *c;
 
-	for (size_t i = queue->get_pending; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
-		if (!(commit->object.flags & best->flag_within))
-			oidset_insert(&unflagged, &commit->object.oid);
+	prio_queue_for_each(queue, c) {
+		if (!(c->object.flags & best->flag_within))
+			oidset_insert(&unflagged, &c->object.oid);
 	}
 
 	while ((c = prio_queue_get(queue))) {
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index df2a508244..5478182f2e 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -344,7 +344,7 @@ static void process_parent(struct last_modified *lm,
 static int last_modified_run(struct last_modified *lm)
 {
 	int max_count, queue_popped = 0;
-	struct commit *c;
+	struct commit *c, *n;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue not_queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *list;
@@ -416,9 +416,8 @@ static int last_modified_run(struct last_modified *lm)
 		 */
 		repo_parse_commit(lm->rev.repo, c);
 
-		while (not_queue.nr) {
+		while ((n = prio_queue_get(&not_queue))) {
 			struct commit_list *np;
-			struct commit *n = prio_queue_get(&not_queue);
 
 			repo_parse_commit(lm->rev.repo, n);
 
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 9f7f28f339..2435e8aeda 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -62,11 +62,10 @@ static const char *get_color_reset_code(void)
 
 static struct commit *interesting(struct prio_queue *queue)
 {
-	for (size_t i = queue->get_pending; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
-		if (commit->object.flags & UNINTERESTING)
-			continue;
-		return commit;
+	struct commit *commit;
+	prio_queue_for_each(queue, commit) {
+		if (!(commit->object.flags & UNINTERESTING))
+			return commit;
 	}
 	return NULL;
 }
diff --git a/commit-reach.c b/commit-reach.c
index 0fec2f00be..dfe6016cb2 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -41,8 +41,8 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
 
 static int queue_has_nonstale(struct prio_queue *queue)
 {
-	for (size_t i = 0; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
+	struct commit *commit;
+	prio_queue_for_each(queue, commit) {
 		if (!(commit->object.flags & STALE))
 			return 1;
 	}
@@ -1069,6 +1069,7 @@ void ahead_behind(struct repository *r,
 		  struct commit **commits, size_t commits_nr,
 		  struct ahead_behind_count *counts, size_t counts_nr)
 {
+	struct commit *c;
 	struct prio_queue queue = { .compare = compare_commits_by_gen_then_commit_date };
 	size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
 
@@ -1085,17 +1086,19 @@ void ahead_behind(struct repository *r,
 	init_bit_arrays(&bit_arrays);
 
 	for (size_t i = 0; i < commits_nr; i++) {
-		struct commit *c = commits[i];
-		struct bitmap *bitmap = get_bit_array(c, width);
+		struct bitmap *bitmap;
+		c = commits[i];
+		bitmap = get_bit_array(c, width);
 
 		bitmap_set(bitmap, i);
 		insert_no_dup(&queue, c);
 	}
 
 	while (queue_has_nonstale(&queue)) {
-		struct commit *c = prio_queue_get(&queue);
 		struct commit_list *p;
-		struct bitmap *bitmap_c = get_bit_array(c, width);
+		struct bitmap *bitmap_c;
+		c = prio_queue_get(&queue);
+		bitmap_c = get_bit_array(c, width);
 
 		for (size_t i = 0; i < counts_nr; i++) {
 			int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
@@ -1135,8 +1138,8 @@ void ahead_behind(struct repository *r,
 
 	/* STALE is used here, PARENT2 is used by insert_no_dup(). */
 	repo_clear_commit_marks(r, PARENT2 | STALE);
-	for (size_t i = 0; i < queue.nr; i++)
-		free_bit_array(queue.array[i].data);
+	prio_queue_for_each(&queue, c)
+		free_bit_array(c);
 	clear_bit_arrays(&bit_arrays);
 	clear_prio_queue(&queue);
 }
diff --git a/fetch-pack.c b/fetch-pack.c
index 120e01f3cf..29c41132ee 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -662,8 +662,8 @@ static int mark_complete_oid(const struct reference *ref, void *cb_data UNUSED)
 static void mark_recent_complete_commits(struct fetch_pack_args *args,
 					 timestamp_t cutoff)
 {
-	while (complete.nr) {
-		struct commit *item = prio_queue_peek(&complete);
+	struct commit *item;
+	while ((item = prio_queue_peek(&complete))) {
 		if (item->date < cutoff)
 			break;
 		print_verbose(args, _("Marking %s as complete"),
diff --git a/negotiator/default.c b/negotiator/default.c
index 78d58d57ce..19cdf3808c 100644
--- a/negotiator/default.c
+++ b/negotiator/default.c
@@ -113,10 +113,12 @@ static const struct object_id *get_rev(struct negotiation_state *ns)
 		unsigned int mark;
 		struct commit_list *parents;
 
-		if (ns->rev_list.nr == 0 || ns->non_common_revs == 0)
+		if (ns->non_common_revs == 0)
 			return NULL;
 
 		commit = prio_queue_get(&ns->rev_list);
+		if (!commit)
+			return NULL;
 		repo_parse_commit(the_repository, commit);
 		parents = commit->parents;
 
diff --git a/negotiator/skipping.c b/negotiator/skipping.c
index 68c9b3b997..db90fa77b5 100644
--- a/negotiator/skipping.c
+++ b/negotiator/skipping.c
@@ -143,8 +143,7 @@ static int push_parent(struct data *data, struct entry *entry,
 		/*
 		 * Find the existing entry and use it.
 		 */
-		for (size_t i = 0; i < data->rev_list.nr; i++) {
-			parent_entry = data->rev_list.array[i].data;
+		prio_queue_for_each(&data->rev_list, parent_entry) {
 			if (parent_entry->commit == to_push)
 				goto parent_found;
 		}
@@ -181,10 +180,12 @@ static const struct object_id *get_rev(struct data *data)
 		struct commit_list *p;
 		int parent_pushed = 0;
 
-		if (data->rev_list.nr == 0 || data->non_common_revs == 0)
+		if (data->non_common_revs == 0)
 			return NULL;
 
 		entry = prio_queue_get(&data->rev_list);
+		if (!entry)
+			return NULL;
 		commit = entry->commit;
 		commit->object.flags |= POPPED;
 		if (!(commit->object.flags & COMMON))
@@ -253,8 +254,9 @@ static void have_sent(struct fetch_negotiator *n, struct commit *c)
 static void release(struct fetch_negotiator *n)
 {
 	struct data *data = n->data;
-	for (size_t i = 0; i < data->rev_list.nr; i++)
-		free(data->rev_list.array[i].data);
+	void *entry;
+	prio_queue_for_each(&data->rev_list, entry)
+		free(entry);
 	clear_prio_queue(&data->rev_list);
 	FREE_AND_NULL(data);
 }
diff --git a/object-name.c b/object-name.c
index 9ac86f19c7..2fedfe1761 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1208,7 +1208,7 @@ static int get_oid_oneline(struct repository *r,
 		l->item->object.flags |= ONELINE_SEEN;
 		prio_queue_put(&copy, l->item);
 	}
-	while (copy.nr) {
+	while (prio_queue_size(&copy)) {
 		const char *p, *buf;
 		struct commit *commit;
 		int matches;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index f7c63e3027..ed9714b135 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -514,6 +514,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      const uint32_t *mapping)
 {
 	struct commit *c;
+	struct tree *tree;
 	int found;
 	uint32_t pos;
 	if (!ent->bitmap)
@@ -574,9 +575,8 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 		}
 	}
 
-	while (tree_queue->nr) {
-		if (fill_bitmap_tree(writer, ent->bitmap,
-				     prio_queue_get(tree_queue)) < 0)
+	while ((tree = prio_queue_get(tree_queue))) {
+		if (fill_bitmap_tree(writer, ent->bitmap, tree) < 0)
 			return -1;
 	}
 	return 0;
diff --git a/path-walk.c b/path-walk.c
index 94ff90bd15..cf3b2d0765 100644
--- a/path-walk.c
+++ b/path-walk.c
@@ -699,6 +699,7 @@ int walk_objects_by_path(struct path_walk_info *info)
 	int ret;
 	size_t commits_nr = 0, paths_nr = 0;
 	struct commit *c;
+	char *path;
 	struct type_and_oid_list *root_tree_list;
 	struct type_and_oid_list *commit_list;
 	struct path_walk_context ctx = {
@@ -808,8 +809,7 @@ int walk_objects_by_path(struct path_walk_info *info)
 	free(commit_list);
 
 	trace2_region_enter("path-walk", "path-walk", info->revs->repo);
-	while (!ret && ctx.path_stack.nr) {
-		char *path = prio_queue_get(&ctx.path_stack);
+	while (!ret && (path = prio_queue_get(&ctx.path_stack))) {
 		paths_nr++;
 
 		ret = walk_path(&ctx, path);
@@ -821,12 +821,12 @@ int walk_objects_by_path(struct path_walk_info *info)
 	if (!strmap_empty(&ctx.paths_to_lists)) {
 		struct hashmap_iter iter;
 		struct strmap_entry *entry;
+		char *path;
 
 		strmap_for_each_entry(&ctx.paths_to_lists, &iter, entry)
 			push_to_stack(&ctx, entry->key);
 
-		while (!ret && ctx.path_stack.nr) {
-			char *path = prio_queue_get(&ctx.path_stack);
+		while (!ret && (path = prio_queue_get(&ctx.path_stack))) {
 			paths_nr++;
 
 			ret = walk_path(&ctx, path);
diff --git a/prio-queue.c b/prio-queue.c
index 1407f2f801..d11ca6ac36 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -22,16 +22,16 @@ void prio_queue_reverse(struct prio_queue *queue)
 
 	if (queue->compare)
 		BUG("prio_queue_reverse() on non-LIFO queue");
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return;
-	for (i = 0; i < (j = (queue->nr - 1) - i); i++)
+	for (i = 0; i < (j = (queue->nr_internal - 1) - i); i++)
 		swap(queue, i, j);
 }
 
 void clear_prio_queue(struct prio_queue *queue)
 {
 	FREE_AND_NULL(queue->array);
-	queue->nr = 0;
+	queue->nr_internal = 0;
 	queue->alloc = 0;
 	queue->insertion_ctr = 0;
 	queue->get_pending = 0;
@@ -44,9 +44,9 @@ static inline void flush_get(struct prio_queue *queue)
 	if (!queue->get_pending)
 		return;
 	queue->get_pending = 0;
-	if (!--queue->nr)
+	if (!--queue->nr_internal)
 		return;
-	queue->array[0] = queue->array[queue->nr];
+	queue->array[0] = queue->array[queue->nr_internal];
 	sift_down_root(queue);
 }
 
@@ -63,15 +63,15 @@ void prio_queue_put(struct prio_queue *queue, void *thing)
 	}
 
 	/* Append at the end */
-	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
-	queue->array[queue->nr].ctr = queue->insertion_ctr++;
-	queue->array[queue->nr].data = thing;
-	queue->nr++;
+	ALLOC_GROW(queue->array, queue->nr_internal + 1, queue->alloc);
+	queue->array[queue->nr_internal].ctr = queue->insertion_ctr++;
+	queue->array[queue->nr_internal].data = thing;
+	queue->nr_internal++;
 	if (!queue->compare)
 		return; /* LIFO */
 
 	/* Bubble up the new one */
-	for (ix = queue->nr - 1; ix; ix = parent) {
+	for (ix = queue->nr_internal - 1; ix; ix = parent) {
 		parent = (ix - 1) / 2;
 		if (compare(queue, parent, ix) <= 0)
 			break;
@@ -85,9 +85,9 @@ static void sift_down_root(struct prio_queue *queue)
 	size_t ix, child;
 
 	/* Push down the one at the root */
-	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
+	for (ix = 0; ix * 2 + 1 < queue->nr_internal; ix = child) {
 		child = ix * 2 + 1; /* left */
-		if (child + 1 < queue->nr &&
+		if (child + 1 < queue->nr_internal &&
 		    compare(queue, child, child + 1) >= 0)
 			child++; /* use right child */
 
@@ -102,10 +102,10 @@ void *prio_queue_get(struct prio_queue *queue)
 {
 	flush_get(queue);
 
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
-		return queue->array[--queue->nr].data; /* LIFO */
+		return queue->array[--queue->nr_internal].data; /* LIFO */
 
 	queue->get_pending = 1;
 	return queue->array[0].data;
@@ -115,9 +115,9 @@ void *prio_queue_peek(struct prio_queue *queue)
 {
 	flush_get(queue);
 
-	if (!queue->nr)
+	if (!queue->nr_internal)
 		return NULL;
 	if (!queue->compare)
-		return queue->array[queue->nr - 1].data;
+		return queue->array[queue->nr_internal - 1].data;
 	return queue->array[0].data;
 }
diff --git a/prio-queue.h b/prio-queue.h
index 482ab5e71d..f08ab87691 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -30,7 +30,7 @@ struct prio_queue {
 	prio_queue_compare_fn compare;
 	size_t insertion_ctr;
 	void *cb_data;
-	size_t alloc, nr;
+	size_t alloc, nr_internal; /* use prio_queue_size() for logical count */
 	struct prio_queue_entry *array;
 	unsigned get_pending;
 };
@@ -55,9 +55,14 @@ void *prio_queue_peek(struct prio_queue *);
 
 static inline size_t prio_queue_size(struct prio_queue *queue)
 {
-	return queue->nr - queue->get_pending;
+	return queue->nr_internal - queue->get_pending;
 }
 
+#define prio_queue_for_each(queue, it) \
+	for (size_t pq_ix_ = (queue)->get_pending; \
+	     pq_ix_ < (queue)->nr_internal && ((it) = (queue)->array[pq_ix_].data, 1); \
+	     pq_ix_++)
+
 void clear_prio_queue(struct prio_queue *);
 
 /* Reverse the LIFO elements */
diff --git a/revision.c b/revision.c
index 8ce8ffa43d..34e2d146f4 100644
--- a/revision.c
+++ b/revision.c
@@ -476,16 +476,15 @@ static struct commit *handle_commit(struct rev_info *revs,
 static int everybody_uninteresting(struct prio_queue *orig,
 				   struct commit **interesting_cache)
 {
-	size_t i;
+	struct commit *commit;
 
 	if (*interesting_cache) {
-		struct commit *commit = *interesting_cache;
+		commit = *interesting_cache;
 		if (!(commit->object.flags & UNINTERESTING))
 			return 0;
 	}
 
-	for (i = 0; i < orig->nr; i++) {
-		struct commit *commit = orig->array[i].data;
+	prio_queue_for_each(orig, commit) {
 		if (commit->object.flags & UNINTERESTING)
 			continue;
 
@@ -4027,8 +4026,8 @@ static enum rewrite_result rewrite_one_1(struct rev_info *revs,
 
 static void merge_queue_into_list(struct prio_queue *q, struct commit_list **list)
 {
-	while (q->nr) {
-		struct commit *item = prio_queue_peek(q);
+	struct commit *item;
+	while ((item = prio_queue_peek(q))) {
 		struct commit_list *p = *list;
 
 		if (p && p->item->date >= item->date)
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 1/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson via GitGitGadget @ 2026-06-06 19:01 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2140.v2.git.1780772477.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

Defer the actual removal in prio_queue_get() until the next
operation.  If that next operation is a prio_queue_put(), the
removal and insertion are fused into a single replace — writing
the new element at the root and sifting it down — which avoids
a full remove-rebalance-insert cycle.

This matches the dominant usage pattern in git's commit traversal:
get a commit, then put its parents.  The first parent insertion
after each get is now a replace operation automatically.

This generalizes the lazy_queue pattern from builtin/describe.c
(introduced in 08bb69d70f) into prio_queue itself.  Three callers
independently implemented the same get+put fusion:

  - builtin/describe.c had a full lazy_queue wrapper
  - commit.c:pop_most_recent_commit() reimplements the same
    get_pending flag with peek+replace
  - builtin/show-branch.c:join_revs() used the same peek+replace
    pattern

All three now collapse to plain _get() and _put(),
with the data structure handling the fusion internally.

Remove prio_queue_replace() since no external callers remain.
Add prio_queue_size() for callers that need the logical element
count, since the physical nr may temporarily include a
pending-removal element.

Benchmarked on a large monorepo (10-15 interleaved runs, 1 warmup):

  Command                       base    patched  speedup
  merge-base --all A A~1000     3.88s   3.77s    1.03x
  rev-list --count A~1000..A    3.57s   3.43s    1.04x
  log --oneline A~1000..A       3.70s   3.49s    1.06x
  rev-parse :/pattern           365ms   364ms    1.00x
  describe HEAD (linux.git)     184ms   190ms    1.00x

No regressions in any scenario.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 builtin/describe.c          | 67 +++++++++----------------------------
 builtin/last-modified.c     |  4 +--
 builtin/show-branch.c       | 17 ++++------
 commit-reach.c              |  5 ++-
 commit.c                    | 11 ++----
 pack-bitmap-write.c         |  4 +--
 prio-queue.c                | 49 +++++++++++++++------------
 prio-queue.h                | 12 +++----
 revision.c                  |  5 ++-
 t/unit-tests/u-prio-queue.c |  6 ++--
 walker.c                    |  4 +--
 11 files changed, 68 insertions(+), 116 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 1c47d7c0b7..85564f3487 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -251,56 +251,20 @@ static int compare_pt(const void *a_, const void *b_)
 	return 0;
 }
 
-struct lazy_queue {
-	struct prio_queue queue;
-	bool get_pending;
-};
-
-#define LAZY_QUEUE_INIT { { compare_commits_by_commit_date }, false }
-
-static void *lazy_queue_get(struct lazy_queue *queue)
-{
-	if (queue->get_pending)
-		prio_queue_get(&queue->queue);
-	else
-		queue->get_pending = true;
-	return prio_queue_peek(&queue->queue);
-}
-
-static void lazy_queue_put(struct lazy_queue *queue, void *thing)
-{
-	if (queue->get_pending)
-		prio_queue_replace(&queue->queue, thing);
-	else
-		prio_queue_put(&queue->queue, thing);
-	queue->get_pending = false;
-}
-
-static bool lazy_queue_empty(const struct lazy_queue *queue)
-{
-	return queue->queue.nr == (queue->get_pending ? 1 : 0);
-}
-
-static void lazy_queue_clear(struct lazy_queue *queue)
-{
-	clear_prio_queue(&queue->queue);
-	queue->get_pending = false;
-}
-
-static unsigned long finish_depth_computation(struct lazy_queue *queue,
+static unsigned long finish_depth_computation(struct prio_queue *queue,
 					      struct possible_tag *best)
 {
 	unsigned long seen_commits = 0;
 	struct oidset unflagged = OIDSET_INIT;
+	struct commit *c;
 
-	for (size_t i = queue->get_pending ? 1 : 0; i < queue->queue.nr; i++) {
-		struct commit *commit = queue->queue.array[i].data;
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i].data;
 		if (!(commit->object.flags & best->flag_within))
 			oidset_insert(&unflagged, &commit->object.oid);
 	}
 
-	while (!lazy_queue_empty(queue)) {
-		struct commit *c = lazy_queue_get(queue);
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *parents = c->parents;
 		seen_commits++;
 		if (c->object.flags & best->flag_within) {
@@ -316,7 +280,7 @@ static unsigned long finish_depth_computation(struct lazy_queue *queue,
 			repo_parse_commit(the_repository, p);
 			seen = p->object.flags & SEEN;
 			if (!seen)
-				lazy_queue_put(queue, p);
+				prio_queue_put(queue, p);
 			flag_before = p->object.flags & best->flag_within;
 			p->object.flags |= c->object.flags;
 			flag_after = p->object.flags & best->flag_within;
@@ -364,8 +328,8 @@ static void append_suffix(int depth, const struct object_id *oid, struct strbuf
 
 static void describe_commit(struct commit *cmit, struct strbuf *dst)
 {
-	struct commit *gave_up_on = NULL;
-	struct lazy_queue queue = LAZY_QUEUE_INIT;
+	struct commit *c, *gave_up_on = NULL;
+	struct prio_queue queue = { compare_commits_by_commit_date };
 	struct commit_name *n;
 	struct possible_tag all_matches[MAX_TAGS];
 	unsigned int match_cnt = 0, annotated_cnt = 0, cur_match;
@@ -407,9 +371,8 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	}
 
 	cmit->object.flags = SEEN;
-	lazy_queue_put(&queue, cmit);
-	while (!lazy_queue_empty(&queue)) {
-		struct commit *c = lazy_queue_get(&queue);
+	prio_queue_put(&queue, cmit);
+	while ((c = prio_queue_get(&queue))) {
 		struct commit_list *parents = c->parents;
 		struct commit_name **slot;
 
@@ -443,7 +406,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 				t->depth++;
 		}
 		/* Stop if last remaining path already covered by best candidate(s) */
-		if (annotated_cnt && lazy_queue_empty(&queue)) {
+		if (annotated_cnt && !prio_queue_size(&queue)) {
 			int best_depth = INT_MAX;
 			unsigned best_within = 0;
 			for (cur_match = 0; cur_match < match_cnt; cur_match++) {
@@ -466,7 +429,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			struct commit *p = parents->item;
 			repo_parse_commit(the_repository, p);
 			if (!(p->object.flags & SEEN))
-				lazy_queue_put(&queue, p);
+				prio_queue_put(&queue, p);
 			p->object.flags |= c->object.flags;
 			parents = parents->next;
 
@@ -481,7 +444,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
 			if (suffix)
 				strbuf_addstr(dst, suffix);
-			lazy_queue_clear(&queue);
+			clear_prio_queue(&queue);
 			return;
 		}
 		if (unannotated_cnt)
@@ -497,11 +460,11 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	QSORT(all_matches, match_cnt, compare_pt);
 
 	if (gave_up_on) {
-		lazy_queue_put(&queue, gave_up_on);
+		prio_queue_put(&queue, gave_up_on);
 		seen_commits--;
 	}
 	seen_commits += finish_depth_computation(&queue, &all_matches[0]);
-	lazy_queue_clear(&queue);
+	clear_prio_queue(&queue);
 
 	if (debug) {
 		static int label_width = -1;
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index 8900ceece1..df2a508244 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -344,6 +344,7 @@ static void process_parent(struct last_modified *lm,
 static int last_modified_run(struct last_modified *lm)
 {
 	int max_count, queue_popped = 0;
+	struct commit *c;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue not_queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *list;
@@ -389,10 +390,9 @@ static int last_modified_run(struct last_modified *lm)
 		}
 	}
 
-	while (queue.nr) {
+	while ((c = prio_queue_get(&queue))) {
 		int parent_i;
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(&queue);
 		struct bitmap *active_c = active_paths_for(lm, c);
 
 		if ((0 <= max_count && max_count < ++queue_popped) ||
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index f02831b085..9f7f28f339 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -62,7 +62,7 @@ static const char *get_color_reset_code(void)
 
 static struct commit *interesting(struct prio_queue *queue)
 {
-	for (size_t i = 0; i < queue->nr; i++) {
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
 		struct commit *commit = queue->array[i].data;
 		if (commit->object.flags & UNINTERESTING)
 			continue;
@@ -228,17 +228,18 @@ static void join_revs(struct prio_queue *queue,
 {
 	int all_mask = ((1u << (REV_SHIFT + num_rev)) - 1);
 	int all_revs = all_mask & ~((1u << REV_SHIFT) - 1);
+	struct commit *commit;
 
-	while (queue->nr) {
+	while ((commit = prio_queue_peek(queue))) {
 		struct commit_list *parents;
 		int still_interesting = !!interesting(queue);
-		struct commit *commit = prio_queue_peek(queue);
-		bool get_pending = true;
 		int flags = commit->object.flags & all_mask;
 
 		if (!still_interesting && extra <= 0)
 			break;
 
+		prio_queue_get(queue);
+
 		mark_seen(commit, seen_p);
 		if ((flags & all_revs) == all_revs)
 			flags |= UNINTERESTING;
@@ -254,14 +255,8 @@ static void join_revs(struct prio_queue *queue,
 			if (mark_seen(p, seen_p) && !still_interesting)
 				extra--;
 			p->object.flags |= flags;
-			if (get_pending)
-				prio_queue_replace(queue, p);
-			else
-				prio_queue_put(queue, p);
-			get_pending = false;
+			prio_queue_put(queue, p);
 		}
-		if (get_pending)
-			prio_queue_get(queue);
 	}
 
 	/*
diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..0fec2f00be 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1269,7 +1269,7 @@ int get_branch_base_for_tip(struct repository *r,
 			    size_t bases_nr)
 {
 	int best_index = -1;
-	struct commit *branch_point = NULL;
+	struct commit *c, *branch_point = NULL;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	int found_missing_gen = 0;
 
@@ -1322,8 +1322,7 @@ int get_branch_base_for_tip(struct repository *r,
 		prio_queue_put(&queue, c);
 	}
 
-	while (queue.nr) {
-		struct commit *c = prio_queue_get(&queue);
+	while ((c = prio_queue_get(&queue))) {
 		int best_for_c = get_best(c);
 		int best_for_p, positive;
 		struct commit *parent;
diff --git a/commit.c b/commit.c
index fd8723502e..976bfc4618 100644
--- a/commit.c
+++ b/commit.c
@@ -795,24 +795,17 @@ void commit_list_sort_by_date(struct commit_list **list)
 struct commit *pop_most_recent_commit(struct prio_queue *queue,
 				      unsigned int mark)
 {
-	struct commit *ret = prio_queue_peek(queue);
-	int get_pending = 1;
+	struct commit *ret = prio_queue_get(queue);
 	struct commit_list *parents = ret->parents;
 
 	while (parents) {
 		struct commit *commit = parents->item;
 		if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) {
 			commit->object.flags |= mark;
-			if (get_pending)
-				prio_queue_replace(queue, commit);
-			else
-				prio_queue_put(queue, commit);
-			get_pending = 0;
+			prio_queue_put(queue, commit);
 		}
 		parents = parents->next;
 	}
-	if (get_pending)
-		prio_queue_get(queue);
 	return ret;
 }
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 1c8070f99c..f7c63e3027 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -513,6 +513,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      struct bitmap_index *old_bitmap,
 			      const uint32_t *mapping)
 {
+	struct commit *c;
 	int found;
 	uint32_t pos;
 	if (!ent->bitmap)
@@ -520,9 +521,8 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 
 	prio_queue_put(queue, commit);
 
-	while (queue->nr) {
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(queue);
 
 		if (old_bitmap && mapping) {
 			struct ewah_bitmap *old;
diff --git a/prio-queue.c b/prio-queue.c
index 9748528ce6..1407f2f801 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -34,12 +34,34 @@ void clear_prio_queue(struct prio_queue *queue)
 	queue->nr = 0;
 	queue->alloc = 0;
 	queue->insertion_ctr = 0;
+	queue->get_pending = 0;
+}
+
+static void sift_down_root(struct prio_queue *queue);
+
+static inline void flush_get(struct prio_queue *queue)
+{
+	if (!queue->get_pending)
+		return;
+	queue->get_pending = 0;
+	if (!--queue->nr)
+		return;
+	queue->array[0] = queue->array[queue->nr];
+	sift_down_root(queue);
 }
 
 void prio_queue_put(struct prio_queue *queue, void *thing)
 {
 	size_t ix, parent;
 
+	if (queue->get_pending) {
+		queue->get_pending = 0;
+		queue->array[0].ctr = queue->insertion_ctr++;
+		queue->array[0].data = thing;
+		sift_down_root(queue);
+		return;
+	}
+
 	/* Append at the end */
 	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
 	queue->array[queue->nr].ctr = queue->insertion_ctr++;
@@ -78,41 +100,24 @@ static void sift_down_root(struct prio_queue *queue)
 
 void *prio_queue_get(struct prio_queue *queue)
 {
-	void *result;
+	flush_get(queue);
 
 	if (!queue->nr)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[--queue->nr].data; /* LIFO */
 
-	result = queue->array[0].data;
-	if (!--queue->nr)
-		return result;
-
-	queue->array[0] = queue->array[queue->nr];
-	sift_down_root(queue);
-	return result;
+	queue->get_pending = 1;
+	return queue->array[0].data;
 }
 
 void *prio_queue_peek(struct prio_queue *queue)
 {
+	flush_get(queue);
+
 	if (!queue->nr)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[queue->nr - 1].data;
 	return queue->array[0].data;
 }
-
-void prio_queue_replace(struct prio_queue *queue, void *thing)
-{
-	if (!queue->nr) {
-		prio_queue_put(queue, thing);
-	} else if (!queue->compare) {
-		queue->array[queue->nr - 1].ctr = queue->insertion_ctr++;
-		queue->array[queue->nr - 1].data = thing;
-	} else {
-		queue->array[0].ctr = queue->insertion_ctr++;
-		queue->array[0].data = thing;
-		sift_down_root(queue);
-	}
-}
diff --git a/prio-queue.h b/prio-queue.h
index da7fad2f1f..482ab5e71d 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -32,6 +32,7 @@ struct prio_queue {
 	void *cb_data;
 	size_t alloc, nr;
 	struct prio_queue_entry *array;
+	unsigned get_pending;
 };
 
 /*
@@ -52,13 +53,10 @@ void *prio_queue_get(struct prio_queue *);
  */
 void *prio_queue_peek(struct prio_queue *);
 
-/*
- * Replace the "thing" that compares the smallest with a new "thing",
- * like prio_queue_get()+prio_queue_put() would do, but in a more
- * efficient way.  Does the same as prio_queue_put() if the queue is
- * empty.
- */
-void prio_queue_replace(struct prio_queue *queue, void *thing);
+static inline size_t prio_queue_size(struct prio_queue *queue)
+{
+	return queue->nr - queue->get_pending;
+}
 
 void clear_prio_queue(struct prio_queue *);
 
diff --git a/revision.c b/revision.c
index 5693618be4..8ce8ffa43d 100644
--- a/revision.c
+++ b/revision.c
@@ -1446,7 +1446,7 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit *interesting_cache = NULL;
+	struct commit *commit, *interesting_cache = NULL;
 	struct prio_queue queue = { .compare = compare_commits_by_commit_date };
 
 	if (revs->ancestry_path_implicit_bottoms) {
@@ -1461,8 +1461,7 @@ static int limit_list(struct rev_info *revs)
 		prio_queue_put(&queue, commit);
 	}
 
-	while (queue.nr) {
-		struct commit *commit = prio_queue_get(&queue);
+	while ((commit = prio_queue_get(&queue))) {
 		struct object *obj = &commit->object;
 
 		if (commit == interesting_cache)
diff --git a/t/unit-tests/u-prio-queue.c b/t/unit-tests/u-prio-queue.c
index 63e58114ae..af3e0b8598 100644
--- a/t/unit-tests/u-prio-queue.c
+++ b/t/unit-tests/u-prio-queue.c
@@ -53,13 +53,13 @@ static void test_prio_queue(int *input, size_t input_size,
 			prio_queue_reverse(&pq);
 			break;
 		case REPLACE:
-			peek = prio_queue_peek(&pq);
+			get = prio_queue_get(&pq);
 			cl_assert(i + 1 < input_size);
 			cl_assert(input[i + 1] >= 0);
 			cl_assert(j < result_size);
-			cl_assert_equal_i(result[j], show(peek));
+			cl_assert_equal_i(result[j], show(get));
 			j++;
-			prio_queue_replace(&pq, &input[++i]);
+			prio_queue_put(&pq, &input[++i]);
 			break;
 		default:
 			prio_queue_put(&pq, &input[i]);
diff --git a/walker.c b/walker.c
index e98eb6da53..e3de77f092 100644
--- a/walker.c
+++ b/walker.c
@@ -84,12 +84,12 @@ static struct prio_queue complete = { compare_commits_by_commit_date };
 static int process_commit(struct walker *walker, struct commit *commit)
 {
 	struct commit_list *parents;
+	struct commit *item;
 
 	if (repo_parse_commit(the_repository, commit))
 		return -1;
 
-	while (complete.nr) {
-		struct commit *item = prio_queue_peek(&complete);
+	while ((item = prio_queue_peek(&complete))) {
 		if (item->date < commit->date)
 			break;
 		pop_most_recent_commit(&complete, COMPLETE);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson via GitGitGadget @ 2026-06-06 19:01 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson
In-Reply-To: <pull.2140.git.1780757885582.gitgitgadget@gmail.com>

Rene's lazy_queue wrapper in describe.c was a clever optimization -- by
deferring the get, a following put becomes a simple replace, avoiding a full
remove-rebalance-insert cycle.

It turns out this pattern is so common in git's traversal code that it makes
sense to fold it into prio_queue itself. Gets and puts are interleaved in
virtually every commit walk, so the fusion is essentially always a win.

This is mostly a code simplification -- three callers had independently
reimplemented the same optimization, and they all collapse to plain get+put
now. The 3-6% speedup on traversal-heavy workloads is a nice bonus.

More details and benchmark numbers in the commit message. Benchmarks were
run on next which includes kk/commit-reach-optim -- those results represent
the more realistic end state.

Related to but independent of the cascade sift-down work in
kk/prio-queue-cascade-sift -- the two can land in either order.

Changes since v1:

 * Added a second commit that renames .nr to .nr_internal so that direct
   access from outside prio-queue.c is a compile error. Verified that after
   the rename, only prio-queue.c references nr_internal.

 * Added prio_queue_for_each() macro for callers that need to walk all
   elements (describe.c, show-branch.c, commit-reach.c, revision.c,
   negotiator/skipping.c).

 * Converted remaining .nr loop conditions to use
   prio_queue_get()/prio_queue_peek() as the loop condition, or
   prio_queue_size() where get/peek isn't suitable.

 * Fixed several callers missed in v1 (object-name.c, fetch-pack.c,
   path-walk.c, pack-bitmap-write.c, negotiator/default.c,
   negotiator/skipping.c, revision.c, builtin/last-modified.c).

Kristofer Karlsson (2):
  prio-queue: fold lazy_queue into prio_queue for automatic get+put
    fusion
  prio-queue: rename .nr to .nr_internal to prevent direct access

 builtin/describe.c          | 70 ++++++++-------------------------
 builtin/last-modified.c     |  7 ++--
 builtin/show-branch.c       | 24 +++++-------
 commit-reach.c              | 24 ++++++------
 commit.c                    | 11 +-----
 fetch-pack.c                |  4 +-
 negotiator/default.c        |  4 +-
 negotiator/skipping.c       | 12 +++---
 object-name.c               |  2 +-
 pack-bitmap-write.c         | 10 ++---
 path-walk.c                 |  8 ++--
 prio-queue.c                | 77 ++++++++++++++++++++-----------------
 prio-queue.h                | 19 +++++----
 revision.c                  | 16 ++++----
 t/unit-tests/u-prio-queue.c |  6 +--
 walker.c                    |  4 +-
 16 files changed, 129 insertions(+), 169 deletions(-)


base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2140%2Fspkrka%2Flazy-prio-queue-pr-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2140/spkrka/lazy-prio-queue-pr-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2140

Range-diff vs v1:

 1:  29af24445e = 1:  29af24445e prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
 -:  ---------- > 2:  bb8b0f78f1 prio-queue: rename .nr to .nr_internal to prevent direct access

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson @ 2026-06-06 17:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Kristofer Karlsson via GitGitGadget, git, René Scharfe
In-Reply-To: <xmqqqzmjbpfp.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> How can we be sure that all such users of prio_queue has been
> converted?  Are direct references to .nr member, outside of the
> prio-queue.c implementation, all now suspect?

You're right, and the patch is thus broken in its current state.

I did a rename of .nr to ._nr on the branch and rebuilt -- that
immediately found several callers I missed:

 - object-name.c: get_oid_oneline()
   (like you also found)
 - fetch-pack.c: mark_recent_complete_commits()
 - builtin/last-modified.c: last_modified_run()
 - path-walk.c: walk_objects_by_path()
 - commit-reach.c: queue_has_nonstale()

The describe.c and show-branch.c callers already compensate for
get_pending in their iteration bounds, but they still reach into
.nr directly.

> Perhaps the member should be renamed to catch in-flight topics
> that add more users of prio-queue that peek into the .nr member,
> or something like that.

Agreed, that's the right fix. I looked for existing ways of marking
fields as private, internal or hidden but the only thing I found was
the convention of using a code comment: /* for internal use only */

I will apply a rename and submit a v2. Perhaps something like
nr_internal to make it look less like a public API.

^ permalink raw reply

* Re: [PATCH] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Junio C Hamano @ 2026-06-06 16:31 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget
  Cc: git, René Scharfe, Kristofer Karlsson
In-Reply-To: <pull.2140.git.1780757885582.gitgitgadget@gmail.com>

"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Add prio_queue_size() for callers that need the logical element
> count, since the physical nr may temporarily include a
> pending-removal element.

Many code paths used to learn how many elements it logically has by
directly peeking into .nr member of the prio_queue struct.  Now they
should call this new helper function, and you converted some in this
patch.

How can we be sure that all such users of prio_queue has been
converted?  Are direct references to .nr member, outside of the
prio-queue.c implementation, all now suspect?

For example, object-name.c:get_oid_oneline() uses a prio-queue
"copy", and loops "while (copy.nr)".  In the loop, it calls
pop_most_recent_commit(), which does a get followed by put of its
parents.  If the get become hanging (e.g., root commit, causing no
_put() performed in pop_most_recent_commit()), would copy.nr still
remain 1 but logically no elements remain in the queue.

There seem to be other direct peeking of .nr member remaining in the
code.  Perhaps the member should be renamed to catch in-flight
topics that add more users of prio-queue that peek into the .nr
member, or something like that.

^ permalink raw reply

* [PATCH] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Kristofer Karlsson via GitGitGadget @ 2026-06-06 14:58 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, Kristofer Karlsson, Kristofer Karlsson

From: Kristofer Karlsson <krka@spotify.com>

Defer the actual removal in prio_queue_get() until the next
operation.  If that next operation is a prio_queue_put(), the
removal and insertion are fused into a single replace — writing
the new element at the root and sifting it down — which avoids
a full remove-rebalance-insert cycle.

This matches the dominant usage pattern in git's commit traversal:
get a commit, then put its parents.  The first parent insertion
after each get is now a replace operation automatically.

This generalizes the lazy_queue pattern from builtin/describe.c
(introduced in 08bb69d70f) into prio_queue itself.  Three callers
independently implemented the same get+put fusion:

  - builtin/describe.c had a full lazy_queue wrapper
  - commit.c:pop_most_recent_commit() reimplements the same
    get_pending flag with peek+replace
  - builtin/show-branch.c:join_revs() used the same peek+replace
    pattern

All three now collapse to plain _get() and _put(),
with the data structure handling the fusion internally.

Remove prio_queue_replace() since no external callers remain.
Add prio_queue_size() for callers that need the logical element
count, since the physical nr may temporarily include a
pending-removal element.

Benchmarked on a large monorepo (10-15 interleaved runs, 1 warmup):

  Command                       base    patched  speedup
  merge-base --all A A~1000     3.88s   3.77s    1.03x
  rev-list --count A~1000..A    3.57s   3.43s    1.04x
  log --oneline A~1000..A       3.70s   3.49s    1.06x
  rev-parse :/pattern           365ms   364ms    1.00x
  describe HEAD (linux.git)     184ms   190ms    1.00x

No regressions in any scenario.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
    prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
    
    Rene's lazy_queue wrapper in describe.c was a clever optimization -- by
    deferring the get, a following put becomes a simple replace, avoiding a
    full remove-rebalance-insert cycle.
    
    It turns out this pattern is so common in git's traversal code that it
    makes sense to fold it into prio_queue itself. Gets and puts are
    interleaved in virtually every commit walk, so the fusion is essentially
    always a win.
    
    This is mostly a code simplification -- three callers had independently
    reimplemented the same optimization, and they all collapse to plain
    get+put now. The 3-6% speedup on traversal-heavy workloads is a nice
    bonus.
    
    More details and benchmark numbers in the commit message. Benchmarks
    were run on next which includes kk/commit-reach-optim -- those results
    represent the more realistic end state.
    
    Related to but independent of the cascade sift-down work in
    kk/prio-queue-cascade-sift -- the two can land in either order.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2140%2Fspkrka%2Flazy-prio-queue-pr-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2140/spkrka/lazy-prio-queue-pr-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2140

 builtin/describe.c          | 67 +++++++++----------------------------
 builtin/last-modified.c     |  4 +--
 builtin/show-branch.c       | 17 ++++------
 commit-reach.c              |  5 ++-
 commit.c                    | 11 ++----
 pack-bitmap-write.c         |  4 +--
 prio-queue.c                | 49 +++++++++++++++------------
 prio-queue.h                | 12 +++----
 revision.c                  |  5 ++-
 t/unit-tests/u-prio-queue.c |  6 ++--
 walker.c                    |  4 +--
 11 files changed, 68 insertions(+), 116 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 1c47d7c0b7..85564f3487 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -251,56 +251,20 @@ static int compare_pt(const void *a_, const void *b_)
 	return 0;
 }
 
-struct lazy_queue {
-	struct prio_queue queue;
-	bool get_pending;
-};
-
-#define LAZY_QUEUE_INIT { { compare_commits_by_commit_date }, false }
-
-static void *lazy_queue_get(struct lazy_queue *queue)
-{
-	if (queue->get_pending)
-		prio_queue_get(&queue->queue);
-	else
-		queue->get_pending = true;
-	return prio_queue_peek(&queue->queue);
-}
-
-static void lazy_queue_put(struct lazy_queue *queue, void *thing)
-{
-	if (queue->get_pending)
-		prio_queue_replace(&queue->queue, thing);
-	else
-		prio_queue_put(&queue->queue, thing);
-	queue->get_pending = false;
-}
-
-static bool lazy_queue_empty(const struct lazy_queue *queue)
-{
-	return queue->queue.nr == (queue->get_pending ? 1 : 0);
-}
-
-static void lazy_queue_clear(struct lazy_queue *queue)
-{
-	clear_prio_queue(&queue->queue);
-	queue->get_pending = false;
-}
-
-static unsigned long finish_depth_computation(struct lazy_queue *queue,
+static unsigned long finish_depth_computation(struct prio_queue *queue,
 					      struct possible_tag *best)
 {
 	unsigned long seen_commits = 0;
 	struct oidset unflagged = OIDSET_INIT;
+	struct commit *c;
 
-	for (size_t i = queue->get_pending ? 1 : 0; i < queue->queue.nr; i++) {
-		struct commit *commit = queue->queue.array[i].data;
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i].data;
 		if (!(commit->object.flags & best->flag_within))
 			oidset_insert(&unflagged, &commit->object.oid);
 	}
 
-	while (!lazy_queue_empty(queue)) {
-		struct commit *c = lazy_queue_get(queue);
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *parents = c->parents;
 		seen_commits++;
 		if (c->object.flags & best->flag_within) {
@@ -316,7 +280,7 @@ static unsigned long finish_depth_computation(struct lazy_queue *queue,
 			repo_parse_commit(the_repository, p);
 			seen = p->object.flags & SEEN;
 			if (!seen)
-				lazy_queue_put(queue, p);
+				prio_queue_put(queue, p);
 			flag_before = p->object.flags & best->flag_within;
 			p->object.flags |= c->object.flags;
 			flag_after = p->object.flags & best->flag_within;
@@ -364,8 +328,8 @@ static void append_suffix(int depth, const struct object_id *oid, struct strbuf
 
 static void describe_commit(struct commit *cmit, struct strbuf *dst)
 {
-	struct commit *gave_up_on = NULL;
-	struct lazy_queue queue = LAZY_QUEUE_INIT;
+	struct commit *c, *gave_up_on = NULL;
+	struct prio_queue queue = { compare_commits_by_commit_date };
 	struct commit_name *n;
 	struct possible_tag all_matches[MAX_TAGS];
 	unsigned int match_cnt = 0, annotated_cnt = 0, cur_match;
@@ -407,9 +371,8 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	}
 
 	cmit->object.flags = SEEN;
-	lazy_queue_put(&queue, cmit);
-	while (!lazy_queue_empty(&queue)) {
-		struct commit *c = lazy_queue_get(&queue);
+	prio_queue_put(&queue, cmit);
+	while ((c = prio_queue_get(&queue))) {
 		struct commit_list *parents = c->parents;
 		struct commit_name **slot;
 
@@ -443,7 +406,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 				t->depth++;
 		}
 		/* Stop if last remaining path already covered by best candidate(s) */
-		if (annotated_cnt && lazy_queue_empty(&queue)) {
+		if (annotated_cnt && !prio_queue_size(&queue)) {
 			int best_depth = INT_MAX;
 			unsigned best_within = 0;
 			for (cur_match = 0; cur_match < match_cnt; cur_match++) {
@@ -466,7 +429,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			struct commit *p = parents->item;
 			repo_parse_commit(the_repository, p);
 			if (!(p->object.flags & SEEN))
-				lazy_queue_put(&queue, p);
+				prio_queue_put(&queue, p);
 			p->object.flags |= c->object.flags;
 			parents = parents->next;
 
@@ -481,7 +444,7 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 			strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
 			if (suffix)
 				strbuf_addstr(dst, suffix);
-			lazy_queue_clear(&queue);
+			clear_prio_queue(&queue);
 			return;
 		}
 		if (unannotated_cnt)
@@ -497,11 +460,11 @@ static void describe_commit(struct commit *cmit, struct strbuf *dst)
 	QSORT(all_matches, match_cnt, compare_pt);
 
 	if (gave_up_on) {
-		lazy_queue_put(&queue, gave_up_on);
+		prio_queue_put(&queue, gave_up_on);
 		seen_commits--;
 	}
 	seen_commits += finish_depth_computation(&queue, &all_matches[0]);
-	lazy_queue_clear(&queue);
+	clear_prio_queue(&queue);
 
 	if (debug) {
 		static int label_width = -1;
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index 8900ceece1..df2a508244 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -344,6 +344,7 @@ static void process_parent(struct last_modified *lm,
 static int last_modified_run(struct last_modified *lm)
 {
 	int max_count, queue_popped = 0;
+	struct commit *c;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue not_queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *list;
@@ -389,10 +390,9 @@ static int last_modified_run(struct last_modified *lm)
 		}
 	}
 
-	while (queue.nr) {
+	while ((c = prio_queue_get(&queue))) {
 		int parent_i;
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(&queue);
 		struct bitmap *active_c = active_paths_for(lm, c);
 
 		if ((0 <= max_count && max_count < ++queue_popped) ||
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index f02831b085..9f7f28f339 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -62,7 +62,7 @@ static const char *get_color_reset_code(void)
 
 static struct commit *interesting(struct prio_queue *queue)
 {
-	for (size_t i = 0; i < queue->nr; i++) {
+	for (size_t i = queue->get_pending; i < queue->nr; i++) {
 		struct commit *commit = queue->array[i].data;
 		if (commit->object.flags & UNINTERESTING)
 			continue;
@@ -228,17 +228,18 @@ static void join_revs(struct prio_queue *queue,
 {
 	int all_mask = ((1u << (REV_SHIFT + num_rev)) - 1);
 	int all_revs = all_mask & ~((1u << REV_SHIFT) - 1);
+	struct commit *commit;
 
-	while (queue->nr) {
+	while ((commit = prio_queue_peek(queue))) {
 		struct commit_list *parents;
 		int still_interesting = !!interesting(queue);
-		struct commit *commit = prio_queue_peek(queue);
-		bool get_pending = true;
 		int flags = commit->object.flags & all_mask;
 
 		if (!still_interesting && extra <= 0)
 			break;
 
+		prio_queue_get(queue);
+
 		mark_seen(commit, seen_p);
 		if ((flags & all_revs) == all_revs)
 			flags |= UNINTERESTING;
@@ -254,14 +255,8 @@ static void join_revs(struct prio_queue *queue,
 			if (mark_seen(p, seen_p) && !still_interesting)
 				extra--;
 			p->object.flags |= flags;
-			if (get_pending)
-				prio_queue_replace(queue, p);
-			else
-				prio_queue_put(queue, p);
-			get_pending = false;
+			prio_queue_put(queue, p);
 		}
-		if (get_pending)
-			prio_queue_get(queue);
 	}
 
 	/*
diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..0fec2f00be 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1269,7 +1269,7 @@ int get_branch_base_for_tip(struct repository *r,
 			    size_t bases_nr)
 {
 	int best_index = -1;
-	struct commit *branch_point = NULL;
+	struct commit *c, *branch_point = NULL;
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	int found_missing_gen = 0;
 
@@ -1322,8 +1322,7 @@ int get_branch_base_for_tip(struct repository *r,
 		prio_queue_put(&queue, c);
 	}
 
-	while (queue.nr) {
-		struct commit *c = prio_queue_get(&queue);
+	while ((c = prio_queue_get(&queue))) {
 		int best_for_c = get_best(c);
 		int best_for_p, positive;
 		struct commit *parent;
diff --git a/commit.c b/commit.c
index fd8723502e..976bfc4618 100644
--- a/commit.c
+++ b/commit.c
@@ -795,24 +795,17 @@ void commit_list_sort_by_date(struct commit_list **list)
 struct commit *pop_most_recent_commit(struct prio_queue *queue,
 				      unsigned int mark)
 {
-	struct commit *ret = prio_queue_peek(queue);
-	int get_pending = 1;
+	struct commit *ret = prio_queue_get(queue);
 	struct commit_list *parents = ret->parents;
 
 	while (parents) {
 		struct commit *commit = parents->item;
 		if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) {
 			commit->object.flags |= mark;
-			if (get_pending)
-				prio_queue_replace(queue, commit);
-			else
-				prio_queue_put(queue, commit);
-			get_pending = 0;
+			prio_queue_put(queue, commit);
 		}
 		parents = parents->next;
 	}
-	if (get_pending)
-		prio_queue_get(queue);
 	return ret;
 }
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 1c8070f99c..f7c63e3027 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -513,6 +513,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      struct bitmap_index *old_bitmap,
 			      const uint32_t *mapping)
 {
+	struct commit *c;
 	int found;
 	uint32_t pos;
 	if (!ent->bitmap)
@@ -520,9 +521,8 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 
 	prio_queue_put(queue, commit);
 
-	while (queue->nr) {
+	while ((c = prio_queue_get(queue))) {
 		struct commit_list *p;
-		struct commit *c = prio_queue_get(queue);
 
 		if (old_bitmap && mapping) {
 			struct ewah_bitmap *old;
diff --git a/prio-queue.c b/prio-queue.c
index 9748528ce6..1407f2f801 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -34,12 +34,34 @@ void clear_prio_queue(struct prio_queue *queue)
 	queue->nr = 0;
 	queue->alloc = 0;
 	queue->insertion_ctr = 0;
+	queue->get_pending = 0;
+}
+
+static void sift_down_root(struct prio_queue *queue);
+
+static inline void flush_get(struct prio_queue *queue)
+{
+	if (!queue->get_pending)
+		return;
+	queue->get_pending = 0;
+	if (!--queue->nr)
+		return;
+	queue->array[0] = queue->array[queue->nr];
+	sift_down_root(queue);
 }
 
 void prio_queue_put(struct prio_queue *queue, void *thing)
 {
 	size_t ix, parent;
 
+	if (queue->get_pending) {
+		queue->get_pending = 0;
+		queue->array[0].ctr = queue->insertion_ctr++;
+		queue->array[0].data = thing;
+		sift_down_root(queue);
+		return;
+	}
+
 	/* Append at the end */
 	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
 	queue->array[queue->nr].ctr = queue->insertion_ctr++;
@@ -78,41 +100,24 @@ static void sift_down_root(struct prio_queue *queue)
 
 void *prio_queue_get(struct prio_queue *queue)
 {
-	void *result;
+	flush_get(queue);
 
 	if (!queue->nr)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[--queue->nr].data; /* LIFO */
 
-	result = queue->array[0].data;
-	if (!--queue->nr)
-		return result;
-
-	queue->array[0] = queue->array[queue->nr];
-	sift_down_root(queue);
-	return result;
+	queue->get_pending = 1;
+	return queue->array[0].data;
 }
 
 void *prio_queue_peek(struct prio_queue *queue)
 {
+	flush_get(queue);
+
 	if (!queue->nr)
 		return NULL;
 	if (!queue->compare)
 		return queue->array[queue->nr - 1].data;
 	return queue->array[0].data;
 }
-
-void prio_queue_replace(struct prio_queue *queue, void *thing)
-{
-	if (!queue->nr) {
-		prio_queue_put(queue, thing);
-	} else if (!queue->compare) {
-		queue->array[queue->nr - 1].ctr = queue->insertion_ctr++;
-		queue->array[queue->nr - 1].data = thing;
-	} else {
-		queue->array[0].ctr = queue->insertion_ctr++;
-		queue->array[0].data = thing;
-		sift_down_root(queue);
-	}
-}
diff --git a/prio-queue.h b/prio-queue.h
index da7fad2f1f..482ab5e71d 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -32,6 +32,7 @@ struct prio_queue {
 	void *cb_data;
 	size_t alloc, nr;
 	struct prio_queue_entry *array;
+	unsigned get_pending;
 };
 
 /*
@@ -52,13 +53,10 @@ void *prio_queue_get(struct prio_queue *);
  */
 void *prio_queue_peek(struct prio_queue *);
 
-/*
- * Replace the "thing" that compares the smallest with a new "thing",
- * like prio_queue_get()+prio_queue_put() would do, but in a more
- * efficient way.  Does the same as prio_queue_put() if the queue is
- * empty.
- */
-void prio_queue_replace(struct prio_queue *queue, void *thing);
+static inline size_t prio_queue_size(struct prio_queue *queue)
+{
+	return queue->nr - queue->get_pending;
+}
 
 void clear_prio_queue(struct prio_queue *);
 
diff --git a/revision.c b/revision.c
index 5693618be4..8ce8ffa43d 100644
--- a/revision.c
+++ b/revision.c
@@ -1446,7 +1446,7 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit *interesting_cache = NULL;
+	struct commit *commit, *interesting_cache = NULL;
 	struct prio_queue queue = { .compare = compare_commits_by_commit_date };
 
 	if (revs->ancestry_path_implicit_bottoms) {
@@ -1461,8 +1461,7 @@ static int limit_list(struct rev_info *revs)
 		prio_queue_put(&queue, commit);
 	}
 
-	while (queue.nr) {
-		struct commit *commit = prio_queue_get(&queue);
+	while ((commit = prio_queue_get(&queue))) {
 		struct object *obj = &commit->object;
 
 		if (commit == interesting_cache)
diff --git a/t/unit-tests/u-prio-queue.c b/t/unit-tests/u-prio-queue.c
index 63e58114ae..af3e0b8598 100644
--- a/t/unit-tests/u-prio-queue.c
+++ b/t/unit-tests/u-prio-queue.c
@@ -53,13 +53,13 @@ static void test_prio_queue(int *input, size_t input_size,
 			prio_queue_reverse(&pq);
 			break;
 		case REPLACE:
-			peek = prio_queue_peek(&pq);
+			get = prio_queue_get(&pq);
 			cl_assert(i + 1 < input_size);
 			cl_assert(input[i + 1] >= 0);
 			cl_assert(j < result_size);
-			cl_assert_equal_i(result[j], show(peek));
+			cl_assert_equal_i(result[j], show(get));
 			j++;
-			prio_queue_replace(&pq, &input[++i]);
+			prio_queue_put(&pq, &input[++i]);
 			break;
 		default:
 			prio_queue_put(&pq, &input[i]);
diff --git a/walker.c b/walker.c
index e98eb6da53..e3de77f092 100644
--- a/walker.c
+++ b/walker.c
@@ -84,12 +84,12 @@ static struct prio_queue complete = { compare_commits_by_commit_date };
 static int process_commit(struct walker *walker, struct commit *commit)
 {
 	struct commit_list *parents;
+	struct commit *item;
 
 	if (repo_parse_commit(the_repository, commit))
 		return -1;
 
-	while (complete.nr) {
-		struct commit *item = prio_queue_peek(&complete);
+	while ((item = prio_queue_peek(&complete))) {
 		if (item->date < commit->date)
 			break;
 		pop_most_recent_commit(&complete, COMPLETE);

base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v1 0/1] environment: move protect_hfs and protect_ntfs
From: Tian Yuchen @ 2026-06-06 14:34 UTC (permalink / raw)
  To: git
  Cc: christian, phillip.wood123, Tian Yuchen, Christian Couder,
	Ayush Chandekar, Olamide Caleb Bello
In-Reply-To: <20260606143412.15443-1-cat@malon.dev>

Hi everyone,

This series continues the ongoing libification effort by moving the
global **filesystem** variables, protect_hfs and protect_ntfs, into
struct repo_config_values.

Place them within the **per-repository** configuration structure
aligns with our goal of removing global states.

RFC Questions:

1. Should we keep PROTECT_HFS_DEFAULT and PROTECT_NTFS_DEFAULT
in repo_config_values_init()?

	void repo_config_values_init(struct repo_config_values *cfg)
		{
			cfg->attributes_file = NULL;
			cfg->apply_sparse_checkout = 0;
			cfg->protect_hfs = PROTECT_HFS_DEFAULT;
			cfg->protect_ntfs = PROTECT_NTFS_DEFAULT;
			cfg->branch_track = BRANCH_TRACK_REMOTE;
		}

Or is it better if they are used anywhere other than in environment.c?

If so... 
2. Is it worth introducing a Macro or Getter for safe access?

	((the_repository->gitdir ? repo_config_values(the_repository)->protect_hfs : 0))

The current approach looks verbose and lacks readability, and
hard-coded 0 and 1 are used as fallback values. I wonder if a macro or a
getter could be introduced, for example... 

	#define SAFE_PROTECT_HFS(repo) \
		(((repo) && (repo)->gitdir && (repo) == the_repository) ? \ 
		repo_config_values(repo)->protect_hfs : PROTECT_HFS_DEFAULT)

...to improve the coding style a bit. Although I am aware that introducing
new macros is generally frowned upon, I would still like to know which
parts this might make difficult to maintain.

3. Note that Derrick attempted to use get_int_config_global to wrap
this kind of Filesystem Level global variables. This approach bypassed
struct repository, did not actually eliminate global state, and the
reviewer politely rejected it. Nevertheless, I am still curious as
to whether this approach might still be inspiring today.

https://lore.kernel.org/git/a42dd9397d07b2dc4a0d7e75bfe1af2e46cad262.1685716420.git.gitgitgadget@gmail.com/

Thanks!

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Mentored-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Tian Yuchen <cat@malon.dev>

Tian Yuchen (1):
  environment.c: move 'protect_hfs' and 'protect_ntfs' into
    'repo_config_values'

 compat/mingw.c             |  2 +-
 environment.c              |  8 ++++----
 environment.h              |  4 ++--
 read-cache.c               |  7 ++++---
 t/helper/test-path-utils.c | 26 ++++++++++++++++----------
 5 files changed, 27 insertions(+), 20 deletions(-)

-- 
2.43.0


^ permalink raw reply

* [PATCH v1 1/1] environment.c: move 'protect_hfs' and 'protect_ntfs' into 'repo_config_values'
From: Tian Yuchen @ 2026-06-06 14:34 UTC (permalink / raw)
  To: git
  Cc: christian, phillip.wood123, Tian Yuchen, Christian Couder,
	Ayush Chandekar, Olamide Caleb Bello

Move the global 'protect_hfs' and 'protect_ntfs' configurations
into the repository-specific 'repo_config_values' struct.
This will help with the elimination of 'the_repository'

For now, associated functions access this configuration by
explicitly falling back to 'the_repository', which needs to
be addressed in the future.

Note: In 't/helper/test-path-utils.c', there is a function
'protect_ntfs_hfs_benchmark()' where these two global
variables are used as loop iterators. New local variables
have been created to replace them.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Mentored-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Tian Yuchen <cat@malon.dev>
---
 compat/mingw.c             |  2 +-
 environment.c              |  8 ++++----
 environment.h              |  4 ++--
 read-cache.c               |  7 ++++---
 t/helper/test-path-utils.c | 26 ++++++++++++++++----------
 5 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index aa7525f419..c77696ba8a 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -3392,7 +3392,7 @@ int is_valid_win32_path(const char *path, int allow_literal_nul)
 	const char *p = path;
 	int preceding_space_or_period = 0, i = 0, periods = 0;
 
-	if (!protect_ntfs)
+	if (!(the_repository->gitdir ? repo_config_values(the_repository)->protect_ntfs : 1))
 		return 1;
 
 	skip_dos_drive_prefix((char **)&path);
diff --git a/environment.c b/environment.c
index fc3ed8bb1c..0730bfcbba 100644
--- a/environment.c
+++ b/environment.c
@@ -82,12 +82,10 @@ unsigned long pack_size_limit_cfg;
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
 #endif
-int protect_hfs = PROTECT_HFS_DEFAULT;
 
 #ifndef PROTECT_NTFS_DEFAULT
 #define PROTECT_NTFS_DEFAULT 1
 #endif
-int protect_ntfs = PROTECT_NTFS_DEFAULT;
 
 /*
  * The character that begins a commented line in user-editable file
@@ -541,12 +539,12 @@ int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.protecthfs")) {
-		protect_hfs = git_config_bool(var, value);
+		cfg->protect_hfs = git_config_bool(var, value);
 		return 0;
 	}
 
 	if (!strcmp(var, "core.protectntfs")) {
-		protect_ntfs = git_config_bool(var, value);
+		cfg->protect_ntfs = git_config_bool(var, value);
 		return 0;
 	}
 
@@ -720,5 +718,7 @@ void repo_config_values_init(struct repo_config_values *cfg)
 {
 	cfg->attributes_file = NULL;
 	cfg->apply_sparse_checkout = 0;
+	cfg->protect_hfs = PROTECT_HFS_DEFAULT;
+	cfg->protect_ntfs = PROTECT_NTFS_DEFAULT;
 	cfg->branch_track = BRANCH_TRACK_REMOTE;
 }
diff --git a/environment.h b/environment.h
index 9eb97b3869..d48fd2719c 100644
--- a/environment.h
+++ b/environment.h
@@ -91,6 +91,8 @@ struct repo_config_values {
 	/* section "core" config values */
 	char *attributes_file;
 	int apply_sparse_checkout;
+	int protect_hfs;
+	int protect_ntfs;
 
 	/* section "branch" config values */
 	enum branch_track branch_track;
@@ -173,8 +175,6 @@ extern int pack_compression_level;
 extern unsigned long pack_size_limit_cfg;
 
 extern int precomposed_unicode;
-extern int protect_hfs;
-extern int protect_ntfs;
 
 extern int core_sparse_checkout_cone;
 extern int sparse_expect_files_outside_of_patterns;
diff --git a/read-cache.c b/read-cache.c
index 21829102ae..b64a5629ef 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1002,7 +1002,7 @@ static enum verify_path_result verify_path_internal(const char *path,
 			return PATH_OK;
 		if (is_dir_sep(c)) {
 inside:
-			if (protect_hfs) {
+			if ((the_repository->gitdir ? repo_config_values(the_repository)->protect_hfs : 0)) {
 
 				if (is_hfs_dotgit(path))
 					return PATH_INVALID;
@@ -1011,7 +1011,7 @@ static enum verify_path_result verify_path_internal(const char *path,
 						return PATH_INVALID;
 				}
 			}
-			if (protect_ntfs) {
+			if ((the_repository->gitdir ? repo_config_values(the_repository)->protect_ntfs : 1)) {
 #if defined GIT_WINDOWS_NATIVE || defined __CYGWIN__
 				if (c == '\\')
 					return PATH_INVALID;
@@ -1035,7 +1035,8 @@ static enum verify_path_result verify_path_internal(const char *path,
 			if (c == '\0')
 				return S_ISDIR(mode) ? PATH_DIR_WITH_SEP :
 						       PATH_INVALID;
-		} else if (c == '\\' && protect_ntfs) {
+		} else if (c == '\\' &&
+			   (the_repository->gitdir ? repo_config_values(the_repository)->protect_ntfs : 1)) {
 			if (is_ntfs_dotgit(path))
 				return PATH_INVALID;
 			if (S_ISLNK(mode)) {
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 15eb44485c..4455a68903 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -250,6 +250,7 @@ static int protect_ntfs_hfs_benchmark(int argc, const char **argv)
 	double m[3][2], v[3][2];
 	uint64_t cumul;
 	double cumul2;
+	int ntfs, hfs;
 
 	if (argc > 1 && !strcmp(argv[1], "--with-symlink-mode")) {
 		file_mode = 0120000;
@@ -275,9 +276,14 @@ static int protect_ntfs_hfs_benchmark(int argc, const char **argv)
 		while (len > 0)
 			names[i][--len] = (char)(' ' + (my_random() % ('\x7f' - ' ')));
 	}
-
-	for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++)
-		for (protect_hfs = 0; protect_hfs < 2; protect_hfs++) {
+	
+	if (!the_repository->gitdir)
+		the_repository->gitdir = xstrdup(".git");
+
+	for (ntfs = 0; ntfs < 2; ntfs++)
+		for (hfs = 0; hfs < 2; hfs++) {
+			repo_config_values(the_repository)->protect_ntfs = ntfs;
+			repo_config_values(the_repository)->protect_hfs = hfs;
 			cumul = 0;
 			cumul2 = 0;
 			for (i = 0; i < repetitions; i++) {
@@ -285,18 +291,18 @@ static int protect_ntfs_hfs_benchmark(int argc, const char **argv)
 				for (j = 0; j < nr; j++)
 					verify_path(names[j], file_mode);
 				end = getnanotime();
-				printf("protect_ntfs = %d, protect_hfs = %d: %lfms\n", protect_ntfs, protect_hfs, (end-begin) / (double)1e6);
+				printf("protect_ntfs = %d, protect_hfs = %d: %lfms\n", ntfs, hfs, (end-begin) / (double)1e6);
 				cumul += end - begin;
 				cumul2 += (end - begin) * (end - begin);
 			}
-			m[protect_ntfs][protect_hfs] = cumul / (double)repetitions;
-			v[protect_ntfs][protect_hfs] = my_sqrt(cumul2 / (double)repetitions - m[protect_ntfs][protect_hfs] * m[protect_ntfs][protect_hfs]);
-			printf("mean: %lfms, stddev: %lfms\n", m[protect_ntfs][protect_hfs] / (double)1e6, v[protect_ntfs][protect_hfs] / (double)1e6);
+			m[ntfs][hfs] = cumul / (double)repetitions;
+			v[ntfs][hfs] = my_sqrt(cumul2 / (double)repetitions - m[ntfs][hfs] * m[ntfs][hfs]);
+			printf("mean: %lfms, stddev: %lfms\n", m[ntfs][hfs] / (double)1e6, v[ntfs][hfs] / (double)1e6);
 		}
 
-	for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++)
-		for (protect_hfs = 0; protect_hfs < 2; protect_hfs++)
-			printf("ntfs=%d/hfs=%d: %lf%% slower\n", protect_ntfs, protect_hfs, (m[protect_ntfs][protect_hfs] - m[0][0]) * 100 / m[0][0]);
+	for (ntfs = 0; ntfs < 2; ntfs++)
+		for (hfs = 0; hfs < 2; hfs++)
+			printf("ntfs=%d/hfs=%d: %lf%% slower\n", ntfs, hfs, (m[ntfs][hfs] - m[0][0]) * 100 / m[0][0]);
 
 	return 0;
 }
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH v6 00/10] parseopt: add subcommand autocorrection
From: Junio C Hamano @ 2026-06-06 14:23 UTC (permalink / raw)
  To: Jiamu Sun; +Cc: git, Aaron Plattner, Karthik Nayak
In-Reply-To: <SY0P300MB0801E50FCB7EB2F45CD15208CE042@SY0P300MB0801.AUSP300.PROD.OUTLOOK.COM>

Jiamu Sun <39@barroit.sh> writes:

> On Mon, May 11, 2026 at 12:03:12PM +0900, Junio C Hamano wrote:
>> I've been carrying the following fix on top of these series since
>> Apr 23 when the topic was merged to 'seen'.  Can you fix these up at
>> the source, so that we can move forward with this topic?
>> 
>> Thanks.
>
> Sorry for the delay. This email didn't reach my inbox.
>
> By the time I saw this fix, I had already sent v6. Should I resend v6
> with this fix squashed in, or bump to v7?

Sorry for the delay; this exchange fell through the cracks, and then
I no longer recall exactly what the "fix" was.  If your v6 still
lacks the "fix" I gave (sorry but I do not remember what it was, and
I am away from my desk), then please do incorporate and send an
updated one as v7.  Hopefully it would be the final edition, unless
there are other issues outstanding.

Thanks.

^ permalink raw reply

* Re: [PATCH v4] git-gui: silence install recipes under "make -s"
From: Johannes Sixt @ 2026-06-06 11:47 UTC (permalink / raw)
  To: Harald Nordgren; +Cc: git, Harald Nordgren via GitGitGadget
In-Reply-To: <pull.2318.v4.git.git.1780742303298.gitgitgadget@gmail.com>

Thanks, queued.

-- Hannes


^ permalink raw reply

* [PATCH v4] git-gui: silence install recipes under "make -s"
From: Harald Nordgren via GitGitGadget @ 2026-06-06 10:38 UTC (permalink / raw)
  To: git; +Cc: Johannes Sixt, Harald Nordgren, Harald Nordgren
In-Reply-To: <pull.2318.v3.git.git.1780555730228.gitgitgadget@gmail.com>

From: Harald Nordgren <haraldnordgren@gmail.com>

Several install and uninstall recipes embed "echo" calls that fire as
part of the recipe itself, so the install banners (DEST, INSTALL,
LINK, REMOVE) were visible whenever the variables expand non-empty.

Guard the whole "ifndef V" block on "-s" so the loud variants are
selected only when "-s" is absent and V=1 is unset. The existing
"-s" check also had its findstring arguments in the wrong order
(needle "-s" never fit in haystack "s"), so swap them while moving
the check to wrap the block.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
---
    git-gui: silence install recipes under "make -s"
    
    Change sign-off email from work email to correct personal email.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2318%2FHaraldNordgren%2Fgit-gui-respect-silent-flag-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2318/HaraldNordgren/git-gui-respect-silent-flag-v4
Pull-Request: https://github.com/git/git/pull/2318

Range-diff vs v3:

 1:  1375fdc1aa ! 1:  27d9fcf26b git-gui: silence install recipes under "make -s"
     @@ Commit message
          (needle "-s" never fit in haystack "s"), so swap them while moving
          the check to wrap the block.
      
     -    Signed-off-by: Harald Nordgren <harald.nordgren@kostdoktorn.se>
     +    Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
      
       ## git-gui/Makefile ##
      @@ git-gui/Makefile: REMOVE_F0  = $(RM_RF) # space is required here


 git-gui/Makefile | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/git-gui/Makefile b/git-gui/Makefile
index ca01068810..d33204e875 100644
--- a/git-gui/Makefile
+++ b/git-gui/Makefile
@@ -64,6 +64,7 @@ REMOVE_F0  = $(RM_RF) # space is required here
 REMOVE_F1  =
 CLEAN_DST  = true
 
+ifneq ($(findstring s,$(firstword -$(MAKEFLAGS))),s)
 ifndef V
 	QUIET          = @
 	QUIET_GEN      = $(QUIET)echo '   ' GEN '$@' &&
@@ -89,6 +90,7 @@ ifndef V
 	REMOVE_F0 = dst=
 	REMOVE_F1 = && echo '   ' REMOVE `basename "$$dst"` && $(RM_RF) "$$dst"
 endif
+endif
 
 TCLTK_PATH ?= wish
 ifeq (./,$(dir $(TCLTK_PATH)))
@@ -97,10 +99,6 @@ else
 	TCL_PATH ?= $(dir $(TCLTK_PATH))$(notdir $(subst wish,tclsh,$(TCLTK_PATH)))
 endif
 
-ifeq ($(findstring $(firstword -$(MAKEFLAGS)),s),s)
-QUIET_GEN =
-endif
-
 -include config.mak
 
 DESTDIR_SQ = $(subst ','\'',$(DESTDIR))

base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v3] git-gui: silence install recipes under "make -s"
From: Johannes Sixt @ 2026-06-06  9:38 UTC (permalink / raw)
  To: Harald Nordgren; +Cc: git, Harald Nordgren via GitGitGadget
In-Reply-To: <pull.2318.v3.git.git.1780555730228.gitgitgadget@gmail.com>

Am 04.06.26 um 08:48 schrieb Harald Nordgren via GitGitGadget:
> From: Harald Nordgren <haraldnordgren@gmail.com>
> 
> Several install and uninstall recipes embed "echo" calls that fire as
> part of the recipe itself, so the install banners (DEST, INSTALL,
> LINK, REMOVE) were visible whenever the variables expand non-empty.
> 
> Guard the whole "ifndef V" block on "-s" so the loud variants are
> selected only when "-s" is absent and V=1 is unset. The existing
> "-s" check also had its findstring arguments in the wrong order
> (needle "-s" never fit in haystack "s"), so swap them while moving
> the check to wrap the block.
> 
> Signed-off-by: Harald Nordgren <harald.nordgren@kostdoktorn.se>

The new text looks good. However, the email addresses of author and
signer-off are different. They should be the same. I notice that you use
the gmail address in both places in other patch submissions, so I can
use that if you agree (and you don't need to send another round).

-- Hannes


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox