Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH/RFC 1/5] replay: support replaying 2-parent merges
From: Kristoffer Haugsbakk @ 2026-05-26 21:15 UTC (permalink / raw)
  To: Koji Nakamaru, git; +Cc: Elijah Newren, Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <034ab0f83822e6db67baa423d9fcb753b12b5ac8.1778107405.git.gitgitgadget@gmail.com>

On Thu, May 7, 2026, at 00:43, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>[snip]
> diff --git a/replay.c b/replay.c
>[snip]
> +out:
> +	free(ancestor_name);
> +	free_commit_list(parent_bases);
> +	free_commit_list(replayed_bases);

`free_commit_list` is deprecated in favor of `commit_list_free` since
52882024 (Merge branch 'ps/commit-list-functions-renamed', 2026-02-13).

> +	merge_finalize(&remerge_opt, &remerge_res);
> +	merge_finalize(&new_merge_opt, &new_merge_res);
> +	return picked;
>  }
>[snip]

^ permalink raw reply

* Re: [PATCH 0/5] git son: add command to create independent child repositories
From: Ben Knoble @ 2026-05-26 21:27 UTC (permalink / raw)
  To: Evan Haque via GitGitGadget; +Cc: git, Evan Haque
In-Reply-To: <pull.2122.git.1779814052.gitgitgadget@gmail.com>


> Le 26 mai 2026 à 13:08, Evan Haque via GitGitGadget <gitgitgadget@gmail.com> a écrit :
> 
> 
> Motivation
> ==========
> 
> When spinning off a new project that is related to an existing repository,
> there is no built-in way to create a child repository that maintains a link
> back to its parent without the tight coupling of submodules. Submodules pin
> the child to a specific commit and require the parent to track the child in
> its index, which is too heavyweight when the child is meant to be fully
> independent.
> 
> The typical workflow today is manual: git init, git remote add, update
> .gitignore — three steps that are easy to forget or get wrong. git son
> automates this and establishes a lightweight convention for the parent-child
> relationship: a remote named parent in the child, and nothing in the parent
> except an ignore rule.

I don’t really understand the motivation, but if your goal is to create another repo with the current one as a remote, how does something like

    git clone . child

help you? (I’m pretty sure you can even set the remote name to « parent » if you wish.)

You also didn’t mention worktrees or subtrees, which might be useful for you. 

^ permalink raw reply

* Re: git mv after the fact
From: Ben Knoble @ 2026-05-26 21:29 UTC (permalink / raw)
  To: Chris Torek; +Cc: Frieder Hannenheim, git
In-Reply-To: <CAPx1Gvd9+z0th9whCbcA60_bWproPp+kwp3qDmhQOe4G=0=E6A@mail.gmail.com>


> Le 26 mai 2026 à 12:46, Chris Torek <chris.torek@gmail.com> a écrit :
> 
> On Tue, May 26, 2026 at 6:18 AM Frieder Hannenheim <mail@fhannenheim.net> wrote:
>> I'd like to propose a new flag for git mv, that updates the index
>> like git mv normally would but does not move the file. ...
> 
> You may already know this, but technically no flag is needed:
> you can just "git add" the new name and "git rm" the old one,
> with the same effect.

Or indeed, « git add old new » should also work, I think.

> A flag for "git mv" would be convenient (and slightly more
> efficient, not in terms of storage but in terms of CPU time
> spent discovering that the contents under the new name
> already exist in the object database). But Git will discover
> the rename on its own in the usual way regardless of how
> you get to that point.
> 
> Chris

^ permalink raw reply

* Re: [PATCH v3] git-jump: pick a mode automatically when invoked without arguments
From: Erik Cervin Edin @ 2026-05-26 21:33 UTC (permalink / raw)
  To: git
In-Reply-To: <20260522052821.GC861761@coredump.intra.peff.net>

On 26/05/22 01:28AM, Jeff King wrote:
> On Thu, May 21, 2026 at 01:45:09PM +0000, Greg Hurrell via GitGitGadget wrote:
> 
> >      * Don't both teaching "auto" to select "ws" mode, because it is always
> >        subsumed by "diff".
> 
> Dropping the "ws" mode from auto makes sense to me. It could be slotted
> in between "merge" and "diff" (a whitespace problem always implies a
> diff, but a diff does not always imply a whitespace problem). But would
> that actually be useful?

When I originally proposed the idea of a third branch, there was a
subtle difference -- it was a git diff --cached --check.

On 26/05/14 05:40PM, Erik Cervin Edin wrote:
> If we're going to teach git-jump how to be more clever about where to jump,
> does it also make sense to bake `git jump ws` into this?
> 
>         elif ! git diff --cached --check >/dev/null 2>&1; then
>             mode_ws --cached "$@"

Ever so often I come across file with a diff --check offending
white-space (often a missing newline at the end of some file) and
because I have a commit hook set up, I have to go looking for where that
error is. In this particular case it's always a staged change, and then
a *staged* white space problem doesn't imply a diff.

This happens rarely enough that I haven't internalized that I can use
"git jump diff --check --cached" and it takes me a while to navigate to
the offending files.

But the main suggestion was really considering the possibility to
expand this beyond these two auto jumps in the future -- I'm not sure
an auto jump that goes looking for staged white spaces issues would be
useful to anyone in practise and at this stage I thinks it's best to
drop it.

^ permalink raw reply

* Re: [PATCH v2 0/9] doc: interpret-trailers: explain key format
From: Ben Knoble @ 2026-05-26 21:34 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: Junio C Hamano, git, Christian Couder, jackmanb, Linus Arver
In-Reply-To: <fc1f8149-98c2-48e5-9725-08cc21696cb2@app.fastmail.com>


> Le 24 mai 2026 à 08:41, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
> 
> On Mon, May 11, 2026, at 21:23, D. Ben Knoble wrote:
>> Overall looks good to me. Repeating a few points throughout the doc
>> might create headaches if format restrictions are changed, but I think
>> they are essential points worth repeating for now.
> 
> Thanks for taking a look again. :)

Thank you for working on it :)

>>> [snip]
>>> @@ -81,19 +87,25 @@ trailer.sign.key "Signed-off-by: "
>>> in your configuration, you only need to specify `--trailer="sign: foo"`
>>> on the command line instead of `--trailer="Signed-off-by: foo"`.
>>> 
>>> -By default the new trailer will appear at the end of all the existing
>>> -trailers. If there is no existing trailer, the new trailer will appear
>>> -at the end of the input. A blank line will be added before the new
>>> -trailer if there isn't one already.
>>> +By default the new trailer will appear at the end of the trailer block.
>>> +A trailer block will be created with only that trailer if a trailer
>>> +block does not already exist. Recall that a trailer block needs to be
>>> +preceded by a blank line, so a blank line (specifically an empty line)
>>> +will be inserted before the new trailer block in that case.
>> 
>> [not strictly related to this patch, but while we're here…]
>> 
>> Even in context, I find the original (and new) paragraph somewhat
>> jarring. In "the new trailer," there's no antecedent for "the
>> trailer", so which new trailer are we talking about? The previous
>> paragraph is about "<key-alias>es" for --trailer="<key>: value".
>> 
>> We _could_ move this paragraph up one, so that it follows the
>> paragraph on trailers being appended when given with --trailer.
>> 
>> Either way, adjusting "the new trailer" to "a new trailer" might feel
>> better to me. Other suggestions welcome.
> 
> The paragraph about new trailers originally came right after the
> separated-by sentence:[1]
> 
>    By default, a '<token>=<value>' or '<token>:<value>' [...]
> 
>    ------------------------------------------------
>    token: value
>    ------------------------------------------------
> 
>    This means that the trimmed <token> and <value> will be separated by
>    `': '` (one colon followed by one space).
> 
>    By default the new trailer will appear [...]
> 
> † 1: dfd66ddf (Documentation: add documentation for 'git
>     interpret-trailers', 2014-10-13)
> 
> Nine years later in [2], a “For convenience, <token>” was added to that *existing paragraph:
> 
>    [...]
>    `': '` (one colon followed by one space). For convenience, the <token> can be a
>    shortened string key (e.g., "sign") instead of the full string which should
>    appear before the separator on the output (e.g., "Signed-off-by"). This can be
>    configured using the 'trailer.<token>.key' configuration variable.
> 
>    By default the new trailer will appear at the end [...]
> 
> † 2: eda2c44c (doc: trailer: mention 'key' in DESCRIPTION, 2023-06-15)
> 
> A little later in [3], that part was split into its own paragraph—and
> expanded into two more blocks (source block and paragraph):
> 
>    [...] <key> and <value> will be separated by `': '` (one colon followed
>    by one space).
> 
>    For convenience, a <keyAlias> can be configured to [...]
> 
>    ------------------------------------------------
>    key: value
>    ------------------------------------------------
> 
>    in your configuration, [...]
> 
>    By default the new trailer will appear at the end [...]
> 
> † 3: 6ccbc667 (trailer doc: <token> is a <key> or <keyAlias>, not both,
>     2023-09-07)
> 
>> We _could_ move this paragraph up one, so that it follows the
>> paragraph on trailers being appended when given with --trailer.
> 
> But going back to commit [1], there are two paragraphs that talk about
> how “By default” the new trailer will be appended to the end:
> 
>    By default, a '<token>=<value>' or '<token>:<value>' argument given
>    using `--trailer` will be appended after the existing trailers only if
>    the last trailer has a different (<token>, <value>) pair (or if there
>    is no existing trailer). The <token> and <value> parts will be trimmed
>    to remove starting and trailing whitespace, and the resulting trimmed
>    <token> and <value> will appear in the message like this:
> 
>    ------------------------------------------------
>    token: value
>    ------------------------------------------------
> 
>    This means that the trimmed <token> and <value> will be separated by
>    `': '` (one colon followed by one space).
> 
>    By default the new trailer will appear at the end of all the existing
>    trailers. If there is no existing trailer, the new trailer will appear
>    after the commit message part of the ouput, and, if there is no line
>    with only spaces at the end of the commit message part, one blank line
>    will be added before the new trailer.
> 
> These two seem to overlap? They both talk about appending. Why does one
> talk about how specifically <token>/<key> and <value> will be treated
> when appended, then a later paragraph *also* says that it will be
> appended?
> 
> Here is a draft of this part of the doc. I have tried to consolidate
> these two “By default” paragrahs and be more explicit about what “the
> trailer” is. I have included one unchanged paragraph before and after
> for context.

I’ve read through the below a few times, and I don’t really have much to add for now :) I think that’s a fine improvement.

Whether you roll that into this patch series or wait until the dust settles is up to you.

> ***
> 
> Some configuration variables control the way the `--trailer` arguments
> are applied to each input and the way any existing trailer in
> the input is changed. They also make it possible to
> automatically add some trailers.
> 
> Let's consider new trailers added with `--trailer`.
> By default, the new trailer will appear at the end of the trailer block.
> Also by default, this new trailer will only be added
> if the last trailer is different to it.
> A trailer block will be created with only that trailer if a trailer
> block does not already exist. Recall that a trailer block needs to be
> preceded by a blank line, so a blank line (specifically an empty line)
> will be inserted before the new trailer block in that case.
> 
> More concretely, this is how the new trailer is added: a `<key>=<value>`
> or `<key>:<value>` argument given using `--trailer` will be appended
> after the existing trailers. The _<key>_ and _<value>_ parts will be
> trimmed to remove starting and trailing whitespace, and the resulting
> trimmed _<key>_ and _<value>_ will appear in the output like this:
> 
> ------------------------------------------------
> key: value
> ------------------------------------------------
> 
> This means that the trimmed _<key>_ and _<value>_ will be separated by
> "`:`{nbsp}" (one colon followed by one space).
> 
> ***
> 
>> [snip]
>>> -a group of one or more lines that (i) is all trailers, or (ii) contains at
>>> -least one Git-generated or user-configured trailer and consists of at
>>> +Existing trailers are extracted from the input by looking for the
>>> +trailer block. Concretely, that is a group of one or more lines that (i)
>>> +is all trailers, or (ii) contains at least one Git-generated or
>>> +user-configured trailer and consists of at
>>> [snip]

^ permalink raw reply

* Re: [PATCH v2 0/9] doc: interpret-trailers: explain key format
From: Kristoffer Haugsbakk @ 2026-05-26 21:42 UTC (permalink / raw)
  To: D. Ben Knoble
  Cc: Junio C Hamano, git, Christian Couder, jackmanb, Linus Arver
In-Reply-To: <4DD440D4-145A-4A9E-ACBA-8E6ACFA231D1@gmail.com>

On Tue, May 26, 2026, at 23:34, Ben Knoble wrote:
>> Le 24 mai 2026 à 08:41, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
>>
>> On Mon, May 11, 2026, at 21:23, D. Ben Knoble wrote:
>>> Overall looks good to me. Repeating a few points throughout the doc
>>> might create headaches if format restrictions are changed, but I think
>>> they are essential points worth repeating for now.
>>
>> Thanks for taking a look again. :)
>
> Thank you for working on it :)
>
>>[snip]
>>
>> Here is a draft of this part of the doc. I have tried to consolidate
>> these two “By default” paragrahs and be more explicit about what “the
>> trailer” is. I have included one unchanged paragraph before and after
>> for context.
>
> I’ve read through the below a few times, and I don’t really have much
> to add for now :) I think that’s a fine improvement.
>
> Whether you roll that into this patch series or wait until the dust
> settles is up to you.

Many thanks!

^ permalink raw reply

* Re: [PATCH v2 0/9] doc: interpret-trailers: explain key format
From: Kristoffer Haugsbakk @ 2026-05-26 21:45 UTC (permalink / raw)
  To: D. Ben Knoble
  Cc: Junio C Hamano, git, Christian Couder, jackmanb, Linus Arver
In-Reply-To: <0faba437-31cf-4004-adaf-2dfcd2274a5b@app.fastmail.com>

On Tue, May 26, 2026, at 23:42, Kristoffer Haugsbakk wrote:
>>[snip]
>>
>> I’ve read through the below a few times, and I don’t really have much
>> to add for now :) I think that’s a fine improvement.
>>
>> Whether you roll that into this patch series or wait until the dust
>> settles is up to you.
>
> Many thanks!

Sorry. I forgot to add: the plan right now is to weave it into this series.

^ permalink raw reply

* Re: git mv after the fact
From: Junio C Hamano @ 2026-05-27  3:09 UTC (permalink / raw)
  To: Chris Torek; +Cc: Frieder Hannenheim, git
In-Reply-To: <CAPx1Gvd9+z0th9whCbcA60_bWproPp+kwp3qDmhQOe4G=0=E6A@mail.gmail.com>

Chris Torek <chris.torek@gmail.com> writes:

> On Tue, May 26, 2026 at 6:18 AM Frieder Hannenheim <mail@fhannenheim.net> wrote:
>> I'd like to propose a new flag for git mv, that updates the index
>> like git mv normally would but does not move the file. ...
>
> You may already know this, but technically no flag is needed:
> you can just "git add" the new name and "git rm" the old one,
> with the same effect.

Correct.

> A flag for "git mv" would be convenient (and slightly more
> efficient, not in terms of storage but in terms of CPU time
> spent discovering that the contents under the new name
> already exist in the object database).

May be convenient, but I do not get the "efficient" part.  Do you
mean that for two operations "add" and "rm", you need to spend two
index writes plus one file contents hash, as opposed to one index
rite without having to do any contents hash?

> But Git will discover
> the rename on its own in the usual way regardless of how
> you get to that point.

This is not incorrect per-se, but it is a confusing thing to say to
somebody who does not know the equivalence of "mv" and "rm + add".
It would not be clear to them that you are not talking about what
happens during "mv" or "rm + add", but about what happend during
"git log -M", "git diff -M", etc.

There is "git rm --cached" that can be used to recover from an
"oops, I removed the file from the filesystem without telling Git".

    $ date >new-file.txt
    $ git add new-file.txt
    $ rm new-file.txt
    $ git rm --cached new-file.txt

I think the requested "feature" is not all that outrageous.  It
would be a similar value as a morning-after correction measure for
"oops, I moved the file in the filesystem without telling Git".

    $ date >old-file.txt
    $ git add old-file.txt
    $ mv old-file.txt new-file.txt
    $ git mv --cached old-file.txt new-file.txt

Thanks.

^ permalink raw reply

* Re: git mv after the fact
From: Chris Torek @ 2026-05-27  3:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Frieder Hannenheim, git
In-Reply-To: <xmqqy0h5lfa0.fsf@gitster.g>

> Chris Torek <chris.torek@gmail.com> writes:
>> A flag for "git mv" would be convenient (and slightly moreefficient ...
>

On Tue, May 26, 2026 at 8:09 PM Junio C Hamano <gitster@pobox.com> wrote:
> May be convenient, but I do not get the "efficient" part.

A normal `git mv` renames the index entry and the file in the working
tree without running `git add` on the *contents*, so there's no new hash
computation.  Presumably a `git mv --after foo bar` would do the same: verify
that there is no existing `bar` in the index, that there is an existing `foo` in
the index, and that there is no `foo` but there is a `bar` in the working tree,
and then it would rename (add-and-remove, really, because of sorting)
the index entry, without scanning the working tree contents.

In other words, we skip reading the 3 terabyte file, or whatever.

Anyway, comparing to `git rm --cached`:

> I think the requested "feature" is not all that outrageous.  It
> would be a similar value as a morning-after correction measure for
> "oops, I moved the file in the filesystem without telling Git".

I agree, but I also don't see it as valuable enough to bother
writing a proper implementation.

I did write it up as a shell script though, long ago. Adding it here
via gmail would mess with white space so I'll just provide a link
to the file on GitHub:

https://github.com/chris3torek/scripts/blob/master/git-mv-after

Chris

^ permalink raw reply

* Re: [PATCH v2] completion: hide dotfiles for selected path completion
From: Junio C Hamano @ 2026-05-27  3:22 UTC (permalink / raw)
  To: Zakariyah Ali via GitGitGadget; +Cc: git, Zakariyah Ali
In-Reply-To: <pull.2311.v2.git.git.1779808987825.gitgitgadget@gmail.com>

"Zakariyah Ali via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This matches standard shell filename completion behavior, where dotfiles
> are hidden by default unless the user starts their input with a dot.

OK, with this rationale added, I no longer have problem with the
proposed new behaviour.

As I'm not going to give a serious review  on the patch body itself,
I would really appreciate somebody more knowledgeable on the
existing bach completion code than I am to take a look.

Thanks.

^ permalink raw reply

* Re: git mv after the fact
From: Tim Tassonis @ 2026-05-27  3:27 UTC (permalink / raw)
  To: Junio C Hamano, Chris Torek; +Cc: Frieder Hannenheim, git
In-Reply-To: <xmqqy0h5lfa0.fsf@gitster.g>

Yeah, and while we're at at it: why not another patch for

git rm 
file_i_deleted_but_didnt_tell_git_and_dont_want_an_error_message_because_thats_offensive

because that's always very rude, too, telling me to have to use

git rm -f

Because I also am very sensitive and don't like to be told I fucked up 
and have to be more specific about what I actually want. That's just 
toxic, man.




On 5/27/26 05:09, Junio C Hamano wrote:
> Chris Torek <chris.torek@gmail.com> writes:
> 
>> On Tue, May 26, 2026 at 6:18 AM Frieder Hannenheim <mail@fhannenheim.net> wrote:
>>> I'd like to propose a new flag for git mv, that updates the index
>>> like git mv normally would but does not move the file. ...
>>
>> You may already know this, but technically no flag is needed:
>> you can just "git add" the new name and "git rm" the old one,
>> with the same effect.
> 
> Correct.
> 
>> A flag for "git mv" would be convenient (and slightly more
>> efficient, not in terms of storage but in terms of CPU time
>> spent discovering that the contents under the new name
>> already exist in the object database).
> 
> May be convenient, but I do not get the "efficient" part.  Do you
> mean that for two operations "add" and "rm", you need to spend two
> index writes plus one file contents hash, as opposed to one index
> rite without having to do any contents hash?
> 
>> But Git will discover
>> the rename on its own in the usual way regardless of how
>> you get to that point.
> 
> This is not incorrect per-se, but it is a confusing thing to say to
> somebody who does not know the equivalence of "mv" and "rm + add".
> It would not be clear to them that you are not talking about what
> happens during "mv" or "rm + add", but about what happend during
> "git log -M", "git diff -M", etc.
> 
> There is "git rm --cached" that can be used to recover from an
> "oops, I removed the file from the filesystem without telling Git".
> 
>      $ date >new-file.txt
>      $ git add new-file.txt
>      $ rm new-file.txt
>      $ git rm --cached new-file.txt
> 
> I think the requested "feature" is not all that outrageous.  It
> would be a similar value as a morning-after correction measure for
> "oops, I moved the file in the filesystem without telling Git".
> 
>      $ date >old-file.txt
>      $ git add old-file.txt
>      $ mv old-file.txt new-file.txt
>      $ git mv --cached old-file.txt new-file.txt
> 
> Thanks.
> 
> 

-- 
decentral.ch - IT Stuff
Tim Tassonis
Badenerstrasse 219
8003 Zürich
stuff@decentral.ch
+41 79 229 36 17


^ permalink raw reply

* [RFC PATCH 0/3] diff: pair edited lines inside moved blocks
From: Keita Oda @ 2026-05-27  4:23 UTC (permalink / raw)
  To: git; +Cc: Keita ODA

From: Keita ODA <ainsophyao@gmail.com>

This is an RFC for a review aid, not a proposed final UI or option name.

The motivation is the gap between --word-diff and --color-moved.
--word-diff is very useful when the line-level diff already found useful
old/new line pairs.  --color-moved is useful when moved lines are exact
matches.  But when a block is moved and one line inside the block is edited,
the small edit can be buried in a large delete/add region.

That case matters for review.  A one-line move is usually easy to inspect by
eye.  A ten-line moved block with a one-character change inside it is harder
to audit.  A small synthetic permission-table example in patch 3 uses this
shape:

  -#define PERM_RESOURCE_EXPORT       0x0008
  +#define PERM_RESOURCE_EXPORT       0x0001

That particular toy example is not meant to show something that
--color-moved cannot see.  It is meant to make the review question small:
can Git expose "this moved line was also edited" in a lightweight way?
The real-world cases below are less about proving that existing modes are
blind, and more about making the row-to-row correspondence explicit enough
that the small edits are easy to check.

This series adds an opt-in prototype, --word-diff-align, that post-processes
the emitted diff symbols and tries to pair similar deleted and inserted lines.
It does not change the underlying diff algorithm, patch semantics, apply, or
merge behavior.
The prototype is deliberately language-agnostic.  It does not parse source
code or build an AST; it only tokenizes diff lines into small text tokens and
scores local token overlap.  This keeps the experiment applicable to code,
tests, generated tables, documentation, and other text files.

The prototype is intentionally split into three pieces:

  * patch 1 adds the candidate retrieval and line-pair scoring, and exposes
    selected pairs with an RFC/debug comment;
  * patch 2 adds a small RFC-only renderer that inserts word-diff-like
    markers on the selected pairs, so that the recovered pairs are easier to
    inspect;
  * patch 3 adds a focused test case.

The current prototype is still larger than I would like, but the split keeps
the experimental pieces visible.  The full series is about 1000 inserted
lines; roughly 800 lines are option plumbing, tokenization, candidate
retrieval, scoring, pair selection, and debug comments, while about 200 lines
are temporary rendering code for review.

The scoring model is:

  S = W + aL

where W is a 5-line-window token overlap score and L is a center-line token
LCS score.  A small 64-bit window fingerprint is used only as a candidate
retrieval index; candidate pairs are scored again before they are selected.
Tokens repeated in the surrounding small window carry less weight for the
center-line score, which is a local-IDF-like approximation.  This keeps tokens
such as "import" or "#define" from overwhelming the line-specific identifier.

Some real-world examples that motivated the prototype:

  * CPython opcode/metadata renumbering, where many table rows stay logically
    paired but their numeric values shift;
  * CPython test parameterization rewrites such as tuple rows becoming
    dict(input=..., expected=...) rows;
  * Git's own expected-output tables, where a column width change adds spaces
    across many rows and a row insertion shifts the surrounding context;
  * Git's own remote.c refactoring, where extracted helper code has small
    identifier changes.

As a rough trigger-rate sanity check, I ran the prototype over 5734 changed
file pairs sampled from recent Git, CPython, and Rust history.  The stricter
"crossing and edited" signal, which ignores the many adjacent row pairs and
looks for pairs that cross another recovered pair, appeared in 739 file pairs
(about 13%).  This is not a gold-label quality number, but it suggests that
the mode is not only triggering on the synthetic test.
A small manual review found both clear wins and loose matches.

I found the problem easiest to inspect with four-way comparisons:

  * git diff --histogram
  * git diff --histogram --word-diff=plain
  * git diff --histogram --color-moved=blocks
  * git diff --histogram --word-diff-align

I put a small set of rendered four-way examples here:

  https://oda.github.io/git-diff-rfc-examples/rfc-word-diff-align/

These links are supplemental; the patch series is intended to be readable
without them.

Known limitations:

  * the UI/debug output is not final;
  * generated or boilerplate-heavy hunks, especially Rust generated test
    updates, can still produce loose matches;
  * one-line long-distance pairs are often less useful than block-level pairs;
  * the prototype intentionally gives local and remote pairs similar treatment
    for now, to make the recovered pairings visible for discussion;
  * thresholds and tie-breaking are still experimental.

The question for this RFC is whether this kind of language-agnostic line-pair
annotation is worth pursuing in core, and if so whether it should be shaped as
word-diff plumbing, a color-moved extension, or a separate opt-in mode.

Keita ODA (3):
  diff: add word-diff-align line pairing
  diff: render word-diff-align pairs for RFC review
  t4034: cover moved-and-edited word diff alignment

 diff.c                | 996 +++++++++++++++++++++++++++++++++++++++++-
 diff.h                |   1 +
 t/t4034-diff-words.sh |  46 ++
 3 files changed, 1035 insertions(+), 8 deletions(-)

-- 
2.39.3 (Apple Git-146)

^ permalink raw reply

* [RFC PATCH 1/3] diff: add word-diff-align line pairing
From: Keita Oda @ 2026-05-27  4:24 UTC (permalink / raw)
  To: git; +Cc: Keita ODA
In-Reply-To: <20260527042402.13607-1-ainsophyao@gmail.com>

From: Keita ODA <ainsophyao@gmail.com>

Add an opt-in --word-diff-align mode that post-processes emitted diff
symbols and tries to pair similar deleted and inserted lines inside each hunk.

This is not a replacement for the diff algorithm.  The normal diff is produced
first; the new code only annotates the already-emitted symbols for review.

The candidate search is intentionally approximate.  Each line is tokenized into
word-ish runs and single non-space symbols.  Each token gets a hash, each line
gets a compact 64-bit fingerprint, and each changed line also gets a small
surrounding window fingerprint.  Inserted-side windows are placed into bit
buckets, and deleted-side windows query those buckets to avoid a full all-pairs
scan in typical cases.

The fingerprint is only a retrieval index.  Candidate pairs are scored again
using token hashes:

  W = unique token overlap in the small surrounding windows
  L = center-line token LCS score
  S = W + 4L

For the center-line score, tokens repeated in the surrounding window carry less
weight.  This is a local-IDF-like approximation: in a run of "import" or
"#define" lines, the repeated keyword is weak evidence, while the identifier
specific to the center line remains strong evidence.

For this RFC, selected pairs are exposed with a debug-style "# aligned ..."
comment.  The next patch adds a small renderer to make the selected pairs more
readable.

---
 diff.c | 809 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 diff.h |   1 +
 2 files changed, 802 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 397e38b41..6b8744920 100644
--- a/diff.c
+++ b/diff.c
@@ -46,6 +46,7 @@
 #include "setup.h"
 #include "strmap.h"
 #include "ws.h"
+#include "ewah/ewok.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -1358,6 +1359,774 @@ static void dim_moved_lines(struct diff_options *o)
 	}
 }
 
+struct word_diff_align_line_info {
+	uint64_t bits;
+	int token_start;
+	int token_nr;
+};
+
+struct word_diff_align_item {
+	int symbol;
+	int change_pos;
+	int line_no;
+};
+
+struct word_diff_align_candidate {
+	int old_pos;
+	int new_pos;
+	int score;
+	int window_score;
+	int window_shared;
+	int line_score;
+	int line_shared;
+};
+
+#define WORD_DIFF_ALIGN_NUMBER_HASH 0x9e3779b9U
+
+#define WORD_DIFF_ALIGN_WINDOW_SIZE 2
+#define WORD_DIFF_ALIGN_FINGERPRINT_BITS 64
+#define WORD_DIFF_ALIGN_BIT_BUCKET_MAX 256
+#define WORD_DIFF_ALIGN_WINDOW_MIN_SCORE 35
+#define WORD_DIFF_ALIGN_WINDOW_MIN_SHARED 4
+#define WORD_DIFF_ALIGN_LINE_WEIGHT 4
+#define WORD_DIFF_ALIGN_SCORE_MIN 175
+
+static void word_diff_align_window_bounds(int nr, int pos,
+					  int *start_p, int *end_p)
+{
+	int radius = WORD_DIFF_ALIGN_WINDOW_SIZE;
+
+	*start_p = pos > radius ? pos - radius : 0;
+	*end_p = pos + radius + 1 < nr ? pos + radius + 1 : nr;
+}
+
+static int word_diff_align_payload_len(const struct emitted_diff_symbol *e)
+{
+	int len = e->len;
+
+	if (len && e->line[len - 1] == '\n')
+		len--;
+	if (len && e->line[len - 1] == '\r')
+		len--;
+	return len;
+}
+
+static int word_diff_align_word_byte(char ch)
+{
+	return isalnum((unsigned char)ch) || ch == '_';
+}
+
+static int word_diff_align_numeric_token(const char *line, int start, int end)
+{
+	int i;
+
+	for (i = start; i < end; i++)
+		if (!isdigit((unsigned char)line[i]))
+			return 0;
+	return start < end;
+}
+
+static uint64_t word_diff_align_token_bits(unsigned int hash)
+{
+	return (1ULL << (hash & 63)) | (1ULL << ((hash >> 6) & 63));
+}
+
+static int word_diff_align_next_token(const char *line, int len, int *pos,
+				      int *start_p, int *end_p)
+{
+	while (*pos < len && isspace((unsigned char)line[*pos]))
+		(*pos)++;
+	if (*pos == len)
+		return 0;
+
+	*start_p = *pos;
+	if (word_diff_align_word_byte(line[*pos]))
+		while (*pos < len && word_diff_align_word_byte(line[*pos]))
+			(*pos)++;
+	else
+		(*pos)++;
+	*end_p = *pos;
+	return 1;
+}
+
+static void word_diff_align_prepare_line(const struct emitted_diff_symbol *e,
+					 struct word_diff_align_line_info *line,
+					 unsigned int **tokens,
+					 int *tokens_nr,
+					 int *tokens_alloc)
+{
+	int len = word_diff_align_payload_len(e);
+	int pos = 0;
+	int start, end;
+
+	line->bits = 0;
+	line->token_start = *tokens_nr;
+	while (word_diff_align_next_token(e->line, len, &pos, &start, &end)) {
+		int numeric = word_diff_align_numeric_token(e->line, start, end);
+		unsigned int hash = numeric ? WORD_DIFF_ALIGN_NUMBER_HASH :
+			memhash(e->line + start, end - start);
+
+		ALLOC_GROW(*tokens, *tokens_nr + 1, *tokens_alloc);
+		(*tokens)[*tokens_nr] = hash;
+		line->bits |= word_diff_align_token_bits(hash);
+		(*tokens_nr)++;
+	}
+	line->token_nr = *tokens_nr - line->token_start;
+}
+
+static uint64_t word_diff_align_window_bits(struct word_diff_align_line_info *lines,
+					    int nr, int pos)
+{
+	uint64_t bits = 0;
+	int start, end;
+	int i;
+
+	word_diff_align_window_bounds(nr, pos, &start, &end);
+	for (i = start; i < end; i++)
+		bits |= lines[i].bits;
+	return bits;
+}
+
+static int word_diff_align_filter_score(uint64_t a, uint64_t b, int *shared_p)
+{
+	int shared = ewah_bit_popcount64(a & b);
+	int total = ewah_bit_popcount64(a | b);
+
+	*shared_p = shared;
+	return total ? shared * 100 / total : 0;
+}
+
+static int word_diff_align_line_has_token(const struct word_diff_align_line_info *line,
+					  const unsigned int *tokens,
+					  unsigned int hash);
+
+static int word_diff_align_window_has_token(const struct word_diff_align_line_info *lines,
+					    int start, int end,
+					    const unsigned int *tokens,
+					    unsigned int hash)
+{
+	int i;
+
+	for (i = start; i < end; i++)
+		if (word_diff_align_line_has_token(&lines[i], tokens, hash))
+			return 1;
+	return 0;
+}
+
+static int word_diff_align_window_seen_token(const struct word_diff_align_line_info *lines,
+					     int start, int line_pos,
+					     int token_pos,
+					     const unsigned int *tokens,
+					     unsigned int hash)
+{
+	int i;
+
+	for (i = start; i <= line_pos; i++) {
+		const struct word_diff_align_line_info *line = &lines[i];
+		int end = i == line_pos ? token_pos : line->token_nr;
+		int j;
+
+		for (j = 0; j < end; j++)
+			if (tokens[line->token_start + j] == hash)
+				return 1;
+	}
+	return 0;
+}
+
+static int word_diff_align_window_token_score(const struct word_diff_align_line_info *minus_lines,
+					      int minus_nr, int minus_pos,
+					      const unsigned int *minus_tokens,
+					      const struct word_diff_align_line_info *plus_lines,
+					      int plus_nr, int plus_pos,
+					      const unsigned int *plus_tokens,
+					      int *shared_p)
+{
+	int minus_start, minus_end, plus_start, plus_end;
+	int minus_unique = 0, plus_unique = 0, shared = 0;
+	int i;
+
+	word_diff_align_window_bounds(minus_nr, minus_pos, &minus_start,
+				      &minus_end);
+	word_diff_align_window_bounds(plus_nr, plus_pos, &plus_start,
+				      &plus_end);
+
+	for (i = minus_start; i < minus_end; i++) {
+		const struct word_diff_align_line_info *line = &minus_lines[i];
+		int j;
+
+		for (j = 0; j < line->token_nr; j++) {
+			unsigned int hash = minus_tokens[line->token_start + j];
+
+			if (word_diff_align_window_seen_token(minus_lines,
+							      minus_start, i, j,
+							      minus_tokens,
+							      hash))
+				continue;
+			minus_unique++;
+			if (word_diff_align_window_has_token(plus_lines,
+							     plus_start, plus_end,
+							     plus_tokens, hash))
+				shared++;
+		}
+	}
+
+	for (i = plus_start; i < plus_end; i++) {
+		const struct word_diff_align_line_info *line = &plus_lines[i];
+		int j;
+
+		for (j = 0; j < line->token_nr; j++) {
+			unsigned int hash = plus_tokens[line->token_start + j];
+
+			if (word_diff_align_window_seen_token(plus_lines,
+							      plus_start, i, j,
+							      plus_tokens,
+							      hash))
+				continue;
+			plus_unique++;
+		}
+	}
+
+	*shared_p = shared;
+	return minus_unique + plus_unique - shared ?
+		shared * 100 / (minus_unique + plus_unique - shared) : 0;
+}
+
+static int word_diff_align_line_has_token(const struct word_diff_align_line_info *line,
+					  const unsigned int *tokens,
+					  unsigned int hash)
+{
+	int i;
+
+	for (i = 0; i < line->token_nr; i++)
+		if (tokens[line->token_start + i] == hash)
+			return 1;
+	return 0;
+}
+
+static int word_diff_align_surrounding_line_freq(const struct word_diff_align_line_info *lines,
+						 int nr, int pos,
+						 const unsigned int *tokens,
+						 unsigned int hash)
+{
+	int radius = WORD_DIFF_ALIGN_WINDOW_SIZE;
+	int start = pos > radius ? pos - radius : 0;
+	int end = pos + radius + 1 < nr ? pos + radius + 1 : nr;
+	int i, freq = 0;
+
+	for (i = start; i < end; i++) {
+		if (i == pos)
+			continue;
+		if (word_diff_align_line_has_token(&lines[i], tokens, hash))
+			freq++;
+	}
+	return freq;
+}
+
+static int word_diff_align_shared_token_value(const struct word_diff_align_line_info *minus_lines,
+					      int minus_nr, int minus_pos,
+					      const unsigned int *minus_tokens,
+					      const struct word_diff_align_line_info *plus_lines,
+					      int plus_nr, int plus_pos,
+					      const unsigned int *plus_tokens,
+					      unsigned int hash)
+{
+	int minus_freq = word_diff_align_surrounding_line_freq(minus_lines,
+							       minus_nr,
+							       minus_pos,
+							       minus_tokens,
+							       hash);
+	int plus_freq = word_diff_align_surrounding_line_freq(plus_lines,
+							      plus_nr,
+							      plus_pos,
+							      plus_tokens,
+							      hash);
+	int freq = minus_freq > plus_freq ? minus_freq : plus_freq;
+	int value = 2 * WORD_DIFF_ALIGN_WINDOW_SIZE - freq;
+
+	return value > 0 ? value : 0;
+}
+
+static int word_diff_align_unmatched_weight(const struct word_diff_align_line_info *line,
+					    const unsigned int *tokens,
+					    const unsigned int *other_tokens,
+					    int other_start, int other_nr)
+{
+	int i, weight = 0;
+
+	for (i = 0; i < line->token_nr; i++) {
+		unsigned int hash = tokens[line->token_start + i];
+		int j, matched = 0;
+
+		for (j = 0; j < other_nr; j++) {
+			if (hash == other_tokens[other_start + j]) {
+				matched = 1;
+				break;
+			}
+		}
+		if (!matched)
+			weight++;
+	}
+	return weight;
+}
+
+static int word_diff_align_line_token_score(const struct word_diff_align_line_info *minus,
+					    const struct word_diff_align_line_info *minus_lines,
+					    int minus_nr, int minus_pos,
+					    const unsigned int *minus_tokens,
+					    const struct word_diff_align_line_info *plus,
+					    const struct word_diff_align_line_info *plus_lines,
+					    int plus_nr, int plus_pos,
+					    const unsigned int *plus_tokens,
+					    int *shared_p)
+{
+	int i, j, shared;
+	int *lcs;
+	int unmatched, total;
+
+	if (!minus->token_nr || !plus->token_nr) {
+		*shared_p = 0;
+		return 0;
+	}
+
+	CALLOC_ARRAY(lcs, plus->token_nr + 1);
+	for (i = 0; i < minus->token_nr; i++) {
+		unsigned int minus_hash = minus_tokens[minus->token_start + i];
+		int prev = 0;
+
+		for (j = 0; j < plus->token_nr; j++) {
+			unsigned int plus_hash = plus_tokens[plus->token_start + j];
+			int column = j + 1;
+			int saved;
+
+			saved = lcs[column];
+			if (minus_hash == plus_hash) {
+				int value = word_diff_align_shared_token_value(minus_lines,
+									       minus_nr,
+									       minus_pos,
+									       minus_tokens,
+									       plus_lines,
+									       plus_nr,
+									       plus_pos,
+									       plus_tokens,
+									       minus_hash);
+				int best = prev + value;
+
+				if (lcs[column] > best)
+					best = lcs[column];
+				if (lcs[column - 1] > best)
+					best = lcs[column - 1];
+				lcs[column] = best;
+			}
+			else if (lcs[column - 1] > lcs[column])
+				lcs[column] = lcs[column - 1];
+			prev = saved;
+		}
+	}
+
+	shared = lcs[plus->token_nr];
+	free(lcs);
+	unmatched = word_diff_align_unmatched_weight(minus, minus_tokens,
+						     plus_tokens,
+						     plus->token_start,
+						     plus->token_nr) +
+		word_diff_align_unmatched_weight(plus, plus_tokens,
+						 minus_tokens,
+						 minus->token_start,
+						 minus->token_nr);
+	total = 2 * shared + unmatched;
+	*shared_p = shared;
+	return total ? shared * 200 / total : 0;
+}
+
+static int word_diff_align_pair_score(int window_score,
+				      const struct word_diff_align_line_info *minus,
+				      const struct word_diff_align_line_info *minus_lines,
+				      int minus_nr, int minus_pos,
+				      const unsigned int *minus_tokens,
+				      const struct word_diff_align_line_info *plus,
+				      const struct word_diff_align_line_info *plus_lines,
+				      int plus_nr, int plus_pos,
+				      const unsigned int *plus_tokens,
+				      int *line_score_p,
+				      int *line_shared_p)
+{
+	int line_score, line_shared;
+
+	line_score = word_diff_align_line_token_score(minus, minus_lines,
+						      minus_nr, minus_pos,
+						      minus_tokens,
+						      plus, plus_lines,
+						      plus_nr, plus_pos,
+						      plus_tokens, &line_shared);
+	*line_score_p = line_score;
+	*line_shared_p = line_shared;
+
+	/*
+	 * The bit fingerprint is only a retrieval approximation.  The window
+	 * score passed here is computed from exact token hashes in the 5-line
+	 * windows, while the line score gives the center line an additional
+	 * identity bonus.
+	 */
+	return window_score + WORD_DIFF_ALIGN_LINE_WEIGHT * line_score;
+}
+
+static int word_diff_align_candidate_cmp(const void *va, const void *vb)
+{
+	const struct word_diff_align_candidate *a = va;
+	const struct word_diff_align_candidate *b = vb;
+
+	if (a->score != b->score)
+		return b->score - a->score;
+	if (a->line_score != b->line_score)
+		return b->line_score - a->line_score;
+	if (a->window_score != b->window_score)
+		return b->window_score - a->window_score;
+	if (a->line_shared != b->line_shared)
+		return b->line_shared - a->line_shared;
+	if (a->window_shared != b->window_shared)
+		return b->window_shared - a->window_shared;
+	if (a->old_pos != b->old_pos)
+		return a->old_pos - b->old_pos;
+	return a->new_pos - b->new_pos;
+}
+
+static void word_diff_align_add_candidate(struct word_diff_align_candidate **candidates,
+					  int *candidates_nr,
+					  int *candidates_alloc,
+					  int old_pos, int new_pos,
+					  int score, int window_score,
+					  int window_shared, int line_score,
+					  int line_shared)
+{
+	struct word_diff_align_candidate *candidate;
+
+	ALLOC_GROW(*candidates, *candidates_nr + 1, *candidates_alloc);
+	candidate = &(*candidates)[(*candidates_nr)++];
+	candidate->old_pos = old_pos;
+	candidate->new_pos = new_pos;
+	candidate->score = score;
+	candidate->window_score = window_score;
+	candidate->window_shared = window_shared;
+	candidate->line_score = line_score;
+	candidate->line_shared = line_shared;
+}
+
+static void word_diff_align_debug_append_comment(struct emitted_diff_symbol *line,
+						 const struct strbuf *suffix)
+{
+	struct strbuf out = STRBUF_INIT;
+	char *old_line = (char *)line->line;
+	size_t new_len;
+
+	if (line->len && line->line[line->len - 1] == '\n') {
+		strbuf_add(&out, line->line, line->len - 1);
+		strbuf_addbuf(&out, suffix);
+		strbuf_addch(&out, '\n');
+	} else {
+		strbuf_add(&out, line->line, line->len);
+		strbuf_addbuf(&out, suffix);
+	}
+	line->line = strbuf_detach(&out, &new_len);
+	line->len = (int)new_len;
+	free(old_line);
+}
+
+static void word_diff_align_debug_mark_pair(struct emitted_diff_symbol *minus_line,
+					    struct emitted_diff_symbol *plus_line,
+					    int minus_lineno, int plus_lineno,
+					    int changed, int moved,
+						    int window_score,
+						    int line_score,
+						    int pair_score)
+{
+	struct strbuf suffix = STRBUF_INIT;
+
+	if (moved) {
+		minus_line->flags |= DIFF_SYMBOL_MOVED_LINE;
+		plus_line->flags |= DIFF_SYMBOL_MOVED_LINE;
+	}
+
+	strbuf_addf(&suffix,
+		    "        # aligned from %d to %d, %s, W=%d L=%d S=%d",
+		    minus_lineno, plus_lineno,
+		    changed ? "edited" : "unchanged",
+		    window_score, line_score, pair_score);
+	word_diff_align_debug_append_comment(minus_line, &suffix);
+	word_diff_align_debug_append_comment(plus_line, &suffix);
+	strbuf_release(&suffix);
+}
+
+static void word_diff_align_add_item(struct word_diff_align_item **items,
+				     int *items_nr, int *items_alloc,
+				     int symbol, int change_pos, int line_no)
+{
+	ALLOC_GROW(*items, *items_nr + 1, *items_alloc);
+	(*items)[*items_nr].symbol = symbol;
+	(*items)[*items_nr].change_pos = change_pos;
+	(*items)[*items_nr].line_no = line_no;
+	(*items_nr)++;
+}
+
+static void word_diff_align_hunk_start(const struct emitted_diff_symbols *e,
+				       int start, int *old_lineno,
+				       int *new_lineno)
+{
+	const char *line;
+	char *endp;
+
+	*old_lineno = 0;
+	*new_lineno = 0;
+	if (start <= 0 || e->buf[start - 1].s != DIFF_SYMBOL_CONTEXT_FRAGINFO)
+		return;
+	line = e->buf[start - 1].line;
+	if (!starts_with(line, "@@ -"))
+		return;
+	line += 4;
+	if (!isdigit((unsigned char)*line))
+		return;
+	*old_lineno = strtol(line, &endp, 10);
+	line = endp;
+	if (*line == ',') {
+		line++;
+		strtol(line, &endp, 10);
+		line = endp;
+	}
+	if (!starts_with(line, " +"))
+		return;
+	line += 2;
+	if (!isdigit((unsigned char)*line))
+		return;
+	*new_lineno = strtol(line, &endp, 10);
+}
+
+static int word_diff_align_local_pair(int old_pos, int new_pos)
+{
+	return old_pos == new_pos;
+}
+
+static void word_diff_align_hunk(struct emitted_diff_symbols *e, int start, int end)
+{
+	struct word_diff_align_item *old_items = NULL, *new_items = NULL;
+	int old_items_nr = 0, old_items_alloc = 0;
+	int new_items_nr = 0, new_items_alloc = 0;
+	int minus_nr = 0, plus_nr = 0;
+	struct word_diff_align_line_info *old_lines = NULL, *new_lines = NULL;
+	unsigned int *old_tokens = NULL, *new_tokens = NULL;
+	int old_tokens_nr = 0, old_tokens_alloc = 0;
+	int new_tokens_nr = 0, new_tokens_alloc = 0;
+	uint64_t *new_contexts = NULL;
+	int *bucket_heads = NULL, *bucket_next = NULL, *bucket_counts = NULL;
+	int *candidate_counts = NULL, *candidate_touched = NULL;
+	char *used_old = NULL, *used_plus = NULL;
+	struct word_diff_align_candidate *candidates = NULL;
+	int candidates_nr = 0, candidates_alloc = 0;
+	int old_lineno, new_lineno;
+	int i, j;
+
+	word_diff_align_hunk_start(e, start, &old_lineno, &new_lineno);
+	for (i = start; i < end; i++) {
+		if (e->buf[i].s == DIFF_SYMBOL_MINUS) {
+			word_diff_align_add_item(&old_items, &old_items_nr,
+						 &old_items_alloc, i, minus_nr++,
+						 old_lineno++);
+		} else if (e->buf[i].s == DIFF_SYMBOL_PLUS) {
+			word_diff_align_add_item(&new_items, &new_items_nr,
+						 &new_items_alloc, i, plus_nr++,
+						 new_lineno++);
+		} else if (e->buf[i].s == DIFF_SYMBOL_CONTEXT) {
+			word_diff_align_add_item(&old_items, &old_items_nr,
+						 &old_items_alloc, i, -1,
+						 old_lineno++);
+			word_diff_align_add_item(&new_items, &new_items_nr,
+						 &new_items_alloc, i, -1,
+						 new_lineno++);
+		}
+	}
+
+	if (!minus_nr || !plus_nr)
+		goto cleanup;
+
+	CALLOC_ARRAY(old_lines, old_items_nr);
+	CALLOC_ARRAY(new_lines, new_items_nr);
+	CALLOC_ARRAY(new_contexts, new_items_nr);
+	CALLOC_ARRAY(bucket_heads, WORD_DIFF_ALIGN_FINGERPRINT_BITS);
+	CALLOC_ARRAY(bucket_counts, WORD_DIFF_ALIGN_FINGERPRINT_BITS);
+	CALLOC_ARRAY(bucket_next, new_items_nr * WORD_DIFF_ALIGN_FINGERPRINT_BITS);
+	CALLOC_ARRAY(candidate_counts, new_items_nr);
+	CALLOC_ARRAY(candidate_touched, new_items_nr);
+	CALLOC_ARRAY(used_old, minus_nr);
+	CALLOC_ARRAY(used_plus, plus_nr);
+
+	for (i = 0; i < old_items_nr; i++)
+		word_diff_align_prepare_line(&e->buf[old_items[i].symbol], &old_lines[i],
+					     &old_tokens, &old_tokens_nr,
+					     &old_tokens_alloc);
+	for (i = 0; i < new_items_nr; i++)
+		word_diff_align_prepare_line(&e->buf[new_items[i].symbol], &new_lines[i],
+					     &new_tokens, &new_tokens_nr,
+					     &new_tokens_alloc);
+	for (i = 0; i < new_items_nr; i++)
+		new_contexts[i] = word_diff_align_window_bits(new_lines, new_items_nr, i);
+
+	for (i = 0; i < new_items_nr; i++) {
+		if (new_items[i].change_pos < 0)
+			continue;
+		for (j = 0; j < WORD_DIFF_ALIGN_FINGERPRINT_BITS; j++) {
+			uint64_t bit = 1ULL << j;
+
+			if (!(new_contexts[i] & bit))
+				continue;
+			bucket_counts[j]++;
+			bucket_next[i * WORD_DIFF_ALIGN_FINGERPRINT_BITS + j] = bucket_heads[j];
+			bucket_heads[j] = i + 1;
+		}
+	}
+
+	for (i = 0; i < old_items_nr; i++) {
+		int touched_nr = 0;
+		uint64_t old_context;
+
+		if (old_items[i].change_pos < 0)
+			continue;
+		old_context = word_diff_align_window_bits(old_lines, old_items_nr, i);
+		for (j = 0; j < WORD_DIFF_ALIGN_FINGERPRINT_BITS; j++) {
+			uint64_t bit = 1ULL << j;
+			int item;
+
+			if (!(old_context & bit))
+				continue;
+			if (bucket_counts[j] > WORD_DIFF_ALIGN_BIT_BUCKET_MAX)
+				continue;
+			for (item = bucket_heads[j]; item; item = bucket_next[(item - 1) * WORD_DIFF_ALIGN_FINGERPRINT_BITS + j]) {
+				int new_pos = item - 1;
+
+				if (!candidate_counts[new_pos])
+					candidate_touched[touched_nr++] = new_pos;
+				candidate_counts[new_pos]++;
+			}
+		}
+
+		for (j = 0; j < touched_nr; j++) {
+			int new_pos = candidate_touched[j];
+			int filter_score, filter_shared;
+			int window_score, window_shared;
+			int line_score, line_shared, pair_score;
+
+			candidate_counts[new_pos] = 0;
+			if (new_items[new_pos].change_pos < 0)
+				continue;
+			filter_score = word_diff_align_filter_score(old_context,
+								    new_contexts[new_pos],
+								    &filter_shared);
+			if (filter_score < WORD_DIFF_ALIGN_WINDOW_MIN_SCORE ||
+			    filter_shared < WORD_DIFF_ALIGN_WINDOW_MIN_SHARED)
+				continue;
+			window_score = word_diff_align_window_token_score(old_lines,
+									 old_items_nr,
+									 i,
+									 old_tokens,
+									 new_lines,
+									 new_items_nr,
+									 new_pos,
+									 new_tokens,
+									 &window_shared);
+			if (window_score < WORD_DIFF_ALIGN_WINDOW_MIN_SCORE)
+				continue;
+			pair_score = word_diff_align_pair_score(window_score,
+								&old_lines[i],
+								old_lines,
+								old_items_nr,
+								i,
+								old_tokens,
+								&new_lines[new_pos],
+								new_lines,
+								new_items_nr,
+								new_pos,
+								new_tokens,
+								&line_score,
+								&line_shared);
+			if (pair_score < WORD_DIFF_ALIGN_SCORE_MIN)
+				continue;
+			word_diff_align_add_candidate(&candidates, &candidates_nr,
+						      &candidates_alloc, i, new_pos,
+						      pair_score, window_score,
+						      window_shared, line_score,
+						      line_shared);
+		}
+	}
+
+	QSORT(candidates, candidates_nr, word_diff_align_candidate_cmp);
+	for (i = 0; i < candidates_nr; i++) {
+		int old_pos = candidates[i].old_pos;
+		int new_pos = candidates[i].new_pos;
+		int minus_pos = old_items[old_pos].change_pos;
+		int plus_pos = new_items[new_pos].change_pos;
+		int old_len, new_len, edited, moved;
+
+		if (used_old[minus_pos] || used_plus[plus_pos])
+			continue;
+		used_old[minus_pos] = 1;
+		used_plus[plus_pos] = 1;
+		old_len = word_diff_align_payload_len(&e->buf[old_items[old_pos].symbol]);
+		new_len = word_diff_align_payload_len(&e->buf[new_items[new_pos].symbol]);
+		edited = old_len != new_len ||
+			memcmp(e->buf[old_items[old_pos].symbol].line,
+			       e->buf[new_items[new_pos].symbol].line,
+			       old_len);
+		moved = !word_diff_align_local_pair(old_pos, new_pos);
+		word_diff_align_debug_mark_pair(&e->buf[old_items[old_pos].symbol],
+						&e->buf[new_items[new_pos].symbol],
+						old_items[old_pos].line_no,
+						new_items[new_pos].line_no,
+						edited, moved,
+						candidates[i].window_score,
+						candidates[i].line_score,
+						candidates[i].score);
+	}
+
+cleanup:
+	free(old_items);
+	free(new_items);
+	free(old_lines);
+	free(new_lines);
+	free(old_tokens);
+	free(new_tokens);
+	free(new_contexts);
+	free(bucket_heads);
+	free(bucket_next);
+	free(bucket_counts);
+	free(candidate_counts);
+	free(candidate_touched);
+	free(used_old);
+	free(used_plus);
+	free(candidates);
+}
+
+static void mark_word_diff_align(struct diff_options *o)
+{
+	int i, hunk_start = -1;
+	struct emitted_diff_symbols *e = o->emitted_symbols;
+
+	for (i = 0; i < e->nr; i++) {
+		if (e->buf[i].s == DIFF_SYMBOL_CONTEXT_FRAGINFO) {
+			if (hunk_start >= 0)
+				word_diff_align_hunk(e, hunk_start, i);
+			hunk_start = i + 1;
+		} else if (hunk_start >= 0 &&
+			   e->buf[i].s != DIFF_SYMBOL_CONTEXT &&
+			   e->buf[i].s != DIFF_SYMBOL_CONTEXT_INCOMPLETE &&
+			   e->buf[i].s != DIFF_SYMBOL_CONTEXT_MARKER &&
+			   e->buf[i].s != DIFF_SYMBOL_PLUS &&
+			   e->buf[i].s != DIFF_SYMBOL_MINUS) {
+			word_diff_align_hunk(e, hunk_start, i);
+			hunk_start = -1;
+		}
+	}
+
+	if (hunk_start >= 0)
+		word_diff_align_hunk(e, hunk_start, e->nr);
+}
+
 static void emit_line_ws_markup(struct diff_options *o,
 				const char *set_sign, const char *set,
 				const char *reset,
@@ -5245,6 +6014,10 @@ void diff_setup_done(struct diff_options *options)
 		die(_("options '%s' and '%s' cannot be used together, use '%s' with '%s' and '%s'"),
 			"--pickaxe-all", "--find-object", "--pickaxe-all", "-G", "-S");
 
+	if (options->word_diff_align && options->word_diff)
+		die(_("options '%s' and '%s' cannot be used together"),
+		    "--word-diff-align", "--word-diff");
+
 	/*
 	 * Most of the time we can say "there are changes"
 	 * only by checking if there are changed paths, but
@@ -5971,6 +6744,18 @@ static int diff_opt_word_diff_regex(const struct option *opt,
 	return 0;
 }
 
+static int diff_opt_word_diff_align(const struct option *opt,
+				    const char *arg, int unset)
+{
+	struct diff_options *options = opt->value;
+
+	BUG_ON_OPT_ARG(arg);
+	options->word_diff_align = !unset;
+	if (!unset)
+		enable_patch_output(&options->output_format);
+	return 0;
+}
+
 static int diff_opt_rotate_to(const struct option *opt, const char *arg, int unset)
 {
 	struct diff_options *options = opt->value;
@@ -6205,6 +6990,9 @@ struct option *add_diff_options(const struct option *opts,
 		OPT_CALLBACK_F(0, "word-diff-regex", options, N_("<regex>"),
 			       N_("use <regex> to decide what a word is"),
 			       PARSE_OPT_NONEG, diff_opt_word_diff_regex),
+		OPT_CALLBACK_F(0, "word-diff-align", options, NULL,
+			       N_("annotate similar moved lines"),
+			       PARSE_OPT_NOARG, diff_opt_word_diff_align),
 		OPT_CALLBACK_F(0, "color-words", options, N_("<regex>"),
 			       N_("equivalent to --word-diff=color --word-diff-regex=<regex>"),
 			       PARSE_OPT_NONEG | PARSE_OPT_OPTARG, diff_opt_color_words),
@@ -7076,7 +7864,7 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	if (WSEH_NEW & WS_RULE_MASK)
 		BUG("WS rules bit mask overlaps with diff symbol flags");
 
-	if (o->color_moved && want_color(o->use_color))
+	if ((o->color_moved && want_color(o->use_color)) || o->word_diff_align)
 		o->emitted_symbols = &esm;
 
 	if (o->additional_path_headers)
@@ -7092,14 +7880,19 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 		struct mem_pool entry_pool;
 		struct moved_entry_list *entry_list;
 
-		mem_pool_init(&entry_pool, 1024 * 1024);
-		entry_list = add_lines_to_move_detection(o, &entry_pool);
-		mark_color_as_moved(o, entry_list);
-		if (o->color_moved == COLOR_MOVED_ZEBRA_DIM)
-			dim_moved_lines(o);
+		if (o->word_diff_align)
+			mark_word_diff_align(o);
+
+		if (o->color_moved && want_color(o->use_color)) {
+			mem_pool_init(&entry_pool, 1024 * 1024);
+			entry_list = add_lines_to_move_detection(o, &entry_pool);
+			mark_color_as_moved(o, entry_list);
+			if (o->color_moved == COLOR_MOVED_ZEBRA_DIM)
+				dim_moved_lines(o);
 
-		mem_pool_discard(&entry_pool, 0);
-		free(entry_list);
+			mem_pool_discard(&entry_pool, 0);
+			free(entry_list);
+		}
 
 		for (i = 0; i < esm.nr; i++)
 			emit_diff_symbol_from_struct(o, &esm.buf[i]);
diff --git a/diff.h b/diff.h
index 7eb84aadf..b9a20b7f2 100644
--- a/diff.h
+++ b/diff.h
@@ -351,6 +351,7 @@ struct diff_options {
 	int stat_count;
 	const char *word_regex;
 	enum diff_words_type word_diff;
+	int word_diff_align;
 	enum diff_submodule_format submodule_format;
 
 	struct oidset *objfind;
-- 
2.39.3 (Apple Git-146)

^ permalink raw reply related

* [RFC PATCH 2/3] diff: render word-diff-align pairs for RFC review
From: Keita Oda @ 2026-05-27  4:24 UTC (permalink / raw)
  To: git; +Cc: Keita ODA
In-Reply-To: <20260527042402.13607-1-ainsophyao@gmail.com>

From: Keita ODA <ainsophyao@gmail.com>

Teach the RFC prototype to render selected --word-diff-align pairs with
word-diff-like markers.

This renderer is deliberately small and local to the RFC.  It exists to make
the recovered line pairs inspectable in review output.  It is not meant to be
the final UI.  A production version should likely reuse the existing word-diff
machinery once the line-pairing question is settled.

The renderer computes a token LCS for the selected pair and marks the unmatched
spans with the familiar plain word-diff delimiters:

  [-old-]
  {+new+}

Moved selected pairs are also marked with DIFF_SYMBOL_MOVED_LINE so that the
current moved-line coloring can show that the pair came from a moved region.

---
 diff.c | 213 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 200 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index 6b8744920..8629d4670 100644
--- a/diff.c
+++ b/diff.c
@@ -1811,6 +1811,104 @@ static void word_diff_align_add_candidate(struct word_diff_align_candidate **can
 	candidate->line_shared = line_shared;
 }
 
+/*
+ * RFC-only formatter for exposing the selected line pairs.  The final
+ * presentation should reuse the normal word-diff machinery instead of this
+ * small debug renderer.
+ */
+struct word_diff_align_debug_token {
+	int start;
+	int end;
+};
+
+static void word_diff_align_debug_collect_tokens(const struct emitted_diff_symbol *line,
+						 struct word_diff_align_debug_token **tokens,
+						 int *tokens_nr, int *tokens_alloc)
+{
+	int len = word_diff_align_payload_len(line);
+	int pos = 0, start, end;
+
+	while (word_diff_align_next_token(line->line, len, &pos, &start, &end)) {
+		ALLOC_GROW(*tokens, *tokens_nr + 1, *tokens_alloc);
+		(*tokens)[*tokens_nr].start = start;
+		(*tokens)[*tokens_nr].end = end;
+		(*tokens_nr)++;
+	}
+}
+
+static void word_diff_align_debug_add_span(struct strbuf *out,
+					   const char *open,
+					   const char *line, int len,
+					   const char *close)
+{
+	if (!len)
+		return;
+	strbuf_addstr(out, open);
+	strbuf_add(out, line, len);
+	strbuf_addstr(out, close);
+}
+
+static int word_diff_align_debug_token_eq(const struct emitted_diff_symbol *a,
+					  const struct word_diff_align_debug_token *a_tok,
+					  const struct emitted_diff_symbol *b,
+					  const struct word_diff_align_debug_token *b_tok)
+{
+	int a_len = a_tok->end - a_tok->start;
+	int b_len = b_tok->end - b_tok->start;
+
+	return a_len == b_len &&
+		!memcmp(a->line + a_tok->start, b->line + b_tok->start, a_len);
+}
+
+static void word_diff_align_debug_rewrite_line(struct emitted_diff_symbol *line,
+					       struct word_diff_align_debug_token *tokens,
+					       int tokens_nr, int *match_to,
+					       const struct emitted_diff_symbol *other,
+					       struct word_diff_align_debug_token *other_tokens,
+					       const char *open, const char *close)
+{
+	struct strbuf out = STRBUF_INIT;
+	char *old_line = (char *)line->line;
+	int payload_len = word_diff_align_payload_len(line);
+	int other_pos = 0;
+	int other_payload_len = word_diff_align_payload_len(other);
+	int pos = 0, i;
+	size_t new_len;
+
+	for (i = 0; i < tokens_nr; i++) {
+		int other_i = match_to[i];
+		int gap_len, other_gap_len;
+
+		if (other_i < 0)
+			continue;
+		gap_len = tokens[i].start - pos;
+		other_gap_len = other_tokens[other_i].start - other_pos;
+		if (gap_len == other_gap_len &&
+		    !memcmp(line->line + pos, other->line + other_pos, gap_len))
+			strbuf_add(&out, line->line + pos, gap_len);
+		else
+			word_diff_align_debug_add_span(&out, open,
+						       line->line + pos,
+						       gap_len, close);
+		strbuf_add(&out, line->line + tokens[i].start,
+			   tokens[i].end - tokens[i].start);
+		pos = tokens[i].end;
+		other_pos = other_tokens[other_i].end;
+	}
+	if (payload_len - pos == other_payload_len - other_pos &&
+	    !memcmp(line->line + pos, other->line + other_pos,
+		    payload_len - pos))
+		strbuf_add(&out, line->line + pos, payload_len - pos);
+	else
+		word_diff_align_debug_add_span(&out, open, line->line + pos,
+					       payload_len - pos, close);
+	strbuf_add(&out, line->line + payload_len, line->len - payload_len);
+
+	line->line = strbuf_detach(&out, &new_len);
+	line->len = (int)new_len;
+	free(old_line);
+}
+
 static void word_diff_align_debug_append_comment(struct emitted_diff_symbol *line,
 						 const struct strbuf *suffix)
 {
@@ -1835,25 +1933,114 @@ static void word_diff_align_debug_mark_pair(struct emitted_diff_symbol *minus_li
 					    struct emitted_diff_symbol *plus_line,
 					    int minus_lineno, int plus_lineno,
 					    int changed, int moved,
-						    int window_score,
-						    int line_score,
-						    int pair_score)
-{
-	struct strbuf suffix = STRBUF_INIT;
+					    int window_score,
+					    int line_score,
+					    int pair_score)
+{
+	struct word_diff_align_debug_token *minus_tokens = NULL, *plus_tokens = NULL;
+	int minus_tokens_nr = 0, minus_tokens_alloc = 0;
+	int plus_tokens_nr = 0, plus_tokens_alloc = 0;
+	int *minus_match_to = NULL, *plus_match_to = NULL;
+	int *lcs = NULL;
+	struct emitted_diff_symbol minus_original = *minus_line;
+	struct emitted_diff_symbol plus_original = *plus_line;
+	int i, j, columns;
 
 	if (moved) {
 		minus_line->flags |= DIFF_SYMBOL_MOVED_LINE;
 		plus_line->flags |= DIFF_SYMBOL_MOVED_LINE;
 	}
 
-	strbuf_addf(&suffix,
-		    "        # aligned from %d to %d, %s, W=%d L=%d S=%d",
-		    minus_lineno, plus_lineno,
-		    changed ? "edited" : "unchanged",
-		    window_score, line_score, pair_score);
-	word_diff_align_debug_append_comment(minus_line, &suffix);
-	word_diff_align_debug_append_comment(plus_line, &suffix);
-	strbuf_release(&suffix);
+	minus_original.line = xmemdupz(minus_line->line, minus_line->len);
+	plus_original.line = xmemdupz(plus_line->line, plus_line->len);
+	if (!changed)
+		goto comment;
+
+	word_diff_align_debug_collect_tokens(minus_line, &minus_tokens,
+					     &minus_tokens_nr,
+					     &minus_tokens_alloc);
+	word_diff_align_debug_collect_tokens(plus_line, &plus_tokens,
+					     &plus_tokens_nr,
+					     &plus_tokens_alloc);
+	if (!minus_tokens_nr || !plus_tokens_nr)
+		goto comment;
+
+	columns = plus_tokens_nr + 1;
+	CALLOC_ARRAY(lcs, (minus_tokens_nr + 1) * columns);
+	ALLOC_ARRAY(minus_match_to, minus_tokens_nr);
+	ALLOC_ARRAY(plus_match_to, plus_tokens_nr);
+	for (i = 0; i < minus_tokens_nr; i++)
+		minus_match_to[i] = -1;
+	for (j = 0; j < plus_tokens_nr; j++)
+		plus_match_to[j] = -1;
+
+	for (i = 1; i <= minus_tokens_nr; i++) {
+		for (j = 1; j <= plus_tokens_nr; j++) {
+			if (word_diff_align_debug_token_eq(minus_line,
+							   &minus_tokens[i - 1],
+							   plus_line,
+							   &plus_tokens[j - 1]))
+				lcs[i * columns + j] =
+					lcs[(i - 1) * columns + j - 1] + 1;
+			else if (lcs[(i - 1) * columns + j] >
+				 lcs[i * columns + j - 1])
+				lcs[i * columns + j] = lcs[(i - 1) * columns + j];
+			else
+				lcs[i * columns + j] = lcs[i * columns + j - 1];
+		}
+	}
+
+	i = minus_tokens_nr;
+	j = plus_tokens_nr;
+	while (i > 0 && j > 0) {
+		if (lcs[i * columns + j] == lcs[i * columns + j - 1]) {
+			j--;
+		} else if (lcs[i * columns + j] ==
+			   lcs[(i - 1) * columns + j]) {
+			i--;
+		} else if (word_diff_align_debug_token_eq(minus_line,
+							  &minus_tokens[i - 1],
+							  plus_line,
+							  &plus_tokens[j - 1])) {
+			minus_match_to[i - 1] = j - 1;
+			plus_match_to[j - 1] = i - 1;
+			i--;
+			j--;
+		} else {
+			BUG("word-diff-align display LCS backtrack failed");
+		}
+	}
+
+	word_diff_align_debug_rewrite_line(minus_line, minus_tokens,
+					   minus_tokens_nr, minus_match_to,
+					   &plus_original, plus_tokens,
+					   "[-", "-]");
+	word_diff_align_debug_rewrite_line(plus_line, plus_tokens,
+					   plus_tokens_nr, plus_match_to,
+					   &minus_original, minus_tokens,
+					   "{+", "+}");
+
+comment:
+	{
+		struct strbuf suffix = STRBUF_INIT;
+
+		strbuf_addf(&suffix,
+			    "        # aligned from %d to %d, %s, W=%d L=%d S=%d",
+			    minus_lineno, plus_lineno,
+			    changed ? "edited" : "unchanged",
+			    window_score, line_score, pair_score);
+		word_diff_align_debug_append_comment(minus_line, &suffix);
+		word_diff_align_debug_append_comment(plus_line, &suffix);
+		strbuf_release(&suffix);
+	}
+
+	free((char *)minus_original.line);
+	free((char *)plus_original.line);
+	free(minus_tokens);
+	free(plus_tokens);
+	free(minus_match_to);
+	free(plus_match_to);
+	free(lcs);
 }
 
 static void word_diff_align_add_item(struct word_diff_align_item **items,
-- 
2.39.3 (Apple Git-146)

^ permalink raw reply related

* [RFC PATCH 3/3] t4034: cover moved-and-edited word diff alignment
From: Keita Oda @ 2026-05-27  4:24 UTC (permalink / raw)
  To: git; +Cc: Keita ODA
In-Reply-To: <20260527042402.13607-1-ainsophyao@gmail.com>

From: Keita ODA <ainsophyao@gmail.com>

Add a focused --word-diff-align test with a moved permission-style block where
one value changes inside the moved block.

The test is intentionally small.  It checks that the changed line is paired and
that the RFC renderer exposes the old and new value with word-diff-like
markers.

---
 t/t4034-diff-words.sh | 46 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index 0be647c2f..7bf696b17 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -148,6 +148,52 @@ test_expect_success '--word-diff=plain' '
 	word_diff --word-diff=plain --no-color
 '
 
+test_expect_success '--word-diff-align marks moved-and-edited block lines' '
+	cat >old <<-\EOF &&
+	#define LIMIT_PUBLIC_UPLOADS       25
+	#define LIMIT_PUBLIC_INVITES       8
+
+	/* Public resource permissions. */
+	#define PERM_RESOURCE_READ         0x0001
+	#define PERM_RESOURCE_LIST         0x0002
+	#define PERM_RESOURCE_COMMENT      0x0004
+	#define PERM_RESOURCE_EXPORT       0x0008
+	#define PERM_RESOURCE_SHARE        0x0010
+	#define PERM_RESOURCE_ARCHIVE      0x0020
+	#define PERM_RESOURCE_DELETE       0x0040
+	#define PERM_RESOURCE_ADMIN        0x0080
+
+	#define PERM_INTERNAL_DEBUG        0x0100
+	#define PERM_INTERNAL_IMPERSONATE  0x0200
+
+	#define AUDIT_POLICY_STRICT        1
+	#define AUDIT_POLICY_VERBOSE       2
+	EOF
+	cat >new <<-\EOF &&
+	#define LIMIT_PUBLIC_UPLOADS       25
+	#define LIMIT_PUBLIC_INVITES       8
+
+	#define PERM_INTERNAL_DEBUG        0x0100
+	#define PERM_INTERNAL_IMPERSONATE  0x0200
+
+	#define AUDIT_POLICY_STRICT        1
+	#define AUDIT_POLICY_VERBOSE       2
+
+	/* Public resource permissions. */
+	#define PERM_RESOURCE_READ         0x0001
+	#define PERM_RESOURCE_LIST         0x0002
+	#define PERM_RESOURCE_COMMENT      0x0004
+	#define PERM_RESOURCE_EXPORT       0x0001
+	#define PERM_RESOURCE_SHARE        0x0010
+	#define PERM_RESOURCE_ARCHIVE      0x0020
+	#define PERM_RESOURCE_DELETE       0x0040
+	#define PERM_RESOURCE_ADMIN        0x0080
+	EOF
+	test_must_fail git diff --no-index --histogram --word-diff-align old new >actual &&
+	test_grep "^-#define PERM_RESOURCE_EXPORT\\[-.*0x0008-\\]" actual &&
+	test_grep "^+#define PERM_RESOURCE_EXPORT{+.*0x0001+}" actual
+'
+
 test_expect_success '--word-diff=plain --color' '
 	cat >expect <<-EOF &&
 		<BOLD>diff --git a/pre b/post<RESET>
-- 
2.39.3 (Apple Git-146)

^ permalink raw reply related

* Re: limiting git branch --contains
From: Kristofer Karlsson @ 2026-05-27  7:05 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee
In-Reply-To: <20230324191009.GA536967@coredump.intra.peff.net>

Hi!

I'm reviving this old thread because I believe the discussion here
is still relevant and the patch Peff included is good. I think I
accidentally created the exact same patch before I found this thread.

I work on a large repo (millions of commits, ~8000 remote tracking
branches) where "git for-each-ref --contains <commit> refs/remotes/"
was taking 4+ seconds even with a commit-graph, and much worse for
deeper commits. I started investigating and building a more complex
batched solution before I realized that the cached DFS algorithm
(contains_tag_algo) already exists and does exactly what's needed --
it just wasn't enabled for branches.

With generation numbers (commit-graph), the cached algorithm is
strictly at least as good as the uncached one. Both have the same
walk floor -- the generation cutoff in contains_tag_algo is equivalent
to the STALE boundary in paint_down_to_common. The cache then provides
a pure win: O(total unique commits) instead of O(N * commits per ref).

In my benchmarks, the improvement is significant and scales with the
number of refs and the depth of the target commit:

git.git (69K commits, 5826 tags) with commit-graph:

  Scenario              Master    Cached    Speedup
  Deep (v2.30.0)        1.08s     292ms     3.7x
  Recent (HEAD~100)     338ms     240ms     1.4x
  Orphan (unreachable)  271ms     241ms     1.1x

Large monorepo with commit-graph:

  Scenario              Master    Cached    Speedup
  HEAD~10000            511ms     239ms     2.1x
  HEAD~50000            4.14s     272ms     15x
  Orphan (unreachable)  4.13s     252ms     16x

I also tested with varying ref counts on git.git. The cached
algorithm is faster or tied in every scenario I tested -- the
crossover point is around 3-6 refs, below which both complete in
single-digit milliseconds.

Without commit-graph, the difference is even more dramatic (41-54x
on git.git with 5826 tags), though the theoretical argument is less
clean: without generation numbers, contains_tag_algo has no walk
floor and may traverse to root commits for unreachable targets.
In practice the cache still wins for N > 1, but I don't have a
proof that covers all possible DAG shapes. I think we should start
by optimizing it for the case where we have generation numbers,
and maybe keep exploring if there are any meaningful scenarios
without generation numbers where it would actually be a regression.

To reproduce the benchmarks:

  # Setup
  ORPHAN=$(git commit-tree HEAD^{tree} -m "unreachable")
  git commit-graph write --reachable

  # git.git with 5826 tags
  git for-each-ref --contains v2.30.0 refs/tags/

  # Large repo with remote refs
  git for-each-ref --contains HEAD~50000 refs/remotes/

-- Kristofer

^ permalink raw reply

* Re: [PATCH] fetch: pass transport to post-fetch connectivity check
From: Jeff King @ 2026-05-27  8:32 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget; +Cc: git, Kristofer Karlsson
In-Reply-To: <pull.2123.git.1779625693328.gitgitgadget@gmail.com>

On Sun, May 24, 2026 at 12:28:12PM +0000, Kristofer Karlsson via GitGitGadget wrote:

> From: Kristofer Karlsson <krka@spotify.com>
> 
> When fetching with a transport that sets `self_contained_and_connected`
> (as index-pack does for self-contained packs), check_connected() can
> use find_pack_entry_one() to skip connectivity verification for refs
> whose objects exist in the new pack. This avoids sending those OIDs to
> the rev-list child process.
> 
> However, store_updated_refs() never passed the transport to
> check_connected(), so opt.transport was always NULL and this
> optimization was dead code for post-fetch connectivity checks.
> 
> Thread the transport parameter through store_updated_refs() and set
> opt.transport so that check_connected() can take advantage of
> self-contained packs.

That makes sense in principle, but one thing puzzles me. We only turn on
the optimization in check_connected() if the transport's smart_options
has the self_contained_and_connected bit set. And we set that only when
we were told via check_self_contained_and_connected to do so (and we
pass the appropriate option to index-pack, which tells us the result is
OK).

And the only place that turns on check_self_contained_and_connected is
in builtin/clone.c. So how does this optimization work for a non-clone
fetch? Am I missing some code path?

-Peff

^ permalink raw reply

* Re: [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()`
From: Jeff King @ 2026-05-27  8:57 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <13191c19b91bc3f5d671b7016b97f2309f12737d.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:36PM -0400, Taylor Blau wrote:

> In the following commit, callers of `fill_bitmap_tree()` will be
> required to check the bit corresponding to their tree before calling
> that function. That change will reduce the overhead of setting up and
> tearing down stack frames for trees whose bits are already set.
> 
> To prepare for that change, have callers pass in the tree's bit position
> in `fill_bitmap_tree()`, which will make the next commit easier to read.
> 
> In the meantime, this change has a surprising and measurable benefit
> during bitmap generation, particularly on very large repositories.

It is indeed surprising. There's a possible candidate for the speedup
here:

> @@ -482,8 +479,12 @@ static int fill_bitmap_tree(struct bitmap_writer *writer,
>  	while (tree_entry(&desc, &entry)) {
>  		switch (object_type(entry.mode)) {
>  		case OBJ_TREE:
> +			pos = find_object_pos(writer, &entry.oid, &found);
> +			if (!found)
> +				return -1;
>  			if (fill_bitmap_tree(writer, bitmap,
> -					     lookup_tree(writer->repo, &entry.oid)) < 0)
> +					     lookup_tree(writer->repo,
> +							 &entry.oid), pos) < 0)
>  				return -1;
>  			break;

Whenever "found" is false, we cut out early and skip the hash lookup in
lookup_tree() entirely. But that should almost never happen! It implies
that a reachable object is not in the pack/midx, and thus the bitmaps is
not closed (and we'll refuse to generate it).

So it really is the case that we do the same operations in a different
order. Weird.

But the patch itself looks correct to me, and I get ~6% speedup on a
from-scratch bitmap generation of linux.git. I guess it could vary
between architectures and compilers (I'm using gcc on x86), but since
the reorg is setting us up for further optimizations in the next patch,
I suppose there's no need to look a gift horse in the mouth.

-Peff

^ permalink raw reply

* Re: [PATCH 2/8] pack-bitmap: check subtree bits before recursing
From: Jeff King @ 2026-05-27  9:03 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <7d6d1cec0dd2706ba176c7fa070da46c98155018.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:39PM -0400, Taylor Blau wrote:

> In the previous commit, we adjusted the callers of `fill_bitmap_tree()`
> to pass in the bit position of the tree they wish to fill.
> 
> This commit makes use of that information at the call site to avoid
> setting up a stack frame for fill_bitmap_tree() entirely whenever a
> tree's bit position is already set.

OK, this one at least has a plausible explanation. ;)

I can reproduce your speedup on linux.git (~5% again). I don't love that
we have to duplicate the logic in each of the callers, but there are
only two sites (and unlikely to ever be more). And it is only one line,
the comment notwithstanding. That seems like a good tradeoff for a
multiple-second speedup.

The patch itself looks obviously correct.

-Peff

^ permalink raw reply

* Re: [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps
From: Jeff King @ 2026-05-27  9:24 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <6e1f6bef5f641481a6a875bc215b35fc56cef80c.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:41PM -0400, Taylor Blau wrote:

> Building bitmaps from scratch on the same test repository from the
> previous commits yields a significant speed-up:
> 
>     +------------------+-------------+-------------+---------------------+
>     |                  | HEAD^       | HEAD        | Delta               |
>     +------------------+-------------+-------------+---------------------+
>     | elapsed          |   562.8 s   |   324.8 s   |   -237.9 s (-42.3%) |
>     | cycles           | 2,621.3 B   | 1,508.6 B   | -1,112.7 B (-42.4%) |
>     | instructions     | 2,348.9 B   | 1,436.6 B   |   -912.3 B (-38.8%) |
>     | CPI              |     1.116   |     1.050   |   -0.066    (-5.9%) |
>     +------------------+-------------+-------------+---------------------+

Oh my, that's a rather nice speedup. I can reproduce here on linux.git
(~47% improvement).

> When `fill_bitmap_commit()` reaches an ancestor that was selected for
> its own bitmap and processed earlier, its object closure is already
> stored in `writer->bitmaps` as an EWAH bitmap. As a result, walking
> through that commit's tree and parents again is redundant.
> 
> Teach `fill_bitmap_commit()` to notice that case. For non-root commits in
> the walk, look for a stored selected bitmap and OR it into the bitmap
> being built. If one exists, skip the commit, its tree, and its parents.

I feel like this _shouldn't_ be necessary, because the idea of the
current writing code is to go from the roots up, following inverted
parent pointers, and passing the bitmap up as we go. So whenever we
visit a commit we should in theory have all of the ancestor's bits set
in that bitmap. But I remember that the simple-and-stupid approach ended
up being too memory hungry, so we pick some focal points in the graph
and then fill them independently.

And I guess that's what you're showing here:

> In our testing repository, there are 1,261 commits selected for bitmap
> coverage, and 1,382 maximal commits induced as a result of that. Of the
> 1,382 calls made to `fill_bitmap_commit()` (one per maximal commit), 131
> of them can be short-circuited at some point during their traversal as a
> consequence of this change.

We'll end up seeing some of the same parts of history for various
maximal commits, and this lets us sometimes reuse the earlier efforts.

Anyway, it is hard to argue with the numbers.

> @@ -553,6 +559,28 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
>  			bitmap_free(remapped);
>  		}
>  
> +		/*
> +		 * If we encounter an ancestor for which we have already
> +		 * computed a bitmap during this build (i.e. a regular
> +		 * selected commit processed earlier in topo order), we can
> +		 * short-circuit the walk: its stored bitmap already covers
> +		 * the commit itself, its tree, and all of its ancestors.
> +		 */
> +		if (c != commit) {
> +			khiter_t hash_pos = kh_get_oid_map(writer->bitmaps,
> +							   c->object.oid);
> +			if (hash_pos != kh_end(writer->bitmaps)) {
> +				struct bitmapped_commit *stored =
> +					kh_value(writer->bitmaps, hash_pos);
> +				if (stored && stored->bitmap) {
> +					fill_bitmap_commit_found_ancestor_nr++;
> +					bitmap_or_ewah(ent->bitmap,
> +						       stored->bitmap);
> +					continue;
> +				}
> +			}
> +		}

OK, so we incur one hash lookup per commit as we walk, which seems like
a good tradeoff.

I wondered about "c != commit" here. "c" is the commit we're traversing,
and "commit" is the one for which we're trying to build the bitmap. So
we would not expect to ever have an entry in writer->bitmaps for "c"
yet, but the conditional is just short-circuiting the hash lookup.

The rest of the patch looks obviously correct. The trace2 bits aren't
strictly necessary, of course, but some metrics might help with further
tuning.

-Peff

^ permalink raw reply

* Re: [PATCH 4/8] pack-bitmap: consolidate `find_object_pos()` success path
From: Jeff King @ 2026-05-27  9:27 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <c9a560660949c53575a9b1e81160d25212a1f484.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:44PM -0400, Taylor Blau wrote:

> Both sides of `find_object_pos()` report success in the same way by
> setting the optional `found` out-parameter and return the resolved
> bitmap position.
> 
> Prepare for adding more bookkeeping around object-position lookups by
> storing the result in a local `pos` variable and sharing the success
> return path between the packlist and MIDX cases.

OK. Modulo the missing "pos" fixup, this seems like an obviously correct
refactor.  On its own its hard to judge if it makes things better, so
let's read on.

-Peff

^ permalink raw reply

* Re: [PATCH 5/8] pack-bitmap: cache object positions during fill
From: Jeff King @ 2026-05-27  9:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <e43ef6a42d13578a6b7a4a346f491e51a6edfd14.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:47PM -0400, Taylor Blau wrote:

> The previous commits removed some redundant work from bitmap generation
> by avoiding unnecessary tree recursion and by reusing selected bitmaps
> that have already been computed.
> 
> Even with those changes in place, there is still an extremely hot path
> from `fill_bitmap_commit()` and `fill_bitmap_tree()` to translate object
> IDs into their corresponding bit positions in order to generate their
> bitmaps.
> 
> In a small repository, this overhead is not significant. However, in a
> very large repository (e.g., the one that we have been using as a
> benchmark over the past several commits with ~57M total objects), the
> overhead of locating object bit positions (often repeatedly) adds up
> significantly.
> 
> Combat this by adding a small, direct-mapped cache to the bitmap writer
> which maps object IDs to their corresponding bit positions. Size the
> cache according to the number of objects being written, with fixed lower
> and upper bounds so small repositories do not pay for a large table and
> large repositories can avoid most repeated packlist and MIDX lookups.

Introducing another layer of data structure feels so dirty, but it's
hard to argue with the numbers. We are looking up oids in the packlist,
so it's already O(lg n). Your cache here is essentially a hash lookup,
which is O(1)-ish (with collisions causing eviction rather than growth).
And it presumably works because there's a lot of locality in lookups
(between commits X and X^1, their top-level trees will be almost
identical but we have to resolve the bits to find out which entries are
new).

It does make me wonder if we'd see similar improvements if we just
turned the packlist into a regular hash table. Or maybe not, because
then we'd have to do actual probing.

It also makes me wonder if we could use this trick elsewhere, but I
guess we usually are using "struct object" itself to find repeats in
most graph traversals. And there we're using a hash table already. So
this might save us a tiny bit of probing, but not much else.

Likewise when comparing two trees directly, we can just walk them in
parallel to find the changed parts (which doesn't work here, because
we're comparing one tree to the bitmap of all ancestors, not just X^1).

So this really is a somewhat unique situation. It _might_ be applicable
for the reading side of bitmaps, though. When we do fill-in traversal we
end up with this same "read a tree, find the bit for each entry, and 99%
of the time find that it is already in the bitmap".

> On my machine with (a somewhat outdated) GCC 15.2.0, each entry in the
> cache is 40 bytes wide:
> 
>     $ pahole -C bitmap_pos_cache_entry pack-bitmap-write.o
>     struct bitmap_pos_cache_entry {
>             struct object_id           oid;                  /*     0    36 */
>             uint32_t                   pos;                  /*    36     4 */
> 
>             /* size: 40, cachelines: 1, members: 2 */
>             /* last cacheline: 40 bytes */
>     };

I wondered about storing a pointer to an oid here, which would be
smaller but require an extra level of pointer chasing. The ones from
object structs are stable, but I guess the ones from trees are not (they
point to an entry field which will be reused). So we have to store the
oid whole.

> In our example repository from above and in earlier commits, this
> results in a ~9.4% reduction in runtime relative to the previous commit:
> 
>     +------------------+-------------+-------------+---------------------+
>     |                  | HEAD^       | HEAD        | Delta               |
>     +------------------+-------------+-------------+---------------------+
>     | elapsed          |   324.8 s   |   294.1 s   |    -30.7 s  (-9.4%) |
>     | cycles           | 1,508.6 B   | 1,365.5 B   |   -143.0 B  (-9.5%) |
>     | instructions     | 1,436.6 B   | 1,389.8 B   |    -46.9 B  (-3.3%) |
>     | CPI              |     1.050   |     0.983   |   -0.068    (-6.4%) |
>     +------------------+-------------+-------------+---------------------+

I show a 26% speed up on linux.git (1m37 down to 1m12). Very cool.

> +static uint32_t store_cached_object_pos(struct bitmap_writer *writer,
> +					const struct object_id *oid,
> +					uint32_t pos)
> +{
> +	size_t slot;
> +
> +	if (pos & BITMAP_POS_CACHE_VALID)
> +		return pos; /* too large to cache */

Cute, I wondered what would happen if we went past 2^31. I suspect there
are other parts of the code that do not behave that well around that
size, but it is good that we are not introducing any new surprises.

The whole patch looked pretty cleanly done.

-Peff

^ permalink raw reply

* Re: [PATCH 6/8] pack-bitmap: sort bitmaps before XORing
From: Jeff King @ 2026-05-27 10:04 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <b0a4f31353a7053ab37b6d8c8f22c69bcfadfe50.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:50PM -0400, Taylor Blau wrote:

> Reachability bitmaps may be stored as XORs against nearby bitmaps, up to
> 10 away. However, when callers provide selected commits in an arbitrary
> order, the writer may miss good ancestor/descendant pairs and produce
> much larger bitmap files without changing query coverage.
> 
> Sort the selected bitmaps in date order (from oldest to newest) before
> computing XOR offsets, leaving pseudo-merge bitmaps alone (which we will
> deal with separately in following commits).

That order certainly makes the most sense. I'd have thought we ended up
there incidentally because of the order in which we consider the
commits, but perhaps not. I wonder if this got much worse when we
re-wrote the bitmap generation code a few years ago.

That was in v2.31.0, I think. Repacking linux.git with bitmaps, though,
I couldn't find any difference in size between v2.30 and v2.31. They're
both ~67M. But that also didn't shrink with this patch, either.

If you have some spare CPU cycles to burn, I would be interested in a
comparison of the bitmap size of your test repo using v2.30.0, v2.31.1,
and this patch.

> On our same testing repository from previous commits, this change shrunk
> our selection of 1,261 bitmaps from ~635.46 MiB to 176.4 MiB for a
> ~72.24% reduction in the on-disk size of our *.bitmap file. The time to
> generate the smaller bitmap file decreased by ~3.69 seconds, though this
> is likely mostly noise.

Certainly good numbers. The obvious follow-up question is: how does the
reading side fare? I'd expect it to be a little better, if only because
there are fewer bytes to consider when XOR-ing. But if there's some
hidden assumption we're missing, then it could get wildly worse. It
would be good to confirm that that didn't happen. ;)

>  static void compute_xor_offsets(struct bitmap_writer *writer)
>  {
>  	static const int MAX_XOR_OFFSET_SEARCH = 10;
>  
>  	int i, next = 0;
> +	int nr = bitmap_writer_nr_selected_commits(writer);
> +
> +	if (nr > 1) {
> +		QSORT(writer->selected, nr, bitmapped_commit_date_cmp);
> +
> +		for (i = 0; i < nr; i++) {
> +			struct bitmapped_commit *stored = &writer->selected[i];
> +			khiter_t hash_pos = kh_get_oid_map(writer->bitmaps,
> +							   stored->commit->object.oid);
> +
> +			if (hash_pos == kh_end(writer->bitmaps))
> +				BUG("selected commit missing from bitmap map: %s",
> +				    oid_to_hex(&stored->commit->object.oid));
> +
> +			kh_value(writer->bitmaps, hash_pos) = stored;
> +		}
> +	}

OK. It took me a minute to wrap my head around this. The real work is
done by QSORT(). But because we maintain a hash pointing into that
array, we have to go through each hash entry and fix up its pointer.

Looks correct.

-Peff

^ permalink raw reply

* Re: [PATCH] fetch: pass transport to post-fetch connectivity check
From: Kristofer Karlsson @ 2026-05-27 10:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Kristofer Karlsson via GitGitGadget, git
In-Reply-To: <20260527083216.GA981444@coredump.intra.peff.net>

You're right. I dug into this further and realized the problem is deeper
than just the flag not being set in builtin/fetch.c.

Even if we add:
transport->smart_options->check_self_contained_and_connected = 1;
to prepare_transport(), the optimization still won't work for fetches.

The optimization is fundamentally clone-only.

I was unable to reproduce the benchmark numbers from my original commit
message. The patch as submitted is indeed inert for non-clone fetches.
It looked like a simple improvement, but it's clear that it was incorrect.
I'll drop it, and I apologize for the noise here.

-- Kristofer

On Wed, 27 May 2026 at 10:32, Jeff King <peff@peff.net> wrote:
>
> On Sun, May 24, 2026 at 12:28:12PM +0000, Kristofer Karlsson via GitGitGadget wrote:
>
> > From: Kristofer Karlsson <krka@spotify.com>
> >
> > When fetching with a transport that sets `self_contained_and_connected`
> > (as index-pack does for self-contained packs), check_connected() can
> > use find_pack_entry_one() to skip connectivity verification for refs
> > whose objects exist in the new pack. This avoids sending those OIDs to
> > the rev-list child process.
> >
> > However, store_updated_refs() never passed the transport to
> > check_connected(), so opt.transport was always NULL and this
> > optimization was dead code for post-fetch connectivity checks.
> >
> > Thread the transport parameter through store_updated_refs() and set
> > opt.transport so that check_connected() can take advantage of
> > self-contained packs.
>
> That makes sense in principle, but one thing puzzles me. We only turn on
> the optimization in check_connected() if the transport's smart_options
> has the self_contained_and_connected bit set. And we set that only when
> we were told via check_self_contained_and_connected to do so (and we
> pass the appropriate option to index-pack, which tells us the result is
> OK).
>
> And the only place that turns on check_self_contained_and_connected is
> in builtin/clone.c. So how does this optimization work for a non-clone
> fetch? Am I missing some code path?
>
> -Peff

^ permalink raw reply

* Re: [PATCH 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps
From: Jeff King @ 2026-05-27 10:25 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Elijah Newren, Derrick Stolee
In-Reply-To: <30ce254312cfee2a2a82f08246c3a2546ae32578.1779207127.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 12:12:55PM -0400, Taylor Blau wrote:

> Each selected commit starts with one commit_mask bit in its "commit
> mask" bitmap. Then, we walk the first-parent history in topological
> order and OR each commit's mask into its (first) parent. Whenever that
> OR results in the parent having more bits set, the child is deemed to be
> non-maximal, and the frontier is pushed further back along the first
> parent history.
> 
> That approach works extremely well for ordinary selected commits, whose
> first-parent histories often describe real sharing between the bitmaps
> we are going to write.
> 
> It struggles, however, to efficiently generate pseudo-merge bitmaps.
> Unlike ordinary commits for which the above algorithm is designed,
> pseudo-merges don't represent any "real" commit in history, just a
> grouping of non-bitmapped reference tips. In that sense, their first
> parent is just a part of a larger set, and treating them like ordinary
> selected commits imposes a significant slow-down when generating bitmaps
> with pseudo-merges enabled.

This is a great explanation of the problem, and especially this:

> In other words, we pay a nearly ~5 minute penalty to generate
> pseudo-merge bitmaps, but only save ~50 seconds during traversal.

makes it clear that we're doing something sub-optimal. And it points us
in the right direction, since that traversal should be able to generate
the pseudo-merge bitmap we need in the first place! So that should be
our goal to work towards.

> Instead, build the regular selected commit bitmaps first, considering
> only non-pseudo-merge commits in `bitmap_builder_init()`. Once those
> bitmaps have been stored, build each pseudo-merge bitmap separately and
> attach its parent and object bitmaps to the corresponding pseudo-merge
> entry before writing the extension.

And then this solution follows naturally from the earlier explanations.
Good.

In some ways this goes back to the pre-v2.31 way of generating bitmaps,
which is to just traverse for each bitmap independently. But as you
note, the whole idea of pseudo-merge bitmaps is that they aren't
overlapping in any meaningful way. So doing one fill-in traversal per
pseudo-merge makes sense, and hopefully we hit enough real bitmaps that
it's not too costly.

> As a result, the overhead cost for generating pseudo-merges in the above
> configuration is much smaller:
> 
>     +------------------+-----------------+---------------+-------------------+
>     |                  | no pseudo-merge | pseudo-merges | Delta             |
>     |                  |                 | (HEAD)        |                   |
>     +------------------+-----------------+---------------+-------------------+
>     | elapsed          |   294.1 s       |   328.4 s     |  +34.3 s (+11.7%) |
>     | cycles           | 1,365.5 B       | 1,529.3 B     | +163.7 B (+12.0%) |
>     | instructions     | 1,389.8 B       | 1,552.8 B     | +163.0 B (+11.7%) |
>     | CPI              |     0.983       |     0.985     |  +0.002   (+0.2%) |
>     +------------------+-----------------+---------------+-------------------+

Nice. The time savings are going to depend on how many pseudo-merges we
generate, I think. And I'd guess that the numbers above come from making
one big pseudo-merge bitmap, per the config you showed earlier. But you
probably only want a handful of them in any repo, so hopefully it
doesn't scale _too_ badly.

> Recall that at the start of this series, generating reachability bitmaps
> took 612.5 seconds *without* pseudo-merges. With this commit, it is
> still ~46.38% *faster* to generate reachability bitmaps *with*
> pseudo-merges than it was to generate bitmaps wihtout them at the
> beginning of this series.

Sure, though 612.5 seconds is all in the distant past. We only care
about 294.1 seconds now. ;)

More seriously, I do think the interesting question here is how the time
scales for various pseudo-merge configurations. I don't know if we have
any real operational experience with them yet. The original idea is that
you might slice up the ref space into a few chunks. I'd guess that the
old code performed badly-ish overall, but the time did not grow all that
much as you increased the number of chunks. But with the new code, I
suspect that the cost grows more linearly with number of chunks. That's
just a guess, though.

The other thing we hope for with pseudo-merges is that the chunks are
selected such that most of the chunks don't change (because they are
composed of old, stable refs). So in subsequent bitmap generations, we
can either reuse them either verbatim or as a starting point (if there
were only additions). But all of that is going to be heuristic and
depend on your config, the changes the repo sees over time, and so on.

So I don't know if we'd really have good numbers on that.

> Now that we have decoupled how we generate pseudo-merges from their
> representation, the following commits will improve the API around
> specifying pseudo-merge groupings during bitmap generation.

I think we're at patch 8/8 here. I guess you have more to come
eventually, but for now this part is just misleading. ;)

>  pack-bitmap-write.c | 210 ++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 174 insertions(+), 36 deletions(-)

The patch looks reasonable, though I'm not all that familiar with the
ins and outs of the pseudo-merge code. I'd trust the tests here more
than my review.

> @@ -696,12 +700,32 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
>  		 * walk ensures we cover all parents.
>  		 */
>  		if (!(c->object.flags & BITMAP_PSEUDO_MERGE)) {
> +			struct tree *tree;
> +
> +			if (from_pseudo_merge && !c->object.parsed) {
> +				/*
> +				 * Commits reachable from selected
> +				 * non-pseudo-merges are already parsed
> +				 * by the regular bitmap build.
> +				 *
> +				 * However, pseudo-merge fills can also
> +				 * reach commits that were not covered
> +				 * there, so parse any such leftovers
> +				 * before reading their tree or parents.
> +				 */
> +				if (repo_parse_commit(writer->repo, c))
> +					return -1;
> +			}

Makes sense. This should be a quick noop for the regular bitmap build,
since we'll have the parsed flag set. And it should even allow use of
the commit-graph if it's available.

-Peff

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox