Git development
 help / color / mirror / Atom feed
* Re: [PATCH v4] ref-filter: restore prefix-scoped iteration
From: Junio C Hamano @ 2026-06-18 15:54 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Karthik Nayak, Patrick Steinhardt, Victoria Dye, ZheNing Hu
In-Reply-To: <20260612-fix-git-branch-regression-v4-1-f150038c02f4@gmail.com>

Tamir Duberstein <tamird@gmail.com> writes:

> Changes in v4:
> - Explain the historical references in the commit message.
> - Run the new performance cases with both ref backends.
> - Drop the Assisted-by trailer.
> - Link to v3: https://patch.msgid.link/20260610-fix-git-branch-regression-v3-1-6fd48fad7a53@gmail.com

This seems to fully address comments by Patrick in
https://lore.kernel.org/git/aivx-7VOKE_TC50R@pks.im/

Let me mark the topic for 'next'.  Thanks all who discussed this patch.

^ permalink raw reply

* Re: [PATCH v5 2/2] graph: indent visual root in graph
From: Jeff King @ 2026-06-18 16:05 UTC (permalink / raw)
  To: Pablo Sabater
  Cc: Kristofer Karlsson, git, ayu.chandekar, chandrapratap3519,
	christian.couder, gitster, jltobler, karthik.188, phillip.wood,
	siddharthasthana31
In-Reply-To: <CAN5EUNSQY2oK7BE4J9Y8APfkP6eJxta050OUu=RoJYhXOjX_OA@mail.gmail.com>

On Thu, Jun 18, 2026 at 02:42:16PM +0200, Pablo Sabater wrote:

> > > +     for (cl = graph->revs->commits; cl; cl = cl->next) {
> > > +             if (get_commit_action(graph->revs, cl->item) != commit_show)
> > > +                     continue;
> [...]
> > I'm not sure what the solution is. This function wants to peek ahead in
> > queue order, possibly through multiple entries. But a heap-based queue
> > inherently only supports peeking at the first entry.
> 
> Yeah, I haven't read dd4bc01c0a yet but from what you say it prob
> won't work anymore, I didn't know about that series, about the
> lookahead I think it could still work with some tweaks, the important
> part is to set the three lookahead flags.

Thanks for looking into it. I meant to also cc the Kristofer, the author
of dd4bc01c0a, for any thoughts (adding him now).

> From what I understood, we can only get the direct next commit, but no
> more reliably ordered.

Right. There are other queue implementations that could allow full
in-order traversal (e.g., a binary tree), but our prio_queue does not. I
suspect performance for other cases would suffer if we switched the
underlying data structure.

> The flags should be fine:
> 
> - 'is_next_visible' could need to traverse multiple entries, but it
> doesn't need them to be in order. We just need to know if something
> will be rendered after.

Yeah, this one seems easy. We are just setting a bit based on whether
we'd find any commit to show. So order doesn't matter.

> - 'next_has_column' only needs the first entry.

But this was the one I was worried about. Walking the linked list in
order will find us the next commit we're going to show, and the result
of the flag depends on graph_find_new_column_by_commit(). Is it OK to
find _any_ such commit?

(I'm looking at this purely based on reading the existing code, and
haven't really thought hard about the problem space).

> - 'is_next_visual_root' only needs the first entry to know if it could
> be a visual root, and also if it is not the last one (but we don't
> need them to be ordered for this last part).

This one just iterates looking for any other commit we'll show after the
next one. So finding any two entries would be equivalent to the current
code (though we only get to this loop if the first one passes the test
for graph_is_visual_root_candidate).

So if you say order doesn't matter for checking the column and the
visual-root-candidate function, I'm happy to believe you. It makes life
much easier. :)

> Should I work with 'next' as a base to have dd4bc01c0a? (Sorry I've
> just worked with master).

As Junio noted, that's already in master, so I think you're OK to just
base there.

But for future reference, no, you probably don't want to build off of
'next'. If your commit has a dependency on another topic it is best to
build directly off of that topic (and note it in the cover letter of the
series). That way you do not accidentally depend on other things in
'next' which might not ever make it to 'master' (and would thus hold
your topic hostage).

-Peff

^ permalink raw reply

* Re: [PATCH v5 2/2] graph: indent visual root in graph
From: Jeff King @ 2026-06-18 16:07 UTC (permalink / raw)
  To: Pablo Sabater
  Cc: Kristofer Karlsson, git, ayu.chandekar, chandrapratap3519,
	christian.couder, gitster, jltobler, karthik.188, phillip.wood,
	siddharthasthana31
In-Reply-To: <20260618160504.GA818042@coredump.intra.peff.net>

On Thu, Jun 18, 2026 at 12:05:05PM -0400, Jeff King wrote:

> > From what I understood, we can only get the direct next commit, but no
> > more reliably ordered.
> 
> Right. There are other queue implementations that could allow full
> in-order traversal (e.g., a binary tree), but our prio_queue does not. I
> suspect performance for other cases would suffer if we switched the
> underlying data structure.

BTW, there's one extra trick here in the iteration: you might see
commits in _both_ revs->commits and revs->commit_queue. So you'll have
to iterate over both of them (and I guess push the loop body into a
function to avoid duplication).

We may eventually settle on having just one queue sturcture, but I think
dd4bc01c0a used that to avoid disrupting existing callers.

-Peff

^ permalink raw reply

* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Junio C Hamano @ 2026-06-18 16:08 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Harald Nordgren, Harald Nordgren via GitGitGadget, git,
	Kristoffer Haugsbakk, Johannes Sixt
In-Reply-To: <37f2a483-c8bf-4c24-84de-c6233cc20b25@gmail.com>

Phillip Wood <phillip.wood123@gmail.com> writes:

> One thing I've just thought of related to this patch is whether we want 
> to protect branches that are the upstreams of branches that are not 
> slated for deletion. With stacked branches it is possible that a branch 
> has been merged but has other branches stacked on top of it that have 
> not been merged.

An interesting point.  We do have "this topic is built on the result
of merging these other topics into main" and I expect the practice
is wide spread.  These base topics may graduate first, but other
topics may still be updated.

But when you rewrite these other topics, wouldn't you leave their
bases untouched?  IOW, a new iteration (i.e. "rebase -i") would
reuse the base that was used in an earlier iteration, i.e. the
result of an earlier merge of the other topics, some of which might
have been pruned since then, into an older 'main', so it is OK to
lose these other topics once they have graduated, simply because you
wouldn't be recreating the merge that you used as the base of this
remaining topic, no?

Or am I missing something?

Thanks.

^ permalink raw reply

* Re: t5563-simple-http-auth failures with v2.55.0-rc0
From: Junio C Hamano @ 2026-06-18 16:18 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Matthew John Cheetham, git@vger.kernel.org
In-Reply-To: <20260618144953.l6Ng-dvv@teonanacatl.net>

Todd Zullinger <tmz@pobox.com> writes:

> I saw Fedora picked up curl-8.21.0-rc3 this morning and
> confirmed it resolves the git test failures.  Someone else
> has already commented on the upstream curl issue to note
> that.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] SubmittingPatches: address design critiques
From: Junio C Hamano @ 2026-06-18 16:26 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git
In-Reply-To: <95cd81dc-baea-4318-9f01-6a795f8eb5bb@app.fastmail.com>

"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:

> You can imagine someone from group number 1 who is *not* in group number
> 3 use a weekend to implement something. But then when it is submitted it
> turns out that is a very “centralized CVS” idea which doesn’t fit into
> git(1) at all. That’s easily spotted by group number 3 by just looking
> at the proposed docs or design. Now that group number 1 individual might
> just have a bunch of code that is dead weight for any proper Git
> workflow.

That depends on how obviously wrong the idea is.  If your proposal
is to write another CVS into Git, that may be too obvious it may not
fly, but the thing is, "proposals" that get the canned response you
quoted are often vague enough that crucial details that divide
"iffy" and "obviously wrong" are missing.

One way to make these proposals sufficiently clear to allow
reviewers to tell the difference is with a code that builds.  There
may be other ways, but that is one obvious way to start a meaningful
discussion.

^ permalink raw reply

* Re: [PATCH v2 7/8] refs: fix recursing `get_main_ref_store()` with "onbranch" config
From: Jeff King @ 2026-06-18 16:40 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak
In-Reply-To: <20260615-b4-pks-refs-avoid-chdir-notify-reparent-v2-7-f4854aa99859@pks.im>

On Mon, Jun 15, 2026 at 03:56:53PM +0200, Patrick Steinhardt wrote:

> When we have an "onbranch" condition we need to ask the reference
> database whether HEAD currently points at the configured branch. This
> unfortunately creates a chicken-and-egg problem:
> 
>   - The reference database needs to read the configuration so that it
>     can configure itself.
> 
>   - The configuration needs to construct a reference database to fully
>     parse all of its conditionals.
> 
> The way we handle this is by simply excluding "onbranch" conditionals
> when we haven't yet configured the reference database.

My gut feeling upon reading this is that some part of the config reading
is being done wrong to create this chicken-and-egg situation.

I'd expect the ref database config (like the ref format) to be read not
through the regular config subsystem, but via read_repository_format()
and friends. And while that does build on the regular config code, it
should never enable includes at all. So includeIf.onbranch:foo.path is
just another uninteresting config key to it.

In other words, there should be two passes over the config file: one to
load basic repository information (and not respect includes), and one to
actually load what we think of as user-visible config[1].

And it seems to work. If I do this:

diff --git a/config.c b/config.c
index 45144f73c5..343af2cf9a 100644
--- a/config.c
+++ b/config.c
@@ -303,7 +303,7 @@ static int include_by_branch(struct config_include_data *data,
 	const char *refname, *shortname;
 
 	if (!data->repo || data->repo->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
-		return 0;
+		BUG("chicken and egg");
 
 	refname = refs_resolve_ref_unsafe(get_main_ref_store(data->repo),
 					  "HEAD", 0, NULL, &flags);

and then:

  git config includeIf.onbranch:main.path alt-config
  git config -f .git/alt-config foo.bar baz
  git config foo.bar

then we correctly read the value without triggering this code path.

Looking back at the last commit that touched include_by_branch(), the
problem does not appear to be about a chicken-and-egg at all, though. It
is about reading config with includes when there is _no_ repository at
all. I.e., this:

  git config -f main-config includeIf.onbranch:main.path alt-config
  git config -f  alt-config foo.bar baz
  GIT_DIR=/does/not/exist git.compile config --include -f main-config foo.bar

will trigger that BUG() marker, and quietly returning "no match" (like
the current code does) is the right thing.

Looking below...

> The consequence is that we have recursion:
> 
>   1. We call `get_main_ref_store()`.
> 
>   2. We don't yet have a reference store, so we call `ref_store_init()`.
> 
>   3. We parse the configuration required for the reference store.
> 
>   4. We eventually end up in `include_by_branch()`.
> 
>   5. We have already configured the reference storage format, so we end
>      up calling `get_main_ref_store()` again.

Ah, the culprit seems to be ref_store_init() calling into the regular
config parser via repo_settings_get_log_all_ref_updates(). But that
feels weird to me. Either:

  1. It is application config that should not be something we need to
     load in order to initialize the backend. We could lazy-load it
     later, or rely on higher level code to set the option.

  2. It is crucial to the ref backend functioning, in which case we
     ought to be reading it alongside core.repositoryFormatVersion, etc.

-Peff

^ permalink raw reply related

* Re: [PATCH] zlib: properly clamp to uLong
From: Junio C Hamano @ 2026-06-18 17:03 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin
In-Reply-To: <pull.2153.git.1781790619424.gitgitgadget@gmail.com>

The original in js/objects-larger-than-4gb-on-windows says things
like:

+	s->z.total_in = (uLong)(s->total_in & ULONG_MAX_VALUE);
+	s->z.total_out = (uLong)(s->total_out & ULONG_MAX_VALUE);

Your patch ...

> +static inline uLong zlib_uLong_cap(size_t s)
> +{
> +	return s < ULONG_MAX_VALUE ? (uLong)s : ULONG_MAX_VALUE;
> +}
> +
>  static void zlib_pre_call(git_zstream *s)
>  {
>  	s->z.next_in = s->next_in;
>  	s->z.next_out = s->next_out;
> -	s->z.total_in = (uLong)(s->total_in & ULONG_MAX_VALUE);
> -	s->z.total_out = (uLong)(s->total_out & ULONG_MAX_VALUE);
> +	s->z.total_in = zlib_uLong_cap(s->total_in);
> +	s->z.total_out = zlib_uLong_cap(s->total_out);

... is an obvious fix for that.

> @@ -60,7 +65,7 @@ static void zlib_post_call(git_zstream *s, int status)
>  	 * We track our own totals and verify only the low bits match.
>  	 */
>  	if ((s->z.total_out & ULONG_MAX_VALUE) !=
> -	    ((s->total_out + bytes_produced) & ULONG_MAX_VALUE))
> +	    ((zlib_uLong_cap(s->total_out) + bytes_produced) & ULONG_MAX_VALUE))
>  		BUG("total_out mismatch");

Because we now clamp (not "taking lower bits of") s->total_out to a
value between 0..4GB and store it in s->z.total_out in pre-call, let
zlib do its thing that increments s->z.total_out modulo 4GB, and we
clamp the s->total_out (before incrementing) the same way in
post_call here, both sides of "!=" above even out.  But the comment
before this comparison that claims that "we ... verify only the low
bits match" is a bit off the reality, I suspect.

> @@ -68,7 +73,7 @@ static void zlib_post_call(git_zstream *s, int status)
>  	 */
>  	if (status != Z_NEED_DICT &&
>  	    (s->z.total_in & ULONG_MAX_VALUE) !=
> -	    ((s->total_in + bytes_consumed) & ULONG_MAX_VALUE))
> +	    ((zlib_uLong_cap(s->total_in) + bytes_consumed) & ULONG_MAX_VALUE))
>  		BUG("total_in mismatch");
>  
>  	s->total_out += bytes_produced;
>
> base-commit: 7a094d68a27e321a99c8ab6b700909e503904bd9

^ permalink raw reply

* [PATCH] doc: detail LMAP binary format specification
From: irsalshydiq @ 2026-04-16  5:55 UTC (permalink / raw)
  To: git; +Cc: irsalshydiq

The experimental Rust implementation in 'loose.rs' introduces a new
binary format called 'LMAP' for mapping between different object ID
formats (e.g., SHA-1 and SHA-256).

However, the technical documentation for this format was missing from
the 'Documentation/technical/' directory, making it difficult for
C developers to understand the format's layout and logic.

Add 'Documentation/technical/loose-object-map.adoc' to provide a bit-by-bit
specification of the 'LMAP' format and register it in the build systems.

Signed-off-by: irsalshydiq <ichalprov@gmail.com>
---
 Documentation/Makefile                        |  1 +
 Documentation/technical/loose-object-map.adoc | 72 +++++++++++++++++++
 Documentation/technical/meson.build           |  1 +
 3 files changed, 74 insertions(+)
 create mode 100644 Documentation/technical/loose-object-map.adoc

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2699f0b24a..8ad908d62c 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -126,6 +126,7 @@ TECH_DOCS += technical/directory-rename-detection
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/large-object-promisors
 TECH_DOCS += technical/long-running-process-protocol
+TECH_DOCS += technical/loose-object-map
 TECH_DOCS += technical/multi-pack-index
 TECH_DOCS += technical/packfile-uri
 TECH_DOCS += technical/pack-heuristics
diff --git a/Documentation/technical/loose-object-map.adoc b/Documentation/technical/loose-object-map.adoc
new file mode 100644
index 0000000000..6e8dbd6c6f
--- /dev/null
+++ b/Documentation/technical/loose-object-map.adoc
@@ -0,0 +1,72 @@
+Loose Object Map (LMAP) format
+============================
+
+The loose object map file (LMAP) provides a way to map between different object
+ID formats (e.g., SHA-1 and SHA-256) for loose objects. It is designed for
+efficient lookup and storage of these mappings.
+
+All multi-byte integers are in network byte order (big-endian).
+
+== File Layout
+
+- A header (20 bytes)
+- An Object Format Table (16 bytes per format)
+- A Trailer Offset (8 bytes)
+- Data Sections (variable length, 4-byte aligned)
+- A Trailer (variable length, defined by hash algorithm)
+
+=== Header
+
+- 4-byte signature: `LMAP`
+- 4-byte version number: The current version is 1.
+- 4-byte header size: The total size of the header, including the Object Format Table and Trailer Offset.
+- 4-byte number of items: The number of object IDs mapped in this file.
+- 4-byte number of object formats: The number of different hash algorithms supported in this file (minimum 2).
+
+=== Object Format Table
+
+For each object format (as specified in the header), there is a 16-byte entry:
+
+- 4-byte Format ID: The identifier for the hash algorithm (e.g., `0x73686131` for SHA-1, `0x73323536` for SHA-256).
+- 4-byte Shortened Length: The minimum number of bytes needed to unambiguously identify an object ID in this format within this file.
+- 8-byte Data Offset: The absolute offset from the beginning of the file to the start of the data section for this format.
+
+=== Trailer Offset
+
+- 8-byte Trailer Offset: The absolute offset from the beginning of the file to the start of the Trailer.
+
+=== Data Sections
+
+Each object format has a corresponding data section starting at the offset provided in the Object Format Table. Each data section is aligned to a 4-byte boundary.
+
+==== Format 1 (Storage Format) Data Section
+
+The first format listed is considered the "storage" or "main" format. Its data section contains:
+
+1. **Shortened Index**: `(number of items) * (shortened length)` bytes.
+   This table contains the first `shortened length` bytes of each object ID, sorted lexicographically. This allows for binary search lookup.
+
+2. **Full OID Table**: `(number of items) * (hash length)` bytes.
+   The full object IDs for the storage format, in the same order as the Shortened Index.
+
+3. **Metadata Table**: `(number of items) * 4` bytes.
+   A table of 32-bit integers representing the type of each object:
+   - 0: Reserved (e.g., null OID, empty tree/blob)
+   - 1: Loose Object
+   - 2: Shallow Commit
+   - 3: Submodule Commit
+
+==== Subsequent Format (Compatibility) Data Section
+
+For each subsequent format, the data section contains:
+
+1. **Shortened Index**: Similar to the storage format, but for the compatibility algorithm's OIDs.
+
+2. **Full OID Table**: The full object IDs in the compatibility algorithm.
+
+3. **Mapping Table**: `(number of items) * 4` bytes.
+   A table of 32-bit integers. Each entry at index `i` provides the index in the **storage format's** tables that corresponds to this compatibility object ID.
+
+=== Trailer
+
+- Variable length: The hash of all preceding bytes in the file, calculated using the main hash algorithm.
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index ec07088c57..dc1249f9aa 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -16,6 +16,7 @@ articles = [
   'large-object-promisors.adoc',
   'long-running-process-protocol.adoc',
   'multi-pack-index.adoc',
+  'loose-object-map.adoc',
   'packfile-uri.adoc',
   'pack-heuristics.adoc',
   'parallel-checkout.adoc',
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related

* Re: [PATCH] completion: zsh: support completion after "git -C <path>"
From: D. Ben Knoble @ 2026-06-18 17:43 UTC (permalink / raw)
  To: Lutz Lengemann via GitGitGadget; +Cc: git, Lutz Lengemann, Junio C Hamano
In-Reply-To: <pull.2155.git.1781710256081.gitgitgadget@gmail.com>

[apologies in advance for the strange format below]

On Wed, Jun 17, 2026 at 11:37 AM Lutz Lengemann via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Lutz Lengemann <lutz@lengemann.net>
>
> The zsh completion wrapper (__git_zsh_main) did not handle the global -C
> option, so "git -C <path> <command> <TAB>" offered nothing and could not
> complete a command's arguments.
>
> Three things are needed to make it work, all scoped to -C:
>
>   - Add -C to the _arguments specification, so completion no longer stops
>     at it.
>
>   - Advance __git_cmd_idx past any leading "-C <path>" options. The index
>     is hard-coded to 1, i.e. the command is assumed to be the first
>     argument; with -C present the command sits two words later for each
>     -C, so the bash helpers otherwise look at the wrong word and produce
>     nothing.
>
>   - Collect the -C paths into __git_C_args, as __git_main does. The bash
>     helpers run git to resolve aliases and list refs; without the -C
>     paths they run in the current directory, so completion fails whenever
>     the cwd is not the target repository or the command is an alias.
>
> With these, "git -C <path> <command> <TAB>" completes the command, its
> options and its arguments, including outside the repository, through
> aliases, and with repeated -C options.
>
> Signed-off-by: Lutz Lengemann <lutz@lengemann.net>
> ---
>     completion: zsh: support completion after "git -C "
>
>     This patch is intentionally scoped to -C, but the underlying problem is
>     more general. The zsh wrapper hard-codes __git_cmd_idx=1, i.e. it
>     assumes the command is always the first argument. That assumption breaks
>     argument completion after any global option that precedes the command,
>     not just -C — e.g. --git-dir, --work-tree, --namespace, -c, and
>     -p/--paginate. After those, git <opt> <command> <TAB> currently
>     completes the command name but not its arguments.
>
>     The same approach generalizes cleanly: instead of skipping only leading
>     -C options, walk all leading global options and their arguments to
>     locate the command and its true index (mirroring the option scan in
>     __git_main in git-completion.bash), while collecting -C into
>     __git_C_args and --git-dir into __git_dir as today.
>
>     I kept this revision narrow for reviewability and because git -C is the
>     case where I miss the completion, but I'm happy to extend it to cover
>     the other global options in a follow-up (or fold it into this patch) if
>     that's preferred.

See Junio's review for whether we should expand in this patch or a follow-up.

In reply to Junio:

> [the new handling only knows about -C]
> Doesn't it want to do something similar to what __git_main in
> git-completion.bash does at the beginning, namely, this part?

Yeah, we probably do want to skip over -c, etc. (I see some support for
--bare and --git-dir, but not skipping over it.) Still, this patch makes
things no worse in that regard, and improves the situation for -C
AFAICT.

In reply to Lutz:

> +        local -a __git_C_args
> +        local -i i=2
> +
> +        while [[ ${orig_words[i]} == -C ]]; do
> +            __git_C_args+=(-C ${orig_words[i+1]})
> +            (( __git_cmd_idx += 2 ))
> +            (( i += 2 ))
> +        done

I don't see either of these 2 local variables used anywhere else…

…well, except the Bash completion helpers, I suppose. But we mark these
local, so how do they propagate to the other functions?

Still, I was able to try this out with the somewhat hacky

    zsh # new shell :)
    # absolute path important
    autoload -Uz $PWD/contrib/completion/git-completion.zsh
    compdef git-completion.zsh git

    git -C <tab>

and it does prioritize directories there (though I still get a listing
of files afterwards, so the screen is taken up by that gigantic listing
in git.git, for example).

By the way, I've realized that "git -<tab>" has the same problem (a
giant list of files after the other option completions), and worse has
some _funky_ output!

    git -<tab> # without patch
    (option)
    --bare
    --exec-path
    --git-dir
    --help
    --html-path
    --info-path
    --man-path
    --namespace
    --no-pager
    --no-replace-objects
    --paginate
    --version
    --work-tree

    -p

    # treat the repository as a bare repository
    # path to where your core git programs are installed
    # set the path to the repository
    # prints the synopsis and a list of the most commonly used commands
    # print the path where gits HTML documentation is installed
    # print the path where the Info files are installed
    # print the manpath (see `man(1)`) for the man pages
    # set the git namespace
    # do not pipe git output into a pager
    # do not use replacement refs to replace git objects
    # pipe all output into less
    # prints the git suite version
    # set the path to the working tree
    [ed: the above block repeats twice more before the (file) listing below]
    (file)
    […]

Here's the output of _complete_help (^Xh by default) in both situations,
in case that helps to understand either the extra files listing (1) in
the example further back or the issue with single letter options (2)
just mentioned:

1: tags in context :completion::complete:git::
    option-C-1     (_arguments __git_zsh_main _git git-completion.zsh)
    use-compctl    (_default _git git-completion.zsh)
    globbed-files  (_files _default _git git-completion.zsh)
tags in context :completion::complete:git:option-C-1:
    directories    (_directories _arguments __git_zsh_main _git
git-completion.zsh)
    globbed-files  (_files _directories _arguments __git_zsh_main _git
git-completion.zsh)
    all-files      (_files _directories _arguments __git_zsh_main _git
git-completion.zsh)

2: tags in context :completion::complete:git::
    argument-1 options  (_arguments __git_zsh_main _git)
    use-compctl         (_default _git)
    globbed-files       (_files _default _git)
tags in context :completion::complete:git:argument-1:
    common-commands alias-commands all-commands  (__git_zsh_main _git)
    common-commands                              (__git_zsh_cmd_common
__git_zsh_main _git)
    alias-commands                               (__git_zsh_cmd_alias
__git_zsh_main _git)
    all-commands                                 (__git_zsh_cmd_all
__git_zsh_main _git)
tags in context :completion::complete:git:options:
    options  (_arguments __git_zsh_main _git)

> +        '*-C[run as if git was started in <path>]: :_directories' \

We should probably note in the log message that the _directories
completion will not account for previous -C; that is, after typing

    git -C dir -C <tab>

we will complete directories in ".", not "dir". That's probably a
reasonable limitation for now, but I think we could do _slightly_ better
by using a state "->dir" or something, accumulating the current prefix,
and passing that to _directories as a prefix with -W (see _path_files in
zshcompsys, which _directories delegates to via _files, IIUC).

-- 
D. Ben Knoble

^ permalink raw reply

* Re: [PATCH] checkout: add --fetch to fetch remote before resolving start-point
From: D. Ben Knoble @ 2026-06-18 17:47 UTC (permalink / raw)
  To: Harald Nordgren; +Cc: Harald Nordgren via GitGitGadget, git
In-Reply-To: <CAHwyqnU4xuw9ZDjarWKKua_s1Qywt07GyP1kJO2HM+XQTcE8hA@mail.gmail.com>

Hi Harald,

On Thu, Jun 18, 2026 at 8:36 AM Harald Nordgren
<haraldnordgren@gmail.com> wrote:
>
> Hi Ben!
>
> Trying to shore up some support for this topic. How do you feel about this now?
>
>
> Harald

I haven't followed the series too closely; it sounds like the code is
in decent shape.

I stand by my thought that this could be convenient, but that's not
much one way or another in terms of how it fits in overall with Git's
vision.

There are some conveniences I recommend universally. This one requires
caveats, as pointed out by others, depending on the project, so… from
that point of view I wonder if keeping "fetch + switch" as 2 distinct
steps is better? I truly don't know.

-- 
D. Ben Knoble

^ permalink raw reply

* Re: [PATCH] help: prompt user to run corrected command on typo
From: Justin Tobler @ 2026-06-18 17:48 UTC (permalink / raw)
  To: calicomills; +Cc: git, gitster
In-Reply-To: <6a340006.60da1a74.20db39.8f57@mx.google.com>

On 26/06/18 07:26AM, calicomills wrote:
> From 0dc9e5c4593611b75e7003e8fdbea9370524c05b Mon Sep 17 00:00:00 2001
> From: calicomills <jishnuck26@gmail.com>
> Date: Thu, 18 Jun 2026 19:47:12 +0530
> Subject: [PATCH] help: prompt user to run corrected command on typo
> 
> When a user mistypes a git command and there is exactly one similar
> command, git currently prints a suggestion but exits, requiring the
> user to retype the corrected command manually.
> 
> Instead, when stdin and stderr are both connected to a terminal and
> there is a single best match, prompt the user with:
> 
>   Did you mean 'git checkout neo'? [y/N]
> 
> The full corrected invocation (command + original arguments) is shown
> in the prompt so the user knows exactly what will run. Answering 'y'
> re-executes git with the corrected command and all original arguments.
> Answering anything else exits as before.

Isn't this already possible via setting `help.autoCorrect=prompt` in the
config? For example:

  git -c help.autoCorrect=prompt comit --allow-empty -m init

seems to already do exactly what is proposed here.

-Justin

^ permalink raw reply

* Re: [PATCH v15 0/7] branch: delete-merged
From: Harald Nordgren @ 2026-06-18 17:53 UTC (permalink / raw)
  To: phillip.wood
  Cc: Harald Nordgren via GitGitGadget, git, Kristoffer Haugsbakk,
	Johannes Sixt
In-Reply-To: <feac3d8b-e291-48e8-ac73-3b1f5321799b@gmail.com>

> > I received the same feedback from Junio before, so I'm not unaware of
> > this problem. I am trying to slow down. I often prepare the work as
> > soon as I get some comments -- I'm on paternity leave so I have a lot
> > of time when the baby is sleeping --
>
> Congratulations - I hope the baby is sleeping at night as well in the day!

Thanks! It's our third, so hopefully we got the hang of it now.

He sleeps -- some of the time.

> > then I actively hold off on
> > sending to not overload the rest of you. But at the same time I think
> > it's valuable to keep up a certain pace. It's a balancing act.
> It is worth waiting for the discussion to settle on each round, I'll try
> and be clear when I've finished looking at each revision. I'm sure other
> folks would appreciate you looking at their patches and commenting on
> them while you're waiting for feedback on yours, especially the GSoC
> project students.

That's a good point!


Harald

^ permalink raw reply

* Re: [PATCH v3 15/17] odb/source-packed: stub out remaining functions
From: Junio C Hamano @ 2026-06-18 17:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler
In-Reply-To: <20260617-pks-odb-source-packed-v3-15-b5c7583cd795@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

Just FYI (i.e. nothing wrong in this patch)

> +static int odb_source_packed_write_object(struct odb_source *source UNUSED,
> +					  const void *buf UNUSED,
> +					  unsigned long len UNUSED,

The type of this parameter will become size_t via another topic in
flight; I prepared an evil merge to address it (otherwise winbuild
would barf, as expected).

-- >8 --
Author: Junio C Hamano <gitster@pobox.com>
Date:   Thu Jun 18 10:49:10 2026 -0700

    merge-fix po/hash-object-size-t vs ps/odb-source-packed

diff --git a/odb/source-packed.c b/odb/source-packed.c
index 42c28fba0e..decc81aa52 100644
--- a/odb/source-packed.c
+++ b/odb/source-packed.c
@@ -503,7 +503,7 @@ static int odb_source_packed_freshen_object(struct odb_source *source,
 
 static int odb_source_packed_write_object(struct odb_source *source UNUSED,
 					  const void *buf UNUSED,
-					  unsigned long len UNUSED,
+					  size_t len UNUSED,
 					  enum object_type type UNUSED,
 					  struct object_id *oid UNUSED,
 					  struct object_id *compat_oid UNUSED,

^ permalink raw reply related

* [PATCH 0/7] line-log: range-scope stat, check, and -G under -L
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo

This series extends git log -L so that more of its diff output and commit
selection honor the tracked line ranges: the diff stat formats and --check
are supported with range-scoped results, and the -G pickaxe is scoped to the
tracked range.

It builds on top of the mm/line-log-cleanup topic [1], which integrated -L
with the standard log output pipeline and taught it the non-patch formats
--raw, --name-only, --name-status, and --summary.

With these patches the following also honor the tracked range:

 * --stat, --numstat, --shortstat: counts cover only the lines inside the
   tracked range, not the whole file.

 * --check: whitespace errors are reported only for added lines inside the
   tracked range, with the correct file line numbers.

 * -G: a commit is selected only when the pattern appears on an
   added/removed line inside the tracked range, rather than anywhere in the
   file.

The --dirstat format is deliberately rejected. Its default mode reports each
directory's share of the total churn as a percentage, computed from
whole-file byte damage (via diffcore_count_changes(), outside the line-based
pipeline that -L scopes), so bare --dirstat cannot honor the tracked range.
The --dirstat=lines mode could: it aggregates the same per-file line counts
as --numstat, which -L already scopes. But supporting only that sub-mode
while bare --dirstat still errors is a confusing split, so the whole format
is left to a follow-up; --numstat already gives the exact range-scoped
per-file counts.

-S is left matching the whole file. Unlike -G, it counts needle occurrences
per blob rather than grepping the diff, so scoping it to a range needs a
different approach; that is left to a follow-up. Patch 7, which scopes -G,
also updates the -L documentation to note the -S/-G distinction, so the
whole-file behavior of -S is not mistaken for the range-scoped behavior
around it.

Patches 1-3 are independent of the new formats: they fix two bugs in the
existing -L patch output (a leaked deletion and an off-by-one hunk header),
bring its hunk headers in line with git diff's format, and clarify the
line-range filter mm/line-log-cleanup added, whose names obscured its model
(cryptic lno_ cursors conflating the pre/post-image and 0/1-based axes, a
flat hunk-state struct, and a one-letter state pointer (s)). The two bugs
may be a hint that the model could use clarification, so patch 1 renames and
groups the filter state and patch 2 documents the model, before the fixes
that read against it. Patches 4-7 then build the new formats on top:

 * Patch 1: rename and group the filter for clarity. Spell the cryptic names
   out to the file's own forms: the line-number cursors to
   lno_in_preimage/lno_in_postimage (as in struct emit_callback) and the
   range index to idx_in_postimage, while the hunk geometry stays old/new
   (the xdiff_emit_hunk_fn convention) and moves into a sub-struct. Name the
   filter pointer (filter) and rename the struct to line_range_filter and
   the flush helper to flush_range_hunk. No behavior change.

 * Patch 2: simplify the filter by classifying removals as they arrive,
   dropping the pending_rm buffer and a latent flush_range_hunk() bug that
   leaked deletions just past the range. Make the buffered lines the hunk's
   single source of truth: flush_range_hunk() derives the counts from them
   rather than tracking them per line, dropping three more fields. Document
   the model with a block comment and worked example, and add
   begin_range_hunk() as the counterpart to flush_range_hunk(). (This
   simplification was submitted by itself previously [2] but did not
   advance, so it is re-included here.)

 * Patch 3: stop hand-rolling the synthetic hunk header and emit it through
   xdiff's own formatter via a new xdiff_emit_hunk_header() helper. The
   hand-rolled code put a count-0 side's begin one too high (the convention
   is the line before the change); routing through xdl_emit_hunk_hdr() fixes
   that by construction and, as a side effect, makes -L headers match git
   diff exactly, including its omission of a count of 1. Regenerate the two
   affected fixtures.

 * Patch 4: extract a line_range_filter_diff() helper that folds the
   filter's two preconditions into one place: inflate ctxlen to the largest
   range span so every change within a range lands in a single xdiff hunk,
   and clear XDL_EMIT_NO_HUNK_HDR so the hunk headers the filter reads are
   always emitted (its position tracking relies on both). It then runs an
   initialized filter through xdiff, flushes the final range hunk, and
   releases it; use it in builtin_diff(). The stat, check, and -G patches
   that reuse it inherit both.

 * Patch 5: reuse the filter in builtin_diffstat() for the stat formats,
   extend the -L output-format allowlist, and reject --dirstat.

 * Patch 6: reuse the filter in builtin_checkdiff() and extend the allowlist
   for --check. The separate blank-at-eof pass scans the whole file, so
   scope its report to the tracked ranges too.

 * Patch 7: scope -G to the tracked range. Expose the filter as
   diff_emit_line_ranges() and grep only the tracked range's lines,
   threading the filepair's line_ranges through the pickaxe callback. -S is
   left whole-file, and the -L documentation is updated to note that -G is
   range-scoped while -S still matches the whole file.

[1]
https://lore.kernel.org/git/pull.2094.v3.git.1780001267.gitgitgadget@gmail.com/
[2]
https://lore.kernel.org/git/pull.2099.git.1777230630020.gitgitgadget@gmail.com/

Michael Montalbo (7):
  diff: rename and group the line-range filter for clarity
  diff: simplify the line-range filter by classifying removals
    immediately
  diff: emit -L hunk headers via xdiff's formatter
  diff: extract a line-range diff helper for reuse
  line-log: support diff stat formats with -L
  diff: support --check with -L line ranges
  diffcore-pickaxe: scope -G to the -L tracked range

 Documentation/line-range-options.adoc    |  17 +-
 diff.c                                   | 491 ++++++++++++++---------
 diffcore-pickaxe.c                       |  37 +-
 revision.c                               |   6 +-
 t/t4211-line-log.sh                      | 439 +++++++++++++++++++-
 t/t4211/sha1/expect.no-assertion-error   |   2 +-
 t/t4211/sha1/expect.vanishes-early       |   6 +-
 t/t4211/sha256/expect.no-assertion-error |   2 +-
 t/t4211/sha256/expect.vanishes-early     |   6 +-
 xdiff-interface.c                        |  19 +
 xdiff-interface.h                        |  28 ++
 11 files changed, 826 insertions(+), 227 deletions(-)


base-commit: ea97ad8d017de0c9037451a78008a0fd60abea0c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2152%2Fmmontalbo%2Fmm%2Fline-log-stat-formats-followup-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2152/mmontalbo/mm/line-log-stat-formats-followup-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2152
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/7] diff: rename and group the line-range filter for clarity
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

The line-range filter that mm/line-log-cleanup added uses names that
obscure its model.  The cursors lno_post/lno_pre and the index lno_0
share an lno_ prefix but conflate the pre/post-image axis with the
0-based/1-based axis, the hunk state is a flat set of rhunk_* fields,
and the filter-state pointer is just s.

The filter bridges two layers of diff.c, and its fields already used
each layer's vocabulary, but in cryptic abbreviations.  Spell them out
to the form the rest of the file uses, so that the patches that follow
can simplify and fix it with those clearer names in place:

  - lno_post/lno_pre -> lno_in_postimage/lno_in_preimage, the
    line-number cursors, matching the counters in struct emit_callback
  - lno_0 -> idx_in_postimage, the 0-based range index
  - the hunk-header geometry stays old/new (old_begin, new_begin, and
    counts) to match the xdiff_emit_hunk_fn callback and the
    "@@ -<old> +<new> @@" header it feeds, but moves from flat rhunk_*
    fields into a "hunk" sub-struct, so accesses read
    filter->hunk.old_begin
  - flush_rhunk -> flush_range_hunk
  - the filter-state pointer in each callback: s -> filter

Also rename the struct line_range_callback to line_range_filter: it is
a filter over xdiff output, not merely a callback.

No behavior change.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 diff.c | 192 +++++++++++++++++++++++++++++----------------------------
 1 file changed, 97 insertions(+), 95 deletions(-)

diff --git a/diff.c b/diff.c
index 5a584fa1d5..1e043c959f 100644
--- a/diff.c
+++ b/diff.c
@@ -623,15 +623,15 @@ struct emit_callback {
  * reveals whether they precede an in-range line (flush into range hunk) or
  * an out-of-range line (discard).
  */
-struct line_range_callback {
+struct line_range_filter {
 	xdiff_emit_line_fn orig_line_fn;
 	void *orig_cb_data;
 	const struct range_set *ranges;	/* 0-based [start, end) */
 	unsigned int cur_range;		/* index into the range_set */
 
 	/* Post/pre-image line counters (1-based, set from hunk headers) */
-	long lno_post;
-	long lno_pre;
+	long lno_in_postimage;
+	long lno_in_preimage;
 
 	/*
 	 * Function name from most recent xdiff hunk header;
@@ -640,12 +640,14 @@ struct line_range_callback {
 	char func[80];
 	long funclen;
 
-	/* Range hunk being accumulated for the current range */
-	struct strbuf rhunk;
-	long rhunk_old_begin, rhunk_old_count;
-	long rhunk_new_begin, rhunk_new_count;
-	int rhunk_active;
-	int rhunk_has_changes;		/* any '+' or '-' lines? */
+	/* The range hunk being accumulated for the current range. */
+	struct {
+		struct strbuf lines;	/* buffered in-range diff lines */
+		long old_begin, old_count;
+		long new_begin, new_count;
+		int active;
+		int has_changes;	/* any '+' or '-' line? */
+	} hunk;
 
 	/* Removal lines not yet known to be in-range */
 	struct strbuf pending_rm;
@@ -2540,26 +2542,26 @@ static int quick_consume(void *priv, char *line UNUSED, unsigned long len UNUSED
 	return 1;
 }
 
-static void discard_pending_rm(struct line_range_callback *s)
+static void discard_pending_rm(struct line_range_filter *filter)
 {
-	strbuf_reset(&s->pending_rm);
-	s->pending_rm_count = 0;
+	strbuf_reset(&filter->pending_rm);
+	filter->pending_rm_count = 0;
 }
 
-static void flush_rhunk(struct line_range_callback *s)
+static void flush_range_hunk(struct line_range_filter *filter)
 {
 	struct strbuf hdr = STRBUF_INIT;
 	const char *p, *end;
 
-	if (!s->rhunk_active || s->ret)
+	if (!filter->hunk.active || filter->ret)
 		return;
 
 	/* Drain any pending removal lines into the range hunk */
-	if (s->pending_rm_count) {
-		strbuf_addbuf(&s->rhunk, &s->pending_rm);
-		s->rhunk_old_count += s->pending_rm_count;
-		s->rhunk_has_changes = 1;
-		discard_pending_rm(s);
+	if (filter->pending_rm_count) {
+		strbuf_addbuf(&filter->hunk.lines, &filter->pending_rm);
+		filter->hunk.old_count += filter->pending_rm_count;
+		filter->hunk.has_changes = 1;
+		discard_pending_rm(filter);
 	}
 
 	/*
@@ -2568,22 +2570,22 @@ static void flush_rhunk(struct line_range_callback *s)
 	 * ctxlen causes xdiff to emit context covering a range that
 	 * has no changes in this commit.
 	 */
-	if (!s->rhunk_has_changes) {
-		s->rhunk_active = 0;
-		strbuf_reset(&s->rhunk);
+	if (!filter->hunk.has_changes) {
+		filter->hunk.active = 0;
+		strbuf_reset(&filter->hunk.lines);
 		return;
 	}
 
 	strbuf_addf(&hdr, "@@ -%ld,%ld +%ld,%ld @@",
-		    s->rhunk_old_begin, s->rhunk_old_count,
-		    s->rhunk_new_begin, s->rhunk_new_count);
-	if (s->funclen > 0) {
+		    filter->hunk.old_begin, filter->hunk.old_count,
+		    filter->hunk.new_begin, filter->hunk.new_count);
+	if (filter->funclen > 0) {
 		strbuf_addch(&hdr, ' ');
-		strbuf_add(&hdr, s->func, s->funclen);
+		strbuf_add(&hdr, filter->func, filter->funclen);
 	}
 	strbuf_addch(&hdr, '\n');
 
-	s->ret = s->orig_line_fn(s->orig_cb_data, hdr.buf, hdr.len);
+	filter->ret = filter->orig_line_fn(filter->orig_cb_data, hdr.buf, hdr.len);
 	strbuf_release(&hdr);
 
 	/*
@@ -2591,18 +2593,18 @@ static void flush_rhunk(struct line_range_callback *s)
 	 * The cast discards const because xdiff_emit_line_fn takes
 	 * char *, though fn_out_consume does not modify the buffer.
 	 */
-	p = s->rhunk.buf;
-	end = p + s->rhunk.len;
-	while (!s->ret && p < end) {
+	p = filter->hunk.lines.buf;
+	end = p + filter->hunk.lines.len;
+	while (!filter->ret && p < end) {
 		const char *eol = memchr(p, '\n', end - p);
 		unsigned long line_len = eol ? (unsigned long)(eol - p + 1)
 					     : (unsigned long)(end - p);
-		s->ret = s->orig_line_fn(s->orig_cb_data, (char *)p, line_len);
+		filter->ret = filter->orig_line_fn(filter->orig_cb_data, (char *)p, line_len);
 		p += line_len;
 	}
 
-	s->rhunk_active = 0;
-	strbuf_reset(&s->rhunk);
+	filter->hunk.active = 0;
+	strbuf_reset(&filter->hunk.lines);
 }
 
 static void line_range_hunk_fn(void *data,
@@ -2610,7 +2612,7 @@ static void line_range_hunk_fn(void *data,
 			       long new_begin, long new_nr UNUSED,
 			       const char *func, long funclen)
 {
-	struct line_range_callback *s = data;
+	struct line_range_filter *filter = data;
 
 	/*
 	 * When count > 0, begin is 1-based.  When count == 0, begin is
@@ -2622,104 +2624,104 @@ static void line_range_hunk_fn(void *data,
 	 * flush or discard them when the next content line reveals
 	 * whether the removals precede in-range content.
 	 */
-	s->lno_post = new_begin;
-	s->lno_pre = old_begin;
+	filter->lno_in_postimage = new_begin;
+	filter->lno_in_preimage = old_begin;
 
 	if (funclen > 0) {
-		if (funclen > (long)sizeof(s->func))
-			funclen = sizeof(s->func);
-		memcpy(s->func, func, funclen);
+		if (funclen > (long)sizeof(filter->func))
+			funclen = sizeof(filter->func);
+		memcpy(filter->func, func, funclen);
 	}
-	s->funclen = funclen;
+	filter->funclen = funclen;
 }
 
 static int line_range_line_fn(void *priv, char *line, unsigned long len)
 {
-	struct line_range_callback *s = priv;
+	struct line_range_filter *filter = priv;
 	const struct range *cur;
-	long lno_0, cur_pre;
+	long idx_in_postimage, cur_pre;
 
-	if (s->ret)
-		return s->ret;
+	if (filter->ret)
+		return filter->ret;
 
 	if (line[0] == '-') {
-		if (!s->pending_rm_count)
-			s->pending_rm_pre_begin = s->lno_pre;
-		s->lno_pre++;
-		strbuf_add(&s->pending_rm, line, len);
-		s->pending_rm_count++;
-		return s->ret;
+		if (!filter->pending_rm_count)
+			filter->pending_rm_pre_begin = filter->lno_in_preimage;
+		filter->lno_in_preimage++;
+		strbuf_add(&filter->pending_rm, line, len);
+		filter->pending_rm_count++;
+		return filter->ret;
 	}
 
 	if (line[0] == '\\') {
-		if (s->pending_rm_count)
-			strbuf_add(&s->pending_rm, line, len);
-		else if (s->rhunk_active)
-			strbuf_add(&s->rhunk, line, len);
+		if (filter->pending_rm_count)
+			strbuf_add(&filter->pending_rm, line, len);
+		else if (filter->hunk.active)
+			strbuf_add(&filter->hunk.lines, line, len);
 		/* otherwise outside tracked range; drop silently */
-		return s->ret;
+		return filter->ret;
 	}
 
 	if (line[0] != '+' && line[0] != ' ')
 		BUG("unexpected diff line type '%c'", line[0]);
 
-	lno_0 = s->lno_post - 1;
-	cur_pre = s->lno_pre;	/* save before advancing for context lines */
-	s->lno_post++;
+	idx_in_postimage = filter->lno_in_postimage - 1;
+	cur_pre = filter->lno_in_preimage;	/* save before advancing for context lines */
+	filter->lno_in_postimage++;
 	if (line[0] == ' ')
-		s->lno_pre++;
+		filter->lno_in_preimage++;
 
 	/* Advance past ranges we've passed */
-	while (s->cur_range < s->ranges->nr &&
-	       lno_0 >= s->ranges->ranges[s->cur_range].end) {
-		if (s->rhunk_active)
-			flush_rhunk(s);
-		discard_pending_rm(s);
-		s->cur_range++;
+	while (filter->cur_range < filter->ranges->nr &&
+	       idx_in_postimage >= filter->ranges->ranges[filter->cur_range].end) {
+		if (filter->hunk.active)
+			flush_range_hunk(filter);
+		discard_pending_rm(filter);
+		filter->cur_range++;
 	}
 
 	/* Past all ranges */
-	if (s->cur_range >= s->ranges->nr) {
-		discard_pending_rm(s);
-		return s->ret;
+	if (filter->cur_range >= filter->ranges->nr) {
+		discard_pending_rm(filter);
+		return filter->ret;
 	}
 
-	cur = &s->ranges->ranges[s->cur_range];
+	cur = &filter->ranges->ranges[filter->cur_range];
 
 	/* Before current range */
-	if (lno_0 < cur->start) {
-		discard_pending_rm(s);
-		return s->ret;
+	if (idx_in_postimage < cur->start) {
+		discard_pending_rm(filter);
+		return filter->ret;
 	}
 
 	/* In range so start a new range hunk if needed */
-	if (!s->rhunk_active) {
-		s->rhunk_active = 1;
-		s->rhunk_has_changes = 0;
-		s->rhunk_new_begin = lno_0 + 1;
-		s->rhunk_old_begin = s->pending_rm_count
-			? s->pending_rm_pre_begin : cur_pre;
-		s->rhunk_old_count = 0;
-		s->rhunk_new_count = 0;
-		strbuf_reset(&s->rhunk);
+	if (!filter->hunk.active) {
+		filter->hunk.active = 1;
+		filter->hunk.has_changes = 0;
+		filter->hunk.new_begin = idx_in_postimage + 1;
+		filter->hunk.old_begin = filter->pending_rm_count
+			? filter->pending_rm_pre_begin : cur_pre;
+		filter->hunk.old_count = 0;
+		filter->hunk.new_count = 0;
+		strbuf_reset(&filter->hunk.lines);
 	}
 
 	/* Flush pending removals into range hunk */
-	if (s->pending_rm_count) {
-		strbuf_addbuf(&s->rhunk, &s->pending_rm);
-		s->rhunk_old_count += s->pending_rm_count;
-		s->rhunk_has_changes = 1;
-		discard_pending_rm(s);
+	if (filter->pending_rm_count) {
+		strbuf_addbuf(&filter->hunk.lines, &filter->pending_rm);
+		filter->hunk.old_count += filter->pending_rm_count;
+		filter->hunk.has_changes = 1;
+		discard_pending_rm(filter);
 	}
 
-	strbuf_add(&s->rhunk, line, len);
-	s->rhunk_new_count++;
+	strbuf_add(&filter->hunk.lines, line, len);
+	filter->hunk.new_count++;
 	if (line[0] == '+')
-		s->rhunk_has_changes = 1;
+		filter->hunk.has_changes = 1;
 	else
-		s->rhunk_old_count++;
+		filter->hunk.old_count++;
 
-	return s->ret;
+	return filter->ret;
 }
 
 static void pprint_rename(struct strbuf *name, const char *a, const char *b)
@@ -4086,7 +4088,7 @@ static void builtin_diff(const char *name_a,
 			xdi_diff_outf(&mf1, &mf2, NULL, quick_consume,
 				      &ecbdata, &xpp, &xecfg);
 		} else if (line_ranges) {
-			struct line_range_callback lr_state;
+			struct line_range_filter lr_state;
 			unsigned int i;
 			long max_span = 0;
 
@@ -4094,7 +4096,7 @@ static void builtin_diff(const char *name_a,
 			lr_state.orig_line_fn = fn_out_consume;
 			lr_state.orig_cb_data = &ecbdata;
 			lr_state.ranges = line_ranges;
-			strbuf_init(&lr_state.rhunk, 0);
+			strbuf_init(&lr_state.hunk.lines, 0);
 			strbuf_init(&lr_state.pending_rm, 0);
 
 			/*
@@ -4125,11 +4127,11 @@ static void builtin_diff(const char *name_a,
 				die("unable to generate diff for %s",
 				    one->path);
 
-			flush_rhunk(&lr_state);
+			flush_range_hunk(&lr_state);
 			if (lr_state.ret)
 				die("unable to generate diff for %s",
 				    one->path);
-			strbuf_release(&lr_state.rhunk);
+			strbuf_release(&lr_state.hunk.lines);
 			strbuf_release(&lr_state.pending_rm);
 		} else if (xdi_diff_outf(&mf1, &mf2, NULL, fn_out_consume,
 					 &ecbdata, &xpp, &xecfg))
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 2/7] diff: simplify the line-range filter by classifying removals immediately
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

The filter buffered '-' lines in a pending_rm strbuf, deferring their
classification until a '+' or ' ' line revealed the post-image
position.  That buffering is unnecessary: a removal occupies no
post-image line, so it does not advance lno_in_postimage, and xdiff
emits removals before additions within a change.  A '-' therefore
arrives while lno_in_postimage already holds the index the following
'+'/' ' will occupy, and can be classified against the ranges as it
arrives.

The buffering also hid a bug: flush_range_hunk() drained pending_rm into
the range hunk whenever the hunk was active, even after lno_in_postimage
had advanced past the tracked range, so a deletion just after the
tracked function leaked into the patch.  Classifying each line as it
arrives removes the pending_rm buffer, the discard_pending_rm() helper,
three struct fields, and makes that bug impossible by construction.

With every line classified on arrival, the buffered lines are the
hunk's single source of truth, so the old/new counts need not be kept
alongside them: flush_range_hunk() derives the counts (and whether the
hunk holds any change) from the buffer when it builds the header.  Drop
the per-line counting and the old_count, new_count, and has_changes
fields; there is no longer a second tally that could fall out of sync
with the buffer.

Add begin_range_hunk() to open the accumulator at the first in-range
line, seeding both begins from the live image cursors, as the
counterpart to flush_range_hunk().  With the counting gone too,
line_range_line_fn() now only appends an in-range line.

Document the coordinate model: a block comment on struct
line_range_filter states it (the pre/post-image cursors, the 0-based
idx_in_postimage, removals classified by the following line) with a
worked example.

Add tests for the leaked trailing deletion this fixes, the symmetric
leading-deletion case, and the filter's range boundaries (a change at
the first and last line of a range, and a pure in-range deletion).

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 diff.c              | 215 ++++++++++++++++++++++++--------------------
 t/t4211-line-log.sh | 125 ++++++++++++++++++++++++++
 2 files changed, 243 insertions(+), 97 deletions(-)

diff --git a/diff.c b/diff.c
index 1e043c959f..ee765d7ac2 100644
--- a/diff.c
+++ b/diff.c
@@ -610,18 +610,58 @@ struct emit_callback {
 };
 
 /*
- * State for the line-range callback wrappers that sit between
- * xdi_diff_outf() and fn_out_consume().  xdiff produces a normal,
- * unfiltered diff; the wrappers intercept each hunk header and line,
- * track post-image position, and forward only lines that fall within
- * the requested ranges.  Contiguous in-range lines are collected into
- * range hunks and flushed with a synthetic @@ header so that
- * fn_out_consume() sees well-formed unified-diff fragments.
+ * Line-range filter: scopes "git log -L" output to the tracked ranges.
  *
- * Removal lines ('-') cannot be classified by post-image position, so
- * they are buffered in pending_rm until the next '+' or ' ' line
- * reveals whether they precede an in-range line (flush into range hunk) or
- * an out-of-range line (discard).
+ * It sits between xdi_diff_outf() and an output callback (fn_out_consume,
+ * diffstat_consume, checkdiff_consume).  xdiff produces a normal diff; the
+ * filter forwards only the lines inside the requested ranges, collecting
+ * contiguous in-range lines into a "range hunk" emitted with a synthetic
+ * @@ header so the callback sees well-formed unified-diff fragments.
+ *
+ * A diff describes the change from a pre-image to a post-image.  Each
+ * line is context (' ', in both), a removal ('-', pre-image only), or
+ * an addition ('+', post-image only).  -L tracks ranges in the
+ * post-image, so a line is in range by its post-image position.
+ *
+ * Two 1-based cursors track the next line in each image, named as in
+ * struct emit_callback and seeded from the xdiff hunk header:
+ *
+ *	lno_in_postimage  advances on '+' and ' '   (lines in the post-image)
+ *	lno_in_preimage   advances on '-' and ' '   (lines in the pre-image)
+ *
+ * Ranges are 0-based half-open [start, end), so a line is tested at the
+ * 0-based index idx_in_postimage = lno_in_postimage - 1.
+ *
+ * A '-' is not present in the post-image, so it has no post-image line
+ * number of its own.  Since it does not advance lno_in_postimage, it is
+ * classified at the idx_in_postimage that the following '+'/' ' will
+ * occupy.  xdiff emits a change's removals before its additions, so that
+ * index is already known when the '-' arrives.
+ *
+ * The synthetic "@@ -<old> +<new> @@" header has two sides, old (the
+ * pre-image) and new (the post-image), matching the xdiff_emit_hunk_fn
+ * callback; the hunk.old_begin / hunk.new_begin fields below hold those
+ * begins, and flush_range_hunk() derives the counts from the buffered
+ * lines.
+ *
+ * Example, tracking post-image line 2 (range [1, 2)) of:
+ *
+ *	pre-image  post-image
+ *	1 a        1 a
+ *	2 b        2 X      (b -> X)
+ *	3 c        3 c
+ *
+ *	classify each line by idx_in_postimage.  The pre and post columns
+ *	are each cursor's value while that line is classified, i.e. before
+ *	the line advances them (pre = lno_in_preimage,
+ *	post = lno_in_postimage, idx = idx_in_postimage):
+ *	' a'  pre 1  post 1  idx 0  ->  before start, skip
+ *	'-b'  pre 2  post 2  idx 1  ->  keep (removal)
+ *	'+X'  pre 3  post 2  idx 1  ->  keep (addition)
+ *	' c'  pre 3  post 3  idx 2  ->  past end, flush
+ *
+ * -b and +X share idx = 1 because -b did not advance lno_in_postimage;
+ * both land in the range hunk, flushed when ' c' crosses the range end.
  */
 struct line_range_filter {
 	xdiff_emit_line_fn orig_line_fn;
@@ -640,20 +680,18 @@ struct line_range_filter {
 	char func[80];
 	long funclen;
 
-	/* The range hunk being accumulated for the current range. */
+	/*
+	 * The range hunk being accumulated.  At most one is live at a time:
+	 * it is flushed and reset as the cursor leaves each range (and once
+	 * more at end of diff), then reused for the next range.
+	 */
 	struct {
 		struct strbuf lines;	/* buffered in-range diff lines */
-		long old_begin, old_count;
-		long new_begin, new_count;
+		long old_begin;
+		long new_begin;
 		int active;
-		int has_changes;	/* any '+' or '-' line? */
 	} hunk;
 
-	/* Removal lines not yet known to be in-range */
-	struct strbuf pending_rm;
-	int pending_rm_count;
-	long pending_rm_pre_begin;	/* pre-image line of first pending */
-
 	int ret;			/* latched error from orig_line_fn */
 };
 
@@ -2542,26 +2580,48 @@ static int quick_consume(void *priv, char *line UNUSED, unsigned long len UNUSED
 	return 1;
 }
 
-static void discard_pending_rm(struct line_range_filter *filter)
+/*
+ * Begin a range hunk at the first in-range line.  Its position fixes the
+ * hunk's begins, taken from the two image cursors before they advance:
+ * new_begin from the post-image, old_begin from the pre-image.  The line
+ * counts are not tracked here; flush_range_hunk() derives them from the
+ * buffered lines.
+ */
+static void begin_range_hunk(struct line_range_filter *filter)
 {
-	strbuf_reset(&filter->pending_rm);
-	filter->pending_rm_count = 0;
+	filter->hunk.active = 1;
+	filter->hunk.new_begin = filter->lno_in_postimage;
+	filter->hunk.old_begin = filter->lno_in_preimage;
+	strbuf_reset(&filter->hunk.lines);
 }
 
 static void flush_range_hunk(struct line_range_filter *filter)
 {
 	struct strbuf hdr = STRBUF_INIT;
 	const char *p, *end;
+	long old_count = 0, new_count = 0;
+	int has_changes = 0;
 
 	if (!filter->hunk.active || filter->ret)
 		return;
 
-	/* Drain any pending removal lines into the range hunk */
-	if (filter->pending_rm_count) {
-		strbuf_addbuf(&filter->hunk.lines, &filter->pending_rm);
-		filter->hunk.old_count += filter->pending_rm_count;
-		filter->hunk.has_changes = 1;
-		discard_pending_rm(filter);
+	/*
+	 * Derive the hunk's geometry from the buffered lines: a ' '
+	 * counts on both sides, a '-' on the old side, a '+' on the new.
+	 * A '-' or '+' marks a real change; the "\ No newline at end of
+	 * file" marker (line[0] == '\\') counts on neither side.
+	 */
+	p = filter->hunk.lines.buf;
+	end = p + filter->hunk.lines.len;
+	while (p < end) {
+		const char *eol = memchr(p, '\n', end - p);
+		if (*p == ' ' || *p == '-')
+			old_count++;
+		if (*p == ' ' || *p == '+')
+			new_count++;
+		if (*p == '-' || *p == '+')
+			has_changes = 1;
+		p = eol ? eol + 1 : end;
 	}
 
 	/*
@@ -2570,15 +2630,15 @@ static void flush_range_hunk(struct line_range_filter *filter)
 	 * ctxlen causes xdiff to emit context covering a range that
 	 * has no changes in this commit.
 	 */
-	if (!filter->hunk.has_changes) {
+	if (!has_changes) {
 		filter->hunk.active = 0;
 		strbuf_reset(&filter->hunk.lines);
 		return;
 	}
 
 	strbuf_addf(&hdr, "@@ -%ld,%ld +%ld,%ld @@",
-		    filter->hunk.old_begin, filter->hunk.old_count,
-		    filter->hunk.new_begin, filter->hunk.new_count);
+		    filter->hunk.old_begin, old_count,
+		    filter->hunk.new_begin, new_count);
 	if (filter->funclen > 0) {
 		strbuf_addch(&hdr, ' ');
 		strbuf_add(&hdr, filter->func, filter->funclen);
@@ -2618,11 +2678,6 @@ static void line_range_hunk_fn(void *data,
 	 * When count > 0, begin is 1-based.  When count == 0, begin is
 	 * adjusted down by 1 by xdl_emit_hunk_hdr(), but no lines of
 	 * that type will arrive, so the value is unused.
-	 *
-	 * Any pending removal lines from the previous xdiff hunk are
-	 * intentionally left in pending_rm: the line callback will
-	 * flush or discard them when the next content line reveals
-	 * whether the removals precede in-range content.
 	 */
 	filter->lno_in_postimage = new_begin;
 	filter->lno_in_preimage = old_begin;
@@ -2638,88 +2693,56 @@ static void line_range_hunk_fn(void *data,
 static int line_range_line_fn(void *priv, char *line, unsigned long len)
 {
 	struct line_range_filter *filter = priv;
-	const struct range *cur;
-	long idx_in_postimage, cur_pre;
+	long idx_in_postimage;
+	int in_range;
 
 	if (filter->ret)
 		return filter->ret;
 
-	if (line[0] == '-') {
-		if (!filter->pending_rm_count)
-			filter->pending_rm_pre_begin = filter->lno_in_preimage;
-		filter->lno_in_preimage++;
-		strbuf_add(&filter->pending_rm, line, len);
-		filter->pending_rm_count++;
-		return filter->ret;
-	}
-
 	if (line[0] == '\\') {
-		if (filter->pending_rm_count)
-			strbuf_add(&filter->pending_rm, line, len);
-		else if (filter->hunk.active)
+		if (filter->hunk.active)
 			strbuf_add(&filter->hunk.lines, line, len);
-		/* otherwise outside tracked range; drop silently */
 		return filter->ret;
 	}
 
-	if (line[0] != '+' && line[0] != ' ')
+	if (line[0] != '+' && line[0] != ' ' && line[0] != '-')
 		BUG("unexpected diff line type '%c'", line[0]);
 
+	/*
+	 * idx_in_postimage is this line's 0-based post-image index (see the model on
+	 * struct line_range_filter).  The cursors are advanced only after
+	 * the line is classified, so a '-' is tested at the same idx_in_postimage as
+	 * the '+'/' ' that follows it.
+	 */
 	idx_in_postimage = filter->lno_in_postimage - 1;
-	cur_pre = filter->lno_in_preimage;	/* save before advancing for context lines */
-	filter->lno_in_postimage++;
-	if (line[0] == ' ')
-		filter->lno_in_preimage++;
 
-	/* Advance past ranges we've passed */
+	/* Retire ranges we have passed, flushing the one we leave. */
 	while (filter->cur_range < filter->ranges->nr &&
 	       idx_in_postimage >= filter->ranges->ranges[filter->cur_range].end) {
 		if (filter->hunk.active)
 			flush_range_hunk(filter);
-		discard_pending_rm(filter);
 		filter->cur_range++;
 	}
 
-	/* Past all ranges */
-	if (filter->cur_range >= filter->ranges->nr) {
-		discard_pending_rm(filter);
-		return filter->ret;
-	}
+	in_range = filter->cur_range < filter->ranges->nr &&
+		   idx_in_postimage >= filter->ranges->ranges[filter->cur_range].start &&
+		   idx_in_postimage < filter->ranges->ranges[filter->cur_range].end;
 
-	cur = &filter->ranges->ranges[filter->cur_range];
+	if (in_range) {
+		if (!filter->hunk.active)
+			begin_range_hunk(filter);
 
-	/* Before current range */
-	if (idx_in_postimage < cur->start) {
-		discard_pending_rm(filter);
-		return filter->ret;
+		strbuf_add(&filter->hunk.lines, line, len);
 	}
 
-	/* In range so start a new range hunk if needed */
-	if (!filter->hunk.active) {
-		filter->hunk.active = 1;
-		filter->hunk.has_changes = 0;
-		filter->hunk.new_begin = idx_in_postimage + 1;
-		filter->hunk.old_begin = filter->pending_rm_count
-			? filter->pending_rm_pre_begin : cur_pre;
-		filter->hunk.old_count = 0;
-		filter->hunk.new_count = 0;
-		strbuf_reset(&filter->hunk.lines);
-	}
-
-	/* Flush pending removals into range hunk */
-	if (filter->pending_rm_count) {
-		strbuf_addbuf(&filter->hunk.lines, &filter->pending_rm);
-		filter->hunk.old_count += filter->pending_rm_count;
-		filter->hunk.has_changes = 1;
-		discard_pending_rm(filter);
-	}
-
-	strbuf_add(&filter->hunk.lines, line, len);
-	filter->hunk.new_count++;
-	if (line[0] == '+')
-		filter->hunk.has_changes = 1;
-	else
-		filter->hunk.old_count++;
+	/*
+	 * Advance each image's cursor: a line present in that image (see
+	 * the model) consumes one of its line numbers.
+	 */
+	if (line[0] != '-')
+		filter->lno_in_postimage++;
+	if (line[0] != '+')
+		filter->lno_in_preimage++;
 
 	return filter->ret;
 }
@@ -4097,7 +4120,6 @@ static void builtin_diff(const char *name_a,
 			lr_state.orig_cb_data = &ecbdata;
 			lr_state.ranges = line_ranges;
 			strbuf_init(&lr_state.hunk.lines, 0);
-			strbuf_init(&lr_state.pending_rm, 0);
 
 			/*
 			 * Inflate ctxlen so that all changes within
@@ -4132,7 +4154,6 @@ static void builtin_diff(const char *name_a,
 				die("unable to generate diff for %s",
 				    one->path);
 			strbuf_release(&lr_state.hunk.lines);
-			strbuf_release(&lr_state.pending_rm);
 		} else if (xdi_diff_outf(&mf1, &mf2, NULL, fn_out_consume,
 					 &ecbdata, &xpp, &xecfg))
 			die("unable to generate diff for %s", one->path);
diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
index ca4eb7bbc7..e9691066de 100755
--- a/t/t4211-line-log.sh
+++ b/t/t4211-line-log.sh
@@ -738,6 +738,131 @@ test_expect_success '-L with -G filters to diff-text matches' '
 	grep "F2 + 2" actual
 '
 
+test_expect_success 'setup for trailing deletion test' '
+	git checkout --orphan trailing-del &&
+	git reset --hard &&
+	cat >file.c <<-\EOF &&
+	void tracked()
+	{
+	    return 1;
+	}
+	// trailing comment
+	EOF
+	git add file.c &&
+	test_tick &&
+	git commit -m "add file with trailing comment" &&
+	# Modify tracked() AND delete the trailing comment in
+	# one commit, so the commit touches the tracked range
+	# and is not filtered out by the revision walker.
+	cat >file.c <<-\EOF &&
+	void tracked()
+	{
+	    return 2;
+	}
+	EOF
+	git commit -a -m "modify tracked and delete trailing comment"
+'
+
+test_expect_success '-L does not include deletions past end of tracked range' '
+	git log -L:tracked:file.c --format= -1 -p >actual &&
+	# The trailing comment deletion is outside the tracked
+	# range and should not appear in the patch output.
+	test_grep "return 2" actual &&
+	test_grep ! "trailing comment" actual
+'
+
+test_expect_success '-L includes leading deletions resolved by in-range line' '
+	git checkout --orphan leading-del &&
+	git reset --hard &&
+	cat >file.c <<-\EOF &&
+	// leading comment
+	void tracked()
+	{
+	    return 1;
+	}
+	EOF
+	git add file.c &&
+	test_tick &&
+	git commit -m "add file with leading comment" &&
+	cat >file.c <<-\EOF &&
+	void tracked()
+	{
+	    return 2;
+	}
+	EOF
+	git commit -a -m "modify tracked and delete leading comment" &&
+	git log -L:tracked:file.c --format= -1 -p >actual &&
+	# The leading comment deletion is resolved by the next
+	# non-removal line (void tracked), which is in range: a
+	# removal is classified by the position of the following
+	# line, so it joins the range that line falls in.
+	test_grep "return 2" actual &&
+	test_grep "leading comment" actual
+'
+
+test_expect_success 'setup for line-range filter edge cases' '
+	git checkout --orphan filter-edge &&
+	git reset --hard &&
+	cat >file.c <<-\EOF &&
+	void before()
+	{
+	    return 0;
+	}
+
+	void tracked()
+	{
+	    int a = 1;
+	    int b = 2;
+	    int c = 3;
+	    return a + b + c;
+	}
+
+	void after()
+	{
+	    return 9;
+	}
+	EOF
+	git add file.c &&
+	test_tick &&
+	git commit -m "initial"
+'
+
+test_expect_success '-L change at exact first line of range' '
+	git checkout filter-edge &&
+	# Change the function signature (first line of range)
+	sed "s/void tracked/int tracked/" file.c >tmp &&
+	mv tmp file.c &&
+	git commit -a -m "change first line" &&
+	git log -L:tracked:file.c -p --format=%s -1 >actual &&
+	test_grep "change first line" actual &&
+	test_grep "+int tracked" actual &&
+	test_grep "\\-void tracked" actual
+'
+
+test_expect_success '-L change at exact last line of range' '
+	git checkout filter-edge &&
+	git reset --hard HEAD~1 &&
+	# Change the closing brace line (last line of range)
+	sed "s/^}$/} \/\/ end tracked/" file.c >tmp &&
+	mv tmp file.c &&
+	git commit -a -m "change last line" &&
+	git log -L:tracked:file.c -p --format=%s -1 >actual &&
+	test_grep "change last line" actual &&
+	test_grep "end tracked" actual
+'
+
+test_expect_success '-L pure deletion in range (no additions)' '
+	git checkout filter-edge &&
+	git reset --hard HEAD~1 &&
+	# Delete a line inside tracked() without adding anything
+	sed "/int c/d" file.c >tmp &&
+	mv tmp file.c &&
+	git commit -a -m "pure deletion" &&
+	git log -L:tracked:file.c -p --format=%s -1 >actual &&
+	test_grep "pure deletion" actual &&
+	test_grep "\\-.*int c" actual
+'
+
 test_expect_success '-L with --diff-filter=M excludes root commit' '
 	git checkout parent-oids &&
 	git log -L:func2:file.c --diff-filter=M --format=%s --no-patch >actual &&
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 3/7] diff: emit -L hunk headers via xdiff's formatter
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

The line-range filter builds its own "@@ -<old> +<new> @@" header for
each range hunk.  For a side with no lines (count 0, such as the old
side of a pure insertion), the begin should be the number of the line
before the change, per the convention git diff and xdl_emit_hunk_hdr()
follow.  The hand-rolled code's begin was one too high; in t4211 this
produced

	@@ -25,0 +18,9 @@

an old begin of 25 in a 24-line file, where git diff would give 24.

Stop hand-rolling the header.  flush_range_hunk() now formats it through
xdiff's own emitter: a new xdiff_emit_hunk_header() helper wraps
xdl_emit_hunk_hdr(), the function that produces every other diff's hunk
headers.  The count-0 begin is then correct by construction, and as a
side effect -L headers match git diff exactly, including its omission of
a count of 1 ("@@ -22 +22 @@" rather than "@@ -22,1 +22,1 @@").

xdiff's hunk callback already hands line_range_hunk_fn() a count-0 begin
decremented, so undo that when seeding the cursors and let the formatter
re-apply the convention once, at emit time.

The off-by-one predates this series, and the two regenerated fixtures
reach it from different origins: no-assertion-error has carried it since
its test was added in ab60c693a2 (line-log: fix assertion error,
2025-08-18), while vanishes-early acquired it when 86e986f166 (line-log:
route -L output through the standard diff pipeline) reshaped its tracked
line into a pure insertion.  vanishes-early also drops its count-1
counts.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 diff.c                                   | 27 +++++++++++-------------
 t/t4211/sha1/expect.no-assertion-error   |  2 +-
 t/t4211/sha1/expect.vanishes-early       |  6 +++---
 t/t4211/sha256/expect.no-assertion-error |  2 +-
 t/t4211/sha256/expect.vanishes-early     |  6 +++---
 xdiff-interface.c                        | 19 +++++++++++++++++
 xdiff-interface.h                        | 15 +++++++++++++
 7 files changed, 54 insertions(+), 23 deletions(-)

diff --git a/diff.c b/diff.c
index ee765d7ac2..9751bb6798 100644
--- a/diff.c
+++ b/diff.c
@@ -2636,14 +2636,9 @@ static void flush_range_hunk(struct line_range_filter *filter)
 		return;
 	}
 
-	strbuf_addf(&hdr, "@@ -%ld,%ld +%ld,%ld @@",
-		    filter->hunk.old_begin, old_count,
-		    filter->hunk.new_begin, new_count);
-	if (filter->funclen > 0) {
-		strbuf_addch(&hdr, ' ');
-		strbuf_add(&hdr, filter->func, filter->funclen);
-	}
-	strbuf_addch(&hdr, '\n');
+	xdiff_emit_hunk_header(&hdr, filter->hunk.old_begin, old_count,
+			       filter->hunk.new_begin, new_count,
+			       filter->func, filter->funclen);
 
 	filter->ret = filter->orig_line_fn(filter->orig_cb_data, hdr.buf, hdr.len);
 	strbuf_release(&hdr);
@@ -2668,19 +2663,21 @@ static void flush_range_hunk(struct line_range_filter *filter)
 }
 
 static void line_range_hunk_fn(void *data,
-			       long old_begin, long old_nr UNUSED,
-			       long new_begin, long new_nr UNUSED,
+			       long old_begin, long old_nr,
+			       long new_begin, long new_nr,
 			       const char *func, long funclen)
 {
 	struct line_range_filter *filter = data;
 
 	/*
-	 * When count > 0, begin is 1-based.  When count == 0, begin is
-	 * adjusted down by 1 by xdl_emit_hunk_hdr(), but no lines of
-	 * that type will arrive, so the value is unused.
+	 * Seed the per-image line cursors from the hunk header's begins.  For
+	 * a side with no lines (count 0), xdiff's callback has already moved
+	 * its begin to the line before the change, so add one back to recover
+	 * the true 1-based start.  xdiff_emit_hunk_header() reapplies that -1
+	 * when the clipped hunk is emitted.
 	 */
-	filter->lno_in_postimage = new_begin;
-	filter->lno_in_preimage = old_begin;
+	filter->lno_in_postimage = new_nr ? new_begin : new_begin + 1;
+	filter->lno_in_preimage = old_nr ? old_begin : old_begin + 1;
 
 	if (funclen > 0) {
 		if (funclen > (long)sizeof(filter->func))
diff --git a/t/t4211/sha1/expect.no-assertion-error b/t/t4211/sha1/expect.no-assertion-error
index 54c568f273..95faf51a7b 100644
--- a/t/t4211/sha1/expect.no-assertion-error
+++ b/t/t4211/sha1/expect.no-assertion-error
@@ -8,7 +8,7 @@ diff --git a/b.c b/b.c
 index bf79c2f..27c829c 100644
 --- a/b.c
 +++ b/b.c
-@@ -25,0 +18,9 @@
+@@ -24,0 +18,9 @@
 +long f(long x)
 +{
 +	int s = 0;
diff --git a/t/t4211/sha1/expect.vanishes-early b/t/t4211/sha1/expect.vanishes-early
index a413ad3659..e4b1a201d5 100644
--- a/t/t4211/sha1/expect.vanishes-early
+++ b/t/t4211/sha1/expect.vanishes-early
@@ -8,7 +8,7 @@ diff --git a/a.c b/a.c
 index 0b9cae5..5de3ea4 100644
 --- a/a.c
 +++ b/a.c
-@@ -23,0 +24,1 @@ int main ()
+@@ -22,0 +24 @@ int main ()
 +/* incomplete lines are bad! */
 
 commit 100b61a6f2f720f812620a9d10afb3a960ccb73c
@@ -21,7 +21,7 @@ diff --git a/a.c b/a.c
 index 5e709a1..0b9cae5 100644
 --- a/a.c
 +++ b/a.c
-@@ -22,1 +22,1 @@ int main ()
+@@ -22 +22 @@ int main ()
 -}
 +}
 \ No newline at end of file
@@ -37,5 +37,5 @@ new file mode 100644
 index 0000000..444e415
 --- /dev/null
 +++ b/a.c
-@@ -0,0 +20,1 @@
+@@ -0,0 +20 @@
 +}
diff --git a/t/t4211/sha256/expect.no-assertion-error b/t/t4211/sha256/expect.no-assertion-error
index c25f2ce19c..815d27f7f1 100644
--- a/t/t4211/sha256/expect.no-assertion-error
+++ b/t/t4211/sha256/expect.no-assertion-error
@@ -8,7 +8,7 @@ diff --git a/b.c b/b.c
 index 69cb69c..a0d566e 100644
 --- a/b.c
 +++ b/b.c
-@@ -25,0 +18,9 @@
+@@ -24,0 +18,9 @@
 +long f(long x)
 +{
 +	int s = 0;
diff --git a/t/t4211/sha256/expect.vanishes-early b/t/t4211/sha256/expect.vanishes-early
index bc33b963dc..263fc9eaac 100644
--- a/t/t4211/sha256/expect.vanishes-early
+++ b/t/t4211/sha256/expect.vanishes-early
@@ -8,7 +8,7 @@ diff --git a/a.c b/a.c
 index e4fa1d8..62c1fc2 100644
 --- a/a.c
 +++ b/a.c
-@@ -23,0 +24,1 @@ int main ()
+@@ -22,0 +24 @@ int main ()
 +/* incomplete lines are bad! */
 
 commit 29f32ac3141c48b22803e5c4127b719917b67d0f8ca8c5248bebfa2a19f7da10
@@ -21,7 +21,7 @@ diff --git a/a.c b/a.c
 index d325124..e4fa1d8 100644
 --- a/a.c
 +++ b/a.c
-@@ -22,1 +22,1 @@ int main ()
+@@ -22 +22 @@ int main ()
 -}
 +}
 \ No newline at end of file
@@ -37,5 +37,5 @@ new file mode 100644
 index 0000000..9f550c3
 --- /dev/null
 +++ b/a.c
-@@ -0,0 +20,1 @@
+@@ -0,0 +20 @@
 +}
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 5ee2b96d0a..32e04630ee 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -91,6 +91,25 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 	return 0;
 }
 
+static int strbuf_out_line(void *priv, mmbuffer_t *mb, int nbuf)
+{
+	struct strbuf *out = priv;
+	int i;
+	for (i = 0; i < nbuf; i++)
+		strbuf_add(out, mb[i].ptr, mb[i].size);
+	return 0;
+}
+
+void xdiff_emit_hunk_header(struct strbuf *out,
+			    long old_begin, long old_count,
+			    long new_begin, long new_count,
+			    const char *func, long funclen)
+{
+	xdemitcb_t ecb = { .priv = out, .out_line = strbuf_out_line };
+	xdl_emit_hunk_hdr(old_begin, old_count, new_begin, new_count,
+			  func, funclen, &ecb);
+}
+
 /*
  * Trim down common substring at the end of the buffers,
  * but end on a complete line.
diff --git a/xdiff-interface.h b/xdiff-interface.h
index ce54e1c0e0..51c88296ed 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -76,4 +76,19 @@ int xdiff_compare_lines(const char *l1, long s1,
  */
 unsigned long xdiff_hash_string(const char *s, size_t len, long flags);
 
+struct strbuf;
+
+/*
+ * Append a unified-diff hunk header to `out`, e.g.
+ * "@@ -<old> +<new> @@ func\n".  The header comes from wrapping xdiff's
+ * own hunk-header emitter, so it matches what a normal diff would
+ * produce for these begins and counts.  For a side with no lines
+ * (count 0) the begin is the line before the change, and a count of 1
+ * is omitted.
+ */
+void xdiff_emit_hunk_header(struct strbuf *out,
+			    long old_begin, long old_count,
+			    long new_begin, long new_count,
+			    const char *func, long funclen);
+
 #endif
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 4/7] diff: extract a line-range diff helper for reuse
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

builtin_diff() open-codes the line-range filter setup and teardown
around its xdi_diff_outf() call: zero the struct, point it at the
output callback, inflate ctxlen to the largest range span so each range
yields a single xdiff hunk, run the diff, flush the trailing range
hunk, and release the buffer.  The upcoming -L stat and check formats
need the same sequence.

Extract line_range_filter_init() for the setup and a
line_range_filter_diff() helper that prepares the xdiff config the
filter needs, runs an initialized filter through xdi_diff_outf(),
flushes the final range hunk, and releases it, returning the latched
error.  The helper inflates ctxlen to the largest range span so each
range yields a single xdiff hunk, and clears XDL_EMIT_NO_HUNK_HDR so
the hunk headers the filter seeds its position from are always emitted.
Folding both into the helper keeps these invariants, which the filter's
position tracking relies on, in a single place for every consumer.
builtin_diff() now does init + line_range_filter_diff(); the next two
patches reuse them in builtin_diffstat() and builtin_checkdiff()
instead of repeating the boilerplate.

No behavior change: builtin_diff() leaves XDL_EMIT_NO_HUNK_HDR unset,
so clearing it is a no-op until the suppressing consumers arrive.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 diff.c | 100 +++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 61 insertions(+), 39 deletions(-)

diff --git a/diff.c b/diff.c
index 9751bb6798..6233a96bf0 100644
--- a/diff.c
+++ b/diff.c
@@ -2580,6 +2580,18 @@ static int quick_consume(void *priv, char *line UNUSED, unsigned long len UNUSED
 	return 1;
 }
 
+static void line_range_filter_init(struct line_range_filter *filter,
+				   const struct range_set *ranges,
+				   xdiff_emit_line_fn line_fn,
+				   void *cb_data)
+{
+	memset(filter, 0, sizeof(*filter));
+	filter->orig_line_fn = line_fn;
+	filter->orig_cb_data = cb_data;
+	filter->ranges = ranges;
+	strbuf_init(&filter->hunk.lines, 0);
+}
+
 /*
  * Begin a range hunk at the first in-range line.  Its position fixes the
  * hunk's begins, taken from the two image cursors before they advance:
@@ -2744,6 +2756,50 @@ static int line_range_line_fn(void *priv, char *line, unsigned long len)
 	return filter->ret;
 }
 
+/*
+ * Run an xdiff pass through an initialized line-range filter, flush the
+ * final range hunk, and release the filter.  Inflates ctxlen to the largest
+ * range span first, so that every change within a single range lands in one
+ * xdiff hunk and the inter-change context is emitted; the filter then clips
+ * back to range boundaries.  The optimal ctxlen depends on where changes fall
+ * within the range, which is only known after xdiff runs, so the max span is
+ * the upper bound that guarantees correctness in a single pass.  Every
+ * consumer (patch, diffstat, check) relies on one xdiff hunk per range, so
+ * this lives here rather than at each call site.  Also clears
+ * XDL_EMIT_NO_HUNK_HDR: the filter seeds its per-image position from the hunk
+ * headers, so a consumer that otherwise suppresses them (diffstat) still gets
+ * them here.  Returns non-zero if xdiff or any forwarded callback failed.
+ */
+static int line_range_filter_diff(struct line_range_filter *filter,
+				  mmfile_t *mf1, mmfile_t *mf2,
+				  xpparam_t *xpp, xdemitconf_t *xecfg)
+{
+	const struct range_set *ranges = filter->ranges;
+	long max_span = 0;
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < ranges->nr; i++) {
+		long span = ranges->ranges[i].end - ranges->ranges[i].start;
+		if (span > max_span)
+			max_span = span;
+	}
+	if (max_span > xecfg->ctxlen)
+		xecfg->ctxlen = max_span;
+
+	/* the filter seeds its per-image position from hunk headers */
+	xecfg->flags &= ~XDL_EMIT_NO_HUNK_HDR;
+
+	ret = xdi_diff_outf(mf1, mf2, line_range_hunk_fn,
+			    line_range_line_fn, filter, xpp, xecfg);
+	if (!ret) {
+		flush_range_hunk(filter);
+		ret = filter->ret;
+	}
+	strbuf_release(&filter->hunk.lines);
+	return ret;
+}
+
 static void pprint_rename(struct strbuf *name, const char *a, const char *b)
 {
 	const char *old_name = a;
@@ -4108,49 +4164,15 @@ static void builtin_diff(const char *name_a,
 			xdi_diff_outf(&mf1, &mf2, NULL, quick_consume,
 				      &ecbdata, &xpp, &xecfg);
 		} else if (line_ranges) {
-			struct line_range_filter lr_state;
-			unsigned int i;
-			long max_span = 0;
+			struct line_range_filter lr_filter;
 
-			memset(&lr_state, 0, sizeof(lr_state));
-			lr_state.orig_line_fn = fn_out_consume;
-			lr_state.orig_cb_data = &ecbdata;
-			lr_state.ranges = line_ranges;
-			strbuf_init(&lr_state.hunk.lines, 0);
-
-			/*
-			 * Inflate ctxlen so that all changes within
-			 * any single range are merged into one xdiff
-			 * hunk and the inter-change context is emitted.
-			 * The callback clips back to range boundaries.
-			 *
-			 * The optimal ctxlen depends on where changes
-			 * fall within the range, which is only known
-			 * after xdiff runs; the max range span is the
-			 * upper bound that guarantees correctness in a
-			 * single pass.
-			 */
-			for (i = 0; i < line_ranges->nr; i++) {
-				long span = line_ranges->ranges[i].end -
-					    line_ranges->ranges[i].start;
-				if (span > max_span)
-					max_span = span;
-			}
-			if (max_span > xecfg.ctxlen)
-				xecfg.ctxlen = max_span;
-
-			if (xdi_diff_outf(&mf1, &mf2,
-					  line_range_hunk_fn,
-					  line_range_line_fn,
-					  &lr_state, &xpp, &xecfg))
-				die("unable to generate diff for %s",
-				    one->path);
+			line_range_filter_init(&lr_filter, line_ranges,
+					       fn_out_consume, &ecbdata);
 
-			flush_range_hunk(&lr_state);
-			if (lr_state.ret)
+			if (line_range_filter_diff(&lr_filter, &mf1, &mf2,
+						   &xpp, &xecfg))
 				die("unable to generate diff for %s",
 				    one->path);
-			strbuf_release(&lr_state.hunk.lines);
 		} else if (xdi_diff_outf(&mf1, &mf2, NULL, fn_out_consume,
 					 &ecbdata, &xpp, &xecfg))
 			die("unable to generate diff for %s", one->path);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 5/7] line-log: support diff stat formats with -L
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

Reuse the line_range_filter in builtin_diffstat() to produce
range-scoped statistics.  When a filepair carries line_ranges, the
filter wraps diffstat_consume() as its output callback, forwarding only
in-range lines for counting.  flush_range_hunk() replays buffered
content through diffstat_consume(), which ignores synthetic @@ headers
since it only counts '+' and '-' lines.

Expand the output format allowlist in setup_revisions() to accept
--stat, --numstat, and --shortstat with -L.

Leave --dirstat out of the allowlist so it is rejected like any other
unsupported format.  Its default mode counts each file's whole-file
byte damage via diffcore_count_changes(), outside the line-based
pipeline that the -L filter scopes, so bare --dirstat cannot honor the
tracked range.  The --dirstat=lines mode could: it aggregates the same
per-file line counts as --numstat, which -L already scopes.  But
accepting only that sub-mode while bare --dirstat keeps erroring is a
confusing split, so the whole format is deferred to a follow-up;
--numstat already reports the exact range-scoped per-file counts.

Also drop "yet" from the generic -L rejection message ("does not
yet support the requested diff format").  Some rejected formats do
not fit a line range at all, so "yet" wrongly implied they are all
just awaiting support.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 Documentation/line-range-options.adoc |  12 ++-
 diff.c                                |  13 ++-
 revision.c                            |   6 +-
 t/t4211-line-log.sh                   | 150 ++++++++++++++++++++++----
 4 files changed, 155 insertions(+), 26 deletions(-)

diff --git a/Documentation/line-range-options.adoc b/Documentation/line-range-options.adoc
index 72f639b5e7..1a25f55bb1 100644
--- a/Documentation/line-range-options.adoc
+++ b/Documentation/line-range-options.adoc
@@ -9,10 +9,14 @@
 	_<start>_ and _<end>_ (or _<funcname>_) must exist in the starting revision.
 	You can specify this option more than once. Implies `--patch`.
 	Patch output can be suppressed using `--no-patch`.
-	Non-patch diff formats `--raw`, `--name-only`, `--name-status`,
-	and `--summary` are supported.  Diff stat formats
-	(`--stat`, `--numstat`, `--shortstat`, `--dirstat`) are not
-	currently implemented.
+	The following non-patch diff formats are supported: `--raw`,
+	`--name-only`, `--name-status`, `--summary`,
+	`--stat`, `--numstat`, and `--shortstat`.
+	The stat formats show range-scoped counts: only lines within
+	the tracked range are counted.  `--dirstat` is not supported
+	with `-L`: it summarizes change as each directory's share of
+	the total churn, not as counts for the tracked lines.  Use
+	`--numstat` for exact per-file counts within the range.
 +
 Patch formatting options such as `--word-diff`, `--color-moved`,
 `--no-prefix`, and whitespace options (`-w`, `-b`) are supported,
diff --git a/diff.c b/diff.c
index 6233a96bf0..026fafeb90 100644
--- a/diff.c
+++ b/diff.c
@@ -4289,7 +4289,18 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		xecfg.ctxlen = o->context;
 		xecfg.interhunkctxlen = o->interhunkcontext;
 		xecfg.flags = XDL_EMIT_NO_HUNK_HDR;
-		if (xdi_diff_outf(&mf1, &mf2, NULL,
+
+		if (p->line_ranges) {
+			struct line_range_filter lr_filter;
+
+			line_range_filter_init(&lr_filter, p->line_ranges,
+					       diffstat_consume, diffstat);
+
+			if (line_range_filter_diff(&lr_filter, &mf1, &mf2,
+						   &xpp, &xecfg))
+				die("unable to generate diffstat for %s",
+				    one->path);
+		} else if (xdi_diff_outf(&mf1, &mf2, NULL,
 				  diffstat_consume, diffstat, &xpp, &xecfg))
 			die("unable to generate diffstat for %s", one->path);
 
diff --git a/revision.c b/revision.c
index 6a8101e8b7..2c76e15778 100644
--- a/revision.c
+++ b/revision.c
@@ -3193,8 +3193,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	     (revs->diffopt.output_format &
 	      ~(DIFF_FORMAT_PATCH | DIFF_FORMAT_NO_OUTPUT |
 		DIFF_FORMAT_RAW | DIFF_FORMAT_NAME |
-		DIFF_FORMAT_NAME_STATUS | DIFF_FORMAT_SUMMARY))))
-		die(_("-L does not yet support the requested diff format"));
+		DIFF_FORMAT_NAME_STATUS | DIFF_FORMAT_SUMMARY |
+		DIFF_FORMAT_NUMSTAT | DIFF_FORMAT_DIFFSTAT |
+		DIFF_FORMAT_SHORTSTAT))))
+		die(_("-L does not support the requested diff format"));
 
 	if (revs->expand_tabs_in_log < 0)
 		revs->expand_tabs_in_log = revs->expand_tabs_in_log_default;
diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
index e9691066de..af37bd532f 100755
--- a/t/t4211-line-log.sh
+++ b/t/t4211-line-log.sh
@@ -176,24 +176,15 @@ test_expect_success '--name-status shows status and path' '
 	test_grep ! "^@@" actual
 '
 
-test_expect_success '--stat is not yet supported with -L' '
-	test_must_fail git log -L1,24:b.c --stat 2>err &&
-	test_grep "does not yet support" err
-'
-
-test_expect_success '--numstat is not yet supported with -L' '
-	test_must_fail git log -L1,24:b.c --numstat 2>err &&
-	test_grep "does not yet support" err
-'
-
-test_expect_success '--shortstat is not yet supported with -L' '
-	test_must_fail git log -L1,24:b.c --shortstat 2>err &&
-	test_grep "does not yet support" err
-'
-
-test_expect_success '--dirstat is not yet supported with -L' '
+test_expect_success '--dirstat is not supported with -L' '
+	# --dirstat is not supported with -L: its default mode measures
+	# whole-file change, not the tracked lines, and the
+	# --dirstat=lines variant is deferred too, so both forms are
+	# rejected like any other unsupported format.
 	test_must_fail git log -L1,24:b.c --dirstat 2>err &&
-	test_grep "does not yet support" err
+	test_grep "does not support" err &&
+	test_must_fail git log -L1,24:b.c --dirstat=lines 2>err &&
+	test_grep "does not support" err
 '
 
 test_expect_success 'setup for checking fancy rename following' '
@@ -887,9 +878,9 @@ test_expect_success '-L with -S suppresses non-matching commits' '
 	test_cmp expect actual
 '
 
-test_expect_success '--full-diff is not yet supported with -L' '
+test_expect_success '--full-diff is not supported with -L' '
 	test_must_fail git log -L1,24:b.c --full-diff 2>err &&
-	test_grep "does not yet support" err
+	test_grep "does not support" err
 '
 
 test_expect_success '-L --oneline has no extra blank line before diff' '
@@ -900,6 +891,127 @@ test_expect_success '-L --oneline has no extra blank line before diff' '
 	test_grep "^diff --git" line2
 '
 
+test_expect_success 'setup for stat range-scoping tests' '
+	git checkout --orphan stat-scoping &&
+	git reset --hard &&
+	cat >file.c <<-\EOF &&
+	int func1()
+	{
+	    return F1;
+	}
+
+	int func2()
+	{
+	    return F2;
+	}
+	EOF
+	git add file.c &&
+	test_tick &&
+	git commit -m "Add func1() and func2()" &&
+
+	# Modify both functions in a single commit so that
+	# whole-file stats differ from range-scoped stats.
+	sed -e "s/F1/F1 + 1/" -e "s/F2/F2 + 2/" file.c >tmp &&
+	mv tmp file.c &&
+	git commit -a -m "Modify both functions"
+'
+
+test_expect_success '--numstat counts only lines in tracked range' '
+	# "Modify both functions" changes one line in func1 and one in
+	# func2.  Whole-file numstat would show 2 added, 2 deleted.
+	# Range-scoped numstat for func2 should show only 1 and 1.
+	git log -L:func2:file.c --numstat --format=%s -1 >actual &&
+	test_grep "Modify both functions" actual &&
+	test_grep "^1	1	file.c$" actual &&
+	test_grep ! "^diff --git" actual
+'
+
+test_expect_success '--numstat counts only additions for root commit' '
+	# Root commit creates both func1 (4 lines) and func2 (4 lines).
+	# Whole-file numstat would show 9 lines added.  Range-scoped
+	# numstat for func2 should show only 4.
+	git log -L:func2:file.c --numstat --format=%s >actual &&
+	test_grep "Add func1() and func2()" actual &&
+	test_grep "^4	0	file.c$" actual &&
+	test_grep ! "^diff --git" actual
+'
+
+test_expect_success '--stat counts only lines in tracked range' '
+	git log -L:func2:file.c --stat --format=%s -1 >actual &&
+	test_grep "Modify both functions" actual &&
+	test_grep "file.c |" actual &&
+	test_grep "1 insertion" actual &&
+	test_grep "1 deletion" actual &&
+	test_grep ! "^diff --git" actual
+'
+
+test_expect_success '--shortstat counts only lines in tracked range' '
+	# --shortstat prints only the summary line: no per-file "file.c |"
+	# line.  Counts are range-scoped as for --numstat above.
+	git log -L:func2:file.c --shortstat --format=%s -1 >actual &&
+	test_grep "Modify both functions" actual &&
+	test_grep "1 insertion" actual &&
+	test_grep "1 deletion" actual &&
+	test_grep ! "file.c |" actual &&
+	test_grep ! "^diff --git" actual
+'
+
+test_expect_success '--numstat across renames and multiple commits' '
+	# parallel-change carries the tracked function f across an a.c -> b.c
+	# rename and a merge of two parallel histories.  With -M, --numstat
+	# follows the rename and reports range-scoped (not whole-file)
+	# added/removed counts for f per commit; the file column flips from
+	# b.c to a.c at the rename as the walk goes back in time.  Commits
+	# that do not change the range of f emit no row (the merge and the
+	# pure file-move produce nothing), so there are fewer rows than
+	# commits.
+	git checkout parallel-change &&
+	git log -M -L ":f:b.c" --format= --numstat >actual &&
+	cat >expect <<-\EOF &&
+	1	1	b.c
+	1	1	a.c
+	1	1	a.c
+	1	1	a.c
+	1	0	a.c
+	13	0	a.c
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success '-L multiple ranges with --numstat excludes untracked change' '
+	git checkout --orphan multi-range &&
+	git reset --hard &&
+	cat >m.c <<-\EOF &&
+	int func1()
+	{
+	    return F1;
+	}
+
+	int func2()
+	{
+	    return F2;
+	}
+
+	int func3()
+	{
+	    return F3;
+	}
+	EOF
+	git add m.c &&
+	test_tick &&
+	git commit -m "add m.c" &&
+	# Change all three functions but track only func1 and func2.
+	# Whole-file numstat would be 3 3; a 2 2 result proves the
+	# untracked func3 change is excluded and the two ranges just sum.
+	sed -e "s/F1/F1 + 1/" -e "s/F2/F2 + 2/" -e "s/F3/F3 + 3/" m.c >tmp &&
+	mv tmp m.c &&
+	git commit -a -m "Modify all three functions" &&
+	git log -L:func1:m.c -L:func2:m.c --numstat --format=%s -1 >actual &&
+	test_grep "Modify all three functions" actual &&
+	test_grep "^2	2	m.c$" actual &&
+	test_grep ! "^3	3	m.c$" actual
+'
+
 test_expect_success '--summary shows new file on root commit' '
 	git checkout parent-oids &&
 	git log -L:func2:file.c --summary --format= >actual &&
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 6/7] diff: support --check with -L line ranges
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

builtin_checkdiff() runs its own xdiff pass to detect whitespace
errors in newly added lines.  When -L is active, the check should
be scoped to the tracked line ranges rather than the whole file.

Reuse the line_range_filter to wrap checkdiff_consume(), the same
pattern already used for patch output and diffstat.  The filter
forwards only in-range lines for whitespace checking.

checkdiff reports the file line number of each error, which it
normally learns from the hunk header via checkdiff_consume_hunk().
The filter synthesizes its own hunk headers, so give it an optional
hunk callback and route checkdiff_consume_hunk() through it; this
sets the post-image position before the in-range lines are replayed.
Without it the reported line numbers would count from the start of
the range hunk rather than the start of the file.

The trailing blank-at-eof check is a second pass that scans the whole
file via check_blank_at_eof(), so gate its report on the tracked
ranges as well; otherwise a blank line added at end of file is
reported even when it lies outside the range.

Add DIFF_FORMAT_CHECKDIFF to the -L output format allowlist in
setup_revisions() so that -L --check is accepted, and list --check
among the supported formats in the documentation.  Add tests covering
that whitespace errors are reported, scoped to the tracked range, and
labeled with the correct file line number, including when two errors
in one range are separated by a gap that would otherwise split into
multiple xdiff hunks.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 Documentation/line-range-options.adoc |  2 +-
 diff.c                                | 65 ++++++++++++++++++-
 revision.c                            |  2 +-
 t/t4211-line-log.sh                   | 92 +++++++++++++++++++++++++++
 4 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/Documentation/line-range-options.adoc b/Documentation/line-range-options.adoc
index 1a25f55bb1..ec92f43ae4 100644
--- a/Documentation/line-range-options.adoc
+++ b/Documentation/line-range-options.adoc
@@ -10,7 +10,7 @@
 	You can specify this option more than once. Implies `--patch`.
 	Patch output can be suppressed using `--no-patch`.
 	The following non-patch diff formats are supported: `--raw`,
-	`--name-only`, `--name-status`, `--summary`,
+	`--name-only`, `--name-status`, `--summary`, `--check`,
 	`--stat`, `--numstat`, and `--shortstat`.
 	The stat formats show range-scoped counts: only lines within
 	the tracked range are counted.  `--dirstat` is not supported
diff --git a/diff.c b/diff.c
index 026fafeb90..519c513356 100644
--- a/diff.c
+++ b/diff.c
@@ -665,6 +665,12 @@ struct emit_callback {
  */
 struct line_range_filter {
 	xdiff_emit_line_fn orig_line_fn;
+	/*
+	 * Optional; consumers that report file line numbers (e.g.
+	 * checkdiff) need the synthetic hunk header to set their
+	 * post-image position before in-range lines are replayed.
+	 */
+	xdiff_emit_hunk_fn orig_hunk_fn;
 	void *orig_cb_data;
 	const struct range_set *ranges;	/* 0-based [start, end) */
 	unsigned int cur_range;		/* index into the range_set */
@@ -2652,6 +2658,17 @@ static void flush_range_hunk(struct line_range_filter *filter)
 			       filter->hunk.new_begin, new_count,
 			       filter->func, filter->funclen);
 
+	/*
+	 * Inform a line-numbering consumer of the post-image position
+	 * before replaying lines, mirroring the hunk callback xdiff
+	 * would have issued for a non-scoped diff.
+	 */
+	if (filter->orig_hunk_fn)
+		filter->orig_hunk_fn(filter->orig_cb_data,
+				filter->hunk.old_begin, old_count,
+				filter->hunk.new_begin, new_count,
+				filter->func, filter->funclen);
+
 	filter->ret = filter->orig_line_fn(filter->orig_cb_data, hdr.buf, hdr.len);
 	strbuf_release(&hdr);
 
@@ -4330,11 +4347,29 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 	diff_free_filespec_data(two);
 }
 
+/*
+ * Is the 0-based line index within any of the tracked ranges?
+ * (range_set ranges are 0-based, half-open [start, end).)  This is a
+ * one-shot query for a single line and scans; the streaming filter
+ * (line_range_line_fn) uses a forward cursor instead.
+ */
+static int idx_in_ranges(const struct range_set *ranges, long idx)
+{
+	unsigned int i;
+
+	for (i = 0; i < ranges->nr; i++)
+		if (idx >= ranges->ranges[i].start &&
+		    idx < ranges->ranges[i].end)
+			return 1;
+	return 0;
+}
+
 static void builtin_checkdiff(const char *name_a, const char *name_b,
 			      const char *attr_path,
 			      struct diff_filespec *one,
 			      struct diff_filespec *two,
-			      struct diff_options *o)
+			      struct diff_options *o,
+			      const struct range_set *line_ranges)
 {
 	mmfile_t mf1, mf2;
 	struct checkdiff_t data;
@@ -4374,7 +4409,19 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		memset(&xecfg, 0, sizeof(xecfg));
 		xecfg.ctxlen = 1; /* at least one context line */
 		xpp.flags = 0;
-		if (xdi_diff_outf(&mf1, &mf2, checkdiff_consume_hunk,
+
+		if (line_ranges) {
+			struct line_range_filter lr_filter;
+
+			line_range_filter_init(&lr_filter, line_ranges,
+					       checkdiff_consume, &data);
+			lr_filter.orig_hunk_fn = checkdiff_consume_hunk;
+
+			if (line_range_filter_diff(&lr_filter, &mf1, &mf2,
+						   &xpp, &xecfg))
+				die("unable to generate checkdiff for %s",
+				    one->path);
+		} else if (xdi_diff_outf(&mf1, &mf2, checkdiff_consume_hunk,
 				  checkdiff_consume, &data,
 				  &xpp, &xecfg))
 			die("unable to generate checkdiff for %s", one->path);
@@ -4387,6 +4434,17 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 			check_blank_at_eof(&mf1, &mf2, &ecbdata);
 			blank_at_eof = ecbdata.blank_at_eof_in_postimage;
 
+			/*
+			 * check_blank_at_eof() scans the whole file; with -L,
+			 * keep the report only when its line is in a tracked
+			 * range.  The error's location is the first trailing
+			 * blank line (blank_at_eof, 1-based; ranges 0-based), so
+			 * we scope by that line.
+			 */
+			if (blank_at_eof && line_ranges &&
+			    !idx_in_ranges(line_ranges, blank_at_eof - 1))
+				blank_at_eof = 0;
+
 			if (blank_at_eof) {
 				static char *err;
 				if (!err)
@@ -5179,7 +5237,8 @@ static void run_checkdiff(struct diff_filepair *p, struct diff_options *o)
 	diff_fill_oid_info(p->one, o->repo->index);
 	diff_fill_oid_info(p->two, o->repo->index);
 
-	builtin_checkdiff(name, other, attr_path, p->one, p->two, o);
+	builtin_checkdiff(name, other, attr_path, p->one, p->two, o,
+			  p->line_ranges);
 }
 
 void repo_diff_setup(struct repository *r, struct diff_options *options)
diff --git a/revision.c b/revision.c
index 2c76e15778..7abb287451 100644
--- a/revision.c
+++ b/revision.c
@@ -3195,7 +3195,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 		DIFF_FORMAT_RAW | DIFF_FORMAT_NAME |
 		DIFF_FORMAT_NAME_STATUS | DIFF_FORMAT_SUMMARY |
 		DIFF_FORMAT_NUMSTAT | DIFF_FORMAT_DIFFSTAT |
-		DIFF_FORMAT_SHORTSTAT))))
+		DIFF_FORMAT_SHORTSTAT | DIFF_FORMAT_CHECKDIFF))))
 		die(_("-L does not support the requested diff format"));
 
 	if (revs->expand_tabs_in_log < 0)
diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
index af37bd532f..9d351aa05f 100755
--- a/t/t4211-line-log.sh
+++ b/t/t4211-line-log.sh
@@ -1018,4 +1018,96 @@ test_expect_success '--summary shows new file on root commit' '
 	test_grep "create mode 100644 file.c" actual
 '
 
+test_expect_success 'setup for --check test' '
+	git checkout --orphan check-test &&
+	git reset --hard &&
+	cat >check.c <<-\EOF &&
+	void tracked()
+	{
+	    return;
+	}
+
+	void other()
+	{
+	    return;
+	}
+	EOF
+	git add check.c &&
+	test_tick &&
+	git commit -m "add check.c" &&
+	# Introduce trailing whitespace errors in both functions
+	sed "s/return;/return; /" check.c >check.c.tmp &&
+	mv check.c.tmp check.c &&
+	git commit -a -m "introduce trailing whitespace"
+'
+
+test_expect_success '--check scoped to tracked range with correct file line' '
+	# tracked() trailing whitespace is at check.c:3; report it with the
+	# real file line number, not a count from the start of the range
+	# hunk.  other() at check.c:8 is outside the range and is excluded.
+	test_must_fail git log -L:tracked:check.c --check --format= >actual &&
+	test_grep "check.c:3: trailing whitespace" actual &&
+	test_grep ! "check.c:8:" actual
+'
+
+test_expect_success '--check reports each of several tracked ranges' '
+	# Track both functions as separate ranges.  Each range is flushed
+	# as its own hunk, so the second error must report its real file
+	# line (check.c:8), not continue the numbering from the first
+	# range (check.c:3).
+	test_must_fail git log -L:tracked:check.c -L:other:check.c \
+		--check --format= >actual &&
+	test_grep "check.c:3: trailing whitespace" actual &&
+	test_grep "check.c:8: trailing whitespace" actual
+'
+
+test_expect_success '--check line numbers stay correct across a gap in one range' '
+	git checkout --orphan check-gap &&
+	git reset --hard &&
+	cat >gap.c <<-\EOF &&
+	void tracked()
+	{
+	    int a = 1;
+	    int b = 2;
+	    int c = 3;
+	    int d = 4;
+	    int e = 5;
+	    int g = 7;
+	    return;
+	}
+	EOF
+	git add gap.c &&
+	test_tick &&
+	git commit -m "add gap.c" &&
+	# Two trailing-whitespace errors within one tracked range,
+	# separated by clean lines.  ctxlen is inflated to the range span,
+	# so they land in a single xdiff hunk with the gap as context;
+	# both must report their real file line number, with the context
+	# lines between them counted.
+	sed -e "s/int a = 1;/int a = 1; /" -e "s/int g = 7;/int g = 7; /" gap.c >tmp &&
+	mv tmp gap.c &&
+	git commit -a -m "ws errors with a gap" &&
+	test_must_fail git log -L:tracked:gap.c --check --format= >actual &&
+	test_grep "gap.c:3: trailing whitespace" actual &&
+	test_grep "gap.c:8: trailing whitespace" actual
+'
+
+test_expect_success '--check does not report blank-at-eof outside the range' '
+	git checkout --orphan check-eof &&
+	git reset --hard &&
+	printf "void tracked()\n{\n    return;\n}\n\nint tail = 1;\n" >eof.c &&
+	git add eof.c &&
+	test_tick &&
+	git commit -m "add eof.c" &&
+	# One commit introduces a trailing-whitespace error inside tracked()
+	# (line 3) and a blank line at end of file (line 7, outside the
+	# range).  The blank-at-eof check scans the whole file, so it must be
+	# scoped: report the in-range error, not the out-of-range EOF blank.
+	printf "void tracked()\n{\n    return; \n}\n\nint tail = 1;\n\n" >eof.c &&
+	git commit -a -m "ws in range, blank at eof out of range" &&
+	test_must_fail git log -L:tracked:eof.c --check --format= >actual &&
+	test_grep "eof.c:3: trailing whitespace" actual &&
+	test_grep ! "blank line at EOF" actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 7/7] diffcore-pickaxe: scope -G to the -L tracked range
From: Michael Montalbo via GitGitGadget @ 2026-06-18 18:16 UTC (permalink / raw)
  To: git; +Cc: D. Ben Knoble, Michael Montalbo, Michael Montalbo
In-Reply-To: <pull.2152.git.1781806593.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

git log -L scopes its diff output to the tracked range, but pickaxe
(-S, -G) still runs in diffcore over the whole-file change, so -L -G
selects a commit whenever the pattern appears in any added or removed
line of the file, even outside the tracked range.

Teach -G to honor the range.  diff_grep() already runs an xdiff pass
and greps the +/- lines; route that pass through the line-range filter
so only the tracked range's lines are grepped.  Expose the filter as
diff_emit_line_ranges(), a line-range-scoped xdi_diff_outf(), thread
the filepair's line_ranges through the pickaxe callback, and pass it
from pickaxe_match().  Skip scoping under textconv, whose output is not
in the original file's line coordinates.

-G needs only a hit/no-hit answer, so the line-number concerns the
filter handles for patch and check output do not apply here.

-S is left matching the whole file: it counts needle occurrences per
blob rather than grepping the diff, so scoping it needs a different
approach, left to a follow-up.  has_changes() takes the range parameter
but ignores it for now.

Document the resulting -L pickaxe scoping: -G is range-scoped, while -S
still matches the whole file.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
 Documentation/line-range-options.adoc |  5 +-
 diff.c                                | 15 ++++++
 diffcore-pickaxe.c                    | 37 +++++++++++---
 t/t4211-line-log.sh                   | 72 +++++++++++++++++++++++++--
 xdiff-interface.h                     | 13 +++++
 5 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/Documentation/line-range-options.adoc b/Documentation/line-range-options.adoc
index ec92f43ae4..7cfae1499c 100644
--- a/Documentation/line-range-options.adoc
+++ b/Documentation/line-range-options.adoc
@@ -20,6 +20,9 @@
 +
 Patch formatting options such as `--word-diff`, `--color-moved`,
 `--no-prefix`, and whitespace options (`-w`, `-b`) are supported,
-as are pickaxe options (`-S`, `-G`) and `--diff-filter`.
+as are pickaxe options (`-S`, `-G`) and `--diff-filter`.  `-G` is
+scoped to the tracked range; `-S` is still evaluated over the whole
+file, so an `-S` query may select a commit for a change outside the
+range.
 +
 include::line-range-format.adoc[]
diff --git a/diff.c b/diff.c
index 519c513356..a8f346621b 100644
--- a/diff.c
+++ b/diff.c
@@ -2817,6 +2817,21 @@ static int line_range_filter_diff(struct line_range_filter *filter,
 	return ret;
 }
 
+/*
+ * Expose the in-file line-range filter to callers outside diff.c (e.g.
+ * pickaxe -G); see xdiff-interface.h for the contract.
+ */
+int diff_emit_line_ranges(mmfile_t *one, mmfile_t *two,
+			  const struct range_set *ranges,
+			  xdiff_emit_line_fn line_fn, void *cb_data,
+			  xpparam_t *xpp, xdemitconf_t *xecfg)
+{
+	struct line_range_filter filter;
+
+	line_range_filter_init(&filter, ranges, line_fn, cb_data);
+	return line_range_filter_diff(&filter, one, two, xpp, xecfg);
+}
+
 static void pprint_rename(struct strbuf *name, const char *a, const char *b)
 {
 	const char *old_name = a;
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index a52d569911..047b2bf7ac 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -16,7 +16,8 @@
 
 typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two,
 			  struct diff_options *o,
-			  regex_t *regexp, kwset_t kws);
+			  regex_t *regexp, kwset_t kws,
+			  const struct range_set *ranges);
 
 struct diffgrep_cb {
 	regex_t *regexp;
@@ -42,7 +43,8 @@ static int diffgrep_consume(void *priv, char *line, unsigned long len)
 
 static int diff_grep(mmfile_t *one, mmfile_t *two,
 		     struct diff_options *o,
-		     regex_t *regexp, kwset_t kws UNUSED)
+		     regex_t *regexp, kwset_t kws UNUSED,
+		     const struct range_set *ranges)
 {
 	struct diffgrep_cb ecbdata;
 	xpparam_t xpp;
@@ -50,8 +52,11 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
 	int ret;
 
 	/*
-	 * We have both sides; need to run textual diff and see if
-	 * the pattern appears on added/deleted lines.
+	 * We have both sides; need to run textual diff and see if the
+	 * pattern appears on added/deleted lines.  Under -L (ranges set),
+	 * forward only the tracked range's lines so the match is scoped.
+	 * -G needs only a hit/no-hit answer, so the line-number bookkeeping
+	 * the filter does for -L patch and check output is irrelevant here.
 	 */
 	memset(&xpp, 0, sizeof(xpp));
 	memset(&xecfg, 0, sizeof(xecfg));
@@ -65,8 +70,12 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
 	 * An xdiff error might be our "data->hit" from above. See the
 	 * comment for xdiff_emit_line_fn in xdiff-interface.h
 	 */
-	ret = xdi_diff_outf(one, two, NULL, diffgrep_consume,
-			    &ecbdata, &xpp, &xecfg);
+	if (ranges)
+		ret = diff_emit_line_ranges(one, two, ranges, diffgrep_consume,
+					    &ecbdata, &xpp, &xecfg);
+	else
+		ret = xdi_diff_outf(one, two, NULL, diffgrep_consume,
+				    &ecbdata, &xpp, &xecfg);
 	if (ecbdata.hit)
 		return 1;
 	if (ret)
@@ -119,8 +128,13 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, kwset_t kws,
 
 static int has_changes(mmfile_t *one, mmfile_t *two,
 		       struct diff_options *o UNUSED,
-		       regex_t *regexp, kwset_t kws)
+		       regex_t *regexp, kwset_t kws,
+		       const struct range_set *ranges UNUSED)
 {
+	/*
+	 * -S counts needle occurrences in each whole blob.  Scoping this to
+	 * a -L range is left to a follow-up; for now -S ignores the range.
+	 */
 	unsigned int c1 = one ? contains(one, regexp, kws, 0) : 0;
 	unsigned int c2 = two ? contains(two, regexp, kws, c1 + 1) : 0;
 	return c1 != c2;
@@ -132,6 +146,7 @@ static int pickaxe_match(struct diff_filepair *p, struct diff_options *o,
 	struct userdiff_driver *textconv_one = NULL;
 	struct userdiff_driver *textconv_two = NULL;
 	mmfile_t mf1, mf2;
+	const struct range_set *ranges;
 	int ret;
 
 	/* ignore unmerged */
@@ -169,7 +184,13 @@ static int pickaxe_match(struct diff_filepair *p, struct diff_options *o,
 	mf1.size = fill_textconv(o->repo, textconv_one, p->one, &mf1.ptr);
 	mf2.size = fill_textconv(o->repo, textconv_two, p->two, &mf2.ptr);
 
-	ret = fn(&mf1, &mf2, o, regexp, kws);
+	/*
+	 * -L scopes the search to the tracked range, but the range is in
+	 * original-file line coordinates that do not map onto textconv
+	 * output, so search the whole file when textconv is in play.
+	 */
+	ranges = (textconv_one || textconv_two) ? NULL : p->line_ranges;
+	ret = fn(&mf1, &mf2, o, regexp, kws, ranges);
 
 	if (textconv_one)
 		free(mf1.ptr);
diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
index 9d351aa05f..3d35b20178 100755
--- a/t/t4211-line-log.sh
+++ b/t/t4211-line-log.sh
@@ -722,9 +722,9 @@ test_expect_success '-L with -S filters to string-count changes' '
 test_expect_success '-L with -G filters to diff-text matches' '
 	git checkout parent-oids &&
 	git log -L:func2:file.c -G "F2 [+] 2" --format= >actual &&
-	# -G greps the whole-file diff text, not just the tracked range;
-	# combined with -L, this selects commits that both touch func2
-	# and have "F2 + 2" in their diff.
+	# -G greps the diff text, and under -L only the lines in the
+	# tracked range (unlike -S above, which searches the whole file);
+	# this selects commits whose change to func2 contains "F2 + 2".
 	test $(grep -c "^diff --git" actual) = 1 &&
 	grep "F2 + 2" actual
 '
@@ -1110,4 +1110,70 @@ test_expect_success '--check does not report blank-at-eof outside the range' '
 	test_grep ! "blank line at EOF" actual
 '
 
+test_expect_success '-L -G is scoped to the tracked range' '
+	git checkout --orphan grep-scope &&
+	git reset --hard &&
+	cat >gp.c <<-\EOF &&
+	int func1()
+	{
+	    return ALPHA;
+	}
+
+	int func2()
+	{
+	    return BETA;
+	}
+	EOF
+	git add gp.c &&
+	test_tick &&
+	git commit -m "add gp.c" &&
+	sed -e "s/ALPHA/ALPHA2/" -e "s/BETA/BETA2/" gp.c >tmp &&
+	mv tmp gp.c &&
+	git commit -a -m "touch both functions" &&
+	# The commit changes ALPHA (func1) and BETA (func2).  Tracking func2,
+	# -G BETA matches its in-range change; -G ALPHA must not, since ALPHA
+	# changes only outside the tracked range.
+	git log -L:func2:gp.c -G BETA --format=%s >actual &&
+	test_grep "touch both functions" actual &&
+	git log -L:func2:gp.c -G ALPHA --format=%s >actual &&
+	test_grep ! "touch both functions" actual
+'
+
+test_expect_success '-L -G searches the whole file under textconv' '
+	git checkout --orphan grep-textconv &&
+	git reset --hard &&
+	cat >tc.c <<-\EOF &&
+	int func1()
+	{
+	    return F1;
+	}
+
+	int func2()
+	{
+	    return F2;
+	}
+	EOF
+	git add tc.c &&
+	test_tick &&
+	git commit -m "add tc.c" &&
+	# One commit changes func1 and func2; MAGIC lands only in the
+	# func2 change, outside func1.
+	sed -e "s/F1/F1 + 1/" -e "s/return F2/return MAGIC/" tc.c >tmp &&
+	mv tmp tc.c &&
+	git commit -a -m "change both funcs" &&
+	echo "tc.c diff=tc" >.gitattributes &&
+
+	# Without a textconv driver, -G is scoped to func1, so MAGIC (only
+	# in the func2 change) does not select the commit.
+	git log -L:func1:tc.c -G MAGIC --format=%s --no-patch >actual &&
+	test_must_be_empty actual &&
+
+	# A textconv driver makes the range (original-file line numbers)
+	# meaningless against the driver output, so -G falls back to the
+	# whole file and MAGIC now selects the commit.
+	git config diff.tc.textconv cat &&
+	git log -L:func1:tc.c -G MAGIC --format=%s --no-patch >actual &&
+	test_grep "change both funcs" actual
+'
+
 test_done
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 51c88296ed..71e5dffefb 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -46,6 +46,19 @@ int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 		  xdiff_emit_line_fn line_fn,
 		  void *consume_callback_data,
 		  xpparam_t const *xpp, xdemitconf_t const *xecfg);
+
+struct range_set;
+/*
+ * Like xdi_diff_outf(), but forwards only the lines within the given
+ * (post-image) line ranges to line_fn, as "git log -L" scopes its output.
+ * Returns line_fn's latched return value (so a consumer can signal a hit
+ * with a non-zero return), or non-zero on xdiff failure.  Defined in
+ * diff.c (it reuses the line-range filter there).
+ */
+int diff_emit_line_ranges(mmfile_t *mf1, mmfile_t *mf2,
+			  const struct range_set *ranges,
+			  xdiff_emit_line_fn line_fn, void *cb_data,
+			  xpparam_t *xpp, xdemitconf_t *xecfg);
 int read_mmfile(mmfile_t *ptr, const char *filename);
 void read_mmblob(mmfile_t *ptr, struct object_database *odb,
 		 const struct object_id *oid);
-- 
gitgitgadget

^ permalink raw reply related

* Pinned references?
From: Erik Östlund @ 2026-06-18 18:37 UTC (permalink / raw)
  To: git

I'd like to be able to express a reference together with an expected
object ID, for example with strawman syntax like:

refs/tags/v1.2.3?oid=a1b2c3d4

The intended semantics would be that both the reference and object ID
must exist, and Git should fail if the reference does not resolve to the
specified object ID.

Tags are nice because they convey human meaning. Object IDs are nice
because they are immutable. As it is, I often have to choose between the
two, or represent them separately in external tooling.

Is there existing terminology, prior discussion, or an accepted Git-native
approach for this kind of "ref plus expected OID" invariant? I
searched both the Git reference documentation and the mailing list
archives, but couldn't find what I was looking for.

Thanks,
Erik

^ permalink raw reply

* Re: [PATCH] rebase: mention --abort alongside --continue
From: Harald Nordgren @ 2026-06-18 18:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Harald Nordgren via GitGitGadget, git
In-Reply-To: <xmqqo6h9z7e6.fsf@gitster.g>

Just an example when working on a different topic:

I rebased with -x to run all the tests, but ran a test that didn't
exist yet on the first commit and ended up in a bad state. Here it
should clearly show the 'git rebase --abort', so I can start over,
it's not something to fix:

```
$ git rebase --keep-base -x 'make -s' -x 'cd t && prove -j8
t3454-history-squash.sh t3453-history-fixup.sh t3452-history-split.sh
t3451-history-reword.sh t3450-history.sh'
Executing: make -s
GIT_VERSION=2.55.0.rc1.20.g1e31474ef6
Executing: cd t && prove -j8 t3454-history-squash.sh
t3453-history-fixup.sh t3452-history-split.sh t3451-history-reword.sh
t3450-history.sh
Cannot detect source of 't3454-history-squash.sh'! at
/System/Library/Perl/5.34/TAP/Parser/IteratorFactory.pm line 256.
...
warning: execution failed: cd t && prove -j8 t3454-history-squash.sh
t3453-history-fixup.sh t3452-history-split.sh t3451-history-reword.sh
t3450-history.sh
You can fix the problem, and then run

  git rebase --continue

$ git status
interactive rebase in progress; onto 95e20213fa
Last commands done (3 commands done):
   exec make -s
   exec cd t && prove -j8 t3454-history-squash.sh
t3453-history-fixup.sh t3452-history-split.sh t3451-history-reword.sh
t3450-history.sh
  (see more in file .git/rebase-merge/done)
Next commands to do (9 remaining commands):
   pick 498da64046 # history: give commit_tree_ext a message template
   exec make -s
  (use "git rebase --edit-todo" to view and edit)
You are currently editing a commit while rebasing branch
'rebase-fixup-fold' on '95e20213fa'.
  (use "git commit --amend" to amend the current commit)
  (use "git rebase --continue" once you are satisfied with your changes)

nothing to commit, working tree clean
```


Harald

On Wed, Jun 17, 2026 at 2:19 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
> >> It is very true that users who know what they are doing and got into
> >> such conflicts are opted to go into such a situation tnat it is
> >> unlikely that they would appreciate a choice to abort.
> >
> > That's not quite what I was trying to say which was that aborting in the
> > case of conflicts is more likely than in the case of a failed exec.
>
> Ah, I misread the intention.  And I agree with you that "failed
> test" case is very likely to lead to "further changes/amends" and
> not "aborted rebase".
>
> > So if I've understood we'd print a message explaining what's happened
> > and how to continue followed by a hint about aborting. The message would
> > depend on what problem caused the rebase to stop, but the hint would be
> > the same in each case. That sounds fine to me.
>
> Yeah, and "failed test" would not be one of the problem that would
> invite the hint to "abort".  I am OK with that, too.  FWIW, I am OK
> if the "you can abort" hint cannot be configured away, either ;-)
>

^ permalink raw reply

* [PATCH v3 0/4] history: add squash subcommand to fold a range
From: Harald Nordgren via GitGitGadget @ 2026-06-18 19:17 UTC (permalink / raw)
  To: git; +Cc: Harald Nordgren
In-Reply-To: <pull.2337.v2.git.git.1781512625.gitgitgadget@gmail.com>

Adds git history squash <revision-range> to fold a range of commits into its
oldest one, reusing that commit's message and replaying any descendants on
top.

Changes in v3:

 * Moved the feature out of git rebase and into a new git history squash
   <revision-range> subcommand, per the list discussion. git rebase --squash
   is dropped.
 * Takes an arbitrary range (git history squash @~3.., git history squash
   @~5..@~2), folding it into the oldest commit and replaying any
   descendants on top.
 * Implemented as a single tree operation rather than picking each commit,
   so there are no repeated conflict stops (addresses Phillip's efficiency
   point).
 * A merge inside the range is folded fine, only a range with more than one
   base is rejected.
 * --reedit-message seeds the editor with every folded-in message, not just
   the oldest.

Harald Nordgren (4):
  history: extract helper for a commit's parent tree
  history: give commit_tree_ext a message template
  history: add squash subcommand to fold a range
  history: re-edit a squash with every message

 Documentation/git-history.adoc |  21 +++
 builtin/history.c              | 287 ++++++++++++++++++++++++++++-----
 t/meson.build                  |   1 +
 t/t3454-history-squash.sh      | 250 ++++++++++++++++++++++++++++
 4 files changed, 521 insertions(+), 38 deletions(-)
 create mode 100755 t/t3454-history-squash.sh


base-commit: 95e20213faefeb95df29277c58ac1980ab68f701
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2337%2FHaraldNordgren%2Frebase-fixup-fold-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2337/HaraldNordgren/rebase-fixup-fold-v3
Pull-Request: https://github.com/git/git/pull/2337

Range-diff vs v2:

 1:  c55b9cd6f7 < -:  ---------- t3415: remove prepare-commit-msg hook after use
 2:  22d4276ff5 < -:  ---------- rebase: add --squash to fold a range
 -:  ---------- > 1:  1e31474ef6 history: extract helper for a commit's parent tree
 -:  ---------- > 2:  498da64046 history: give commit_tree_ext a message template
 -:  ---------- > 3:  66b2f49fb4 history: add squash subcommand to fold a range
 -:  ---------- > 4:  43e4270614 history: re-edit a squash with every message

-- 
gitgitgadget

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox