From: "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Michael Montalbo <mmontalbo@gmail.com>
Subject: [PATCH v4 6/6] blame: consult diff process for no-hunk detection
Date: Sun, 14 Jun 2026 18:59:23 +0000
Message-ID: <3dadafa1bc237f8003fb96f69cf44350e72cc46e.1781463564.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2120.v4.git.1781463564.gitgitgadget@gmail.com>

From: Michael Montalbo <mmontalbo@gmail.com>

When a diff process is configured via diff.<driver>.process,
consult it during blame's per-commit diffing. If the process
returns no hunks for a commit's changes to a file, treat the
commit as having no changes, causing blame to attribute lines
to earlier commits.

The consultation happens at the pass_blame_to_parent() callsite
using diff_process_fill_hunks(), matching how builtin_diff() in
diff.c uses the same function. A new diff_hunks_xpp() variant
accepts a pre-populated xpparam_t so callers can pass external
hunks, while the existing diff_hunks() retains its original
signature and behavior. The copy-detection callsite is
unaffected since it does not use the diff process.

The subprocess is long-running (one startup cost amortized
across the blame traversal), but each commit in the file's
history incurs a round-trip to the tool.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
blame.c | 40 +++++++++++----
t/t4080-diff-process.sh | 105 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 136 insertions(+), 9 deletions(-)
diff --git a/blame.c b/blame.c
index 977cbb7097..354e6c15f4 100644
--- a/blame.c
+++ b/blame.c
@@ -19,6 +19,8 @@
#include "tag.h"
#include "trace2.h"
#include "blame.h"
+#include "diff-process.h"
+#include "xdiff-interface.h"
#include "alloc.h"
#include "commit-slab.h"
#include "bloom.h"
@@ -314,17 +316,25 @@ static struct commit *fake_working_tree_commit(struct repository *r,
-static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
- xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+static int diff_hunks_xpp(mmfile_t *file_a, mmfile_t *file_b,
+ xdl_emit_hunk_consume_func_t hunk_func,
+ void *cb_data, xpparam_t *xpp)
{
- xpparam_t xpp = {0};
xdemitconf_t xecfg = {0};
xdemitcb_t ecb = {NULL};
- xpp.flags = xdl_opts;
xecfg.hunk_func = hunk_func;
ecb.priv = cb_data;
- return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
+ return xdi_diff(file_a, file_b, xpp, &xecfg, &ecb);
+}
+
+static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
+ xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+{
+ xpparam_t xpp = {0};
+
+ xpp.flags = xdl_opts;
+ return diff_hunks_xpp(file_a, file_b, hunk_func, cb_data, &xpp);
}
static const char *get_next_line(const char *start, const char *end)
@@ -1943,6 +1953,7 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
struct blame_origin *parent, int ignore_diffs)
{
mmfile_t file_p, file_o;
+ xpparam_t xpp = {0};
struct blame_chunk_cb_data d;
struct blame_entry *newdest = NULL;
@@ -1961,10 +1972,21 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
&sb->num_read_blob, ignore_diffs);
sb->num_get_patch++;
- if (diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, sb->xdl_opts))
- die("unable to generate diff (%s -> %s)",
- oid_to_hex(&parent->commit->object.oid),
- oid_to_hex(&target->commit->object.oid));
+ xpp.flags = sb->xdl_opts;
+ /*
+ * If the diff process considers the files equivalent,
+ * skip the diff so blame looks past this commit.
+ */
+ if (diff_process_fill_hunks(&sb->revs->diffopt, target->path,
+ &file_p, &file_o, &xpp)
+ != DIFF_PROCESS_EQUIVALENT) {
+ if (diff_hunks_xpp(&file_p, &file_o, blame_chunk_cb,
+ &d, &xpp))
+ die("unable to generate diff (%s -> %s)",
+ oid_to_hex(&parent->commit->object.oid),
+ oid_to_hex(&target->commit->object.oid));
+ }
+ free(xpp.external_hunks);
/* The rest are the same as the parent */
blame_chunk(&d.dstq, &d.srcq, INT_MAX, d.offset, INT_MAX, 0,
parent, target, 0);
diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
index df4d08e31f..9fc3c01eec 100755
--- a/t/t4080-diff-process.sh
+++ b/t/t4080-diff-process.sh
@@ -445,4 +445,109 @@ test_expect_success 'diff process skipped when tool omits capability' '
test_must_be_empty stderr
'
+#
+# Blame integration.
+#
+
+test_expect_success 'blame uses tool-provided hunks' '
+ cat >blame-hunk.c <<-\EOF &&
+ line1
+ line2
+ line3
+ line4
+ original5
+ original6
+ line7
+ line8
+ line9
+ line10
+ EOF
+ git add blame-hunk.c &&
+ git commit -m "add blame-hunk.c" &&
+ ORIG=$(git rev-parse --short HEAD) &&
+
+ cat >blame-hunk.c <<-\EOF &&
+ line1
+ line2
+ line3
+ line4
+ changed5
+ changed6
+ line7
+ line8
+ changed9
+ changed10
+ EOF
+ git add blame-hunk.c &&
+ git commit -m "change blame-hunk.c" &&
+ CHANGE=$(git rev-parse --short HEAD) &&
+
+ # With fixed-hunk mode the tool reports only lines 5-6 as changed,
+ # so blame should attribute lines 9-10 to the original commit
+ # even though the builtin diff would show them as changed.
+ git -c diff.cdiff.process="$BACKEND --mode=fixed-hunk" \
+ blame blame-hunk.c >actual &&
+ sed -n "9p" actual >line9 &&
+ sed -n "10p" actual >line10 &&
+ test_grep "$ORIG" line9 &&
+ test_grep "$ORIG" line10 &&
+ sed -n "5p" actual >line5 &&
+ sed -n "6p" actual >line6 &&
+ test_grep "$CHANGE" line5 &&
+ test_grep "$CHANGE" line6
+'
+
+test_expect_success 'blame skips commits with no hunks from diff process' '
+ cat >blame.c <<-\EOF &&
+ int main(void) {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "add blame.c" &&
+ ORIG_COMMIT=$(git rev-parse --short HEAD) &&
+
+ cat >blame.c <<-\EOF &&
+ int main(void)
+ {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "reformat blame.c" &&
+ BLAME_COMMIT=$(git rev-parse --short HEAD) &&
+
+ # Without no-hunks mode, blame attributes the change.
+ git blame blame.c >without &&
+ test_grep "$BLAME_COMMIT" without &&
+
+ # With no-hunks mode, the process considers the files equivalent
+ # and blame skips the reformat commit, attributing to the original.
+ git -c diff.cdiff.process="$BACKEND --mode=no-hunks" \
+ blame blame.c >with &&
+ test_grep ! "$BLAME_COMMIT" with &&
+ test_grep "$ORIG_COMMIT" with
+'
+
+test_expect_success 'blame --no-ext-diff bypasses diff process' '
+ test_when_finished "rm -f backend.log" &&
+ git -c diff.cdiff.process="$BACKEND --mode=no-hunks --log=backend.log" \
+ blame --no-ext-diff blame.c >actual &&
+ # Without the process, blame attributes the reformat commit normally.
+ test_grep "$BLAME_COMMIT" actual &&
+ test_path_is_missing backend.log
+'
+
+test_expect_success 'blame --no-ext-diff uses builtin hunks' '
+ # fixed-hunk mode would narrow blame to lines 5-6, but
+ # --no-ext-diff should bypass it and use the builtin diff.
+ test_when_finished "rm -f backend.log" &&
+ git -c diff.cdiff.process="$BACKEND --mode=fixed-hunk --log=backend.log" \
+ blame --no-ext-diff blame-hunk.c >actual &&
+ # Builtin diff attributes lines 9-10 to the change commit.
+ sed -n "9p" actual >line9 &&
+ test_grep "$CHANGE" line9 &&
+ test_path_is_missing backend.log
+'
+
test_done
--
gitgitgadget