From: "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Michael Montalbo <mmontalbo@gmail.com>
Subject: [PATCH v2 4/4] blame: consult diff process for zero-hunk detection
Date: Mon, 25 May 2026 18:29:58 +0000
Message-ID: <39ff53acefb2844b0f4e88825d29b9d57bed66f2.1779733799.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2120.v2.git.1779733799.gitgitgadget@gmail.com>
From: Michael Montalbo <mmontalbo@gmail.com>
When a diff process is configured via diff.<driver>.process,
consult it during blame's per-commit diffing. If the process
returns zero hunks for a commit's changes to a file, treat the
commit as having no changes, causing blame to attribute lines
to earlier commits.
The subprocess is long-running (one startup cost amortized
across the blame traversal), but each commit in the file's
history incurs a round-trip to the tool.
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
---
Documentation/gitattributes.adoc | 3 +++
blame.c | 43 +++++++++++++++++++++++++++++---
t/t4080-diff-process.sh | 32 ++++++++++++++++++++++++
3 files changed, 74 insertions(+), 4 deletions(-)
diff --git a/Documentation/gitattributes.adoc b/Documentation/gitattributes.adoc
index 962896a0b4..c087b4b265 100644
--- a/Documentation/gitattributes.adoc
+++ b/Documentation/gitattributes.adoc
@@ -857,6 +857,9 @@ The tool responds with lines of the form
If the tool returns zero hunks with `status=success`, Git treats
the file as having no changes and produces no diff output.
+`git blame` also consults the diff process and skips commits
+where it reports zero hunks, attributing lines to earlier commits
+instead.
Tools should ignore unknown keys in the per-file request to
remain forward-compatible.
diff --git a/blame.c b/blame.c
index a3c49d132e..8a5f14db7a 100644
--- a/blame.c
+++ b/blame.c
@@ -19,6 +19,8 @@
#include "tag.h"
#include "trace2.h"
#include "blame.h"
+#include "diff-process.h"
+#include "userdiff.h"
#include "alloc.h"
#include "commit-slab.h"
#include "bloom.h"
@@ -315,16 +317,47 @@ static struct commit *fake_working_tree_commit(struct repository *r,
static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
- xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+ xdl_emit_hunk_consume_func_t hunk_func, void *cb_data,
+ int xdl_opts, struct index_state *istate,
+ const char *path)
{
xpparam_t xpp = {0};
xdemitconf_t xecfg = {0};
xdemitcb_t ecb = {NULL};
+ struct xdl_hunk *ext_hunks = NULL;
+ int ret;
xpp.flags = xdl_opts;
xecfg.hunk_func = hunk_func;
ecb.priv = cb_data;
- return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
+
+ if (path && istate) {
+ struct userdiff_driver *drv;
+ drv = userdiff_find_by_path(istate, path);
+ if (drv && drv->process) {
+ size_t nr = 0;
+ if (!diff_process_get_hunks(drv, path,
+ file_a->ptr, file_a->size,
+ file_b->ptr, file_b->size,
+ &ext_hunks, &nr)) {
+ if (!nr) {
+ /*
+ * Zero hunks: the diff process
+ * considers these files equivalent.
+ * Skip so blame looks past this
+ * commit.
+ */
+ return 0;
+ }
+ xpp.external_hunks = ext_hunks;
+ xpp.external_hunks_nr = nr;
+ }
+ }
+ }
+
+ ret = xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
+ free(ext_hunks);
+ return ret;
}
static const char *get_next_line(const char *start, const char *end)
@@ -1961,7 +1994,8 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
&sb->num_read_blob, ignore_diffs);
sb->num_get_patch++;
- if (diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, sb->xdl_opts))
+ if (diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, sb->xdl_opts,
+ sb->revs->diffopt.repo->index, target->path))
die("unable to generate diff (%s -> %s)",
oid_to_hex(&parent->commit->object.oid),
oid_to_hex(&target->commit->object.oid));
@@ -2114,7 +2148,8 @@ static void find_copy_in_blob(struct blame_scoreboard *sb,
* file_p partially may match that image.
*/
memset(split, 0, sizeof(struct blame_entry [3]));
- if (diff_hunks(file_p, &file_o, handle_split_cb, &d, sb->xdl_opts))
+ if (diff_hunks(file_p, &file_o, handle_split_cb, &d, sb->xdl_opts,
+ NULL, NULL))
die("unable to generate diff (%s)",
oid_to_hex(&parent->commit->object.oid));
/* remainder, if any, all match the preimage */
diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
index 083e48e872..50f49a9b02 100755
--- a/t/t4080-diff-process.sh
+++ b/t/t4080-diff-process.sh
@@ -335,4 +335,36 @@ test_expect_success PYTHON 'diff process zero hunks suppresses diff output' '
test_must_be_empty actual
'
+test_expect_success PYTHON 'blame skips commits with zero hunks from diff process' '
+ cat >blame.c <<-\EOF &&
+ int main(void)
+ {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "add blame.c" &&
+
+ cat >blame.c <<-\EOF &&
+ int main(void)
+ {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "reformat blame.c" &&
+ BLAME_COMMIT=$(git rev-parse --short HEAD) &&
+
+ # Without zero-hunk mode, blame attributes the change.
+ git blame blame.c >without &&
+ test_grep "$BLAME_COMMIT" without &&
+
+ # With zero-hunk mode, the process considers the files equivalent
+ # and blame skips the reformat commit.
+ git -c diff.cdiff.process="$BACKEND --mode=zero-hunk" \
+ blame blame.c >with &&
+ test_grep ! "$BLAME_COMMIT" with
+'
+
+
test_done
--
gitgitgadget