From: Junio C Hamano <gitster@pobox.com>
To: "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Michael Montalbo <mmontalbo@gmail.com>
Subject: Re: [PATCH 1/5] xdiff: support external hunks via xpparam_t
Date: Fri, 22 May 2026 14:29:25 +0900 [thread overview]
Message-ID: <xmqq33zkui4q.fsf@gitster.g> (raw)
In-Reply-To: <8c0ea0bc0742651e634db7a3002e8cbe1240acf9.1779415884.git.gitgitgadget@gmail.com> (Michael Montalbo via GitGitGadget's message of "Fri, 22 May 2026 02:11:20 +0000")
"Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +/*
> + * Populate the changed[] arrays from externally supplied hunks,
> + * bypassing the diff algorithm. Validates that hunks are in order,
> + * non-overlapping, and within bounds.
> + *
> + * Returns 0 on success, -1 on validation failure.
> + */
> +static int xdl_populate_hunks_from_external(xdfenv_t *xe,
> + const struct xdl_hunk *hunks,
> + size_t nr_hunks)
> +{
> + size_t i;
> + long j, prev_old_end = 0, prev_new_end = 0;
> + long total_old = 0, total_new = 0;
> +
> + /*
> + * Clear changed[] arrays. xdl_prepare_env() may have dirtied
> + * them via xdl_cleanup_records(). The allocation is nrec + 2
> + * elements; changed points one past the start (see xprepare.c).
> + */
> + memset(xe->xdf1.changed - 1, 0,
> + (xe->xdf1.nrec + 2) * sizeof(bool));
> + memset(xe->xdf2.changed - 1, 0,
> + (xe->xdf2.nrec + 2) * sizeof(bool));
This, especially the starting offset of -1, looks horrible. The
internal layout of xdfenv_t might happen to match the way the above
code expects, which is how xdl_prepare_ctx() may have give you, but
it somehow feels brittle. I guess the assumption that changed[]
does not point at the beginning of the allocated area (e.g., it is a
no-no to free(xe->xdf1.changed) or realloc() it) is so pervasive that
it cannot be helped. Sigh.
> int xdl_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> xdemitconf_t const *xecfg, xdemitcb_t *ecb) {
> xdchange_t *xscr;
> xdfenv_t xe;
> emit_func_t ef = xecfg->hunk_func ? xdl_call_hunk_func : xdl_emit_diff;
>
> - if (xdl_do_diff(mf1, mf2, xpp, &xe) < 0) {
> -
> - return -1;
> + if (xpp->external_hunks) {
> + if (xdl_prepare_env(mf1, mf2, xpp, &xe) < 0)
> + return -1;
> + if (xdl_populate_hunks_from_external(&xe,
> + xpp->external_hunks,
> + xpp->external_hunks_nr) < 0) {
> + /*
> + * Invalid external hunks; fall back to the
> + * builtin diff algorithm. Re-runs
> + * xdl_prepare_env() via xdl_do_diff().
> + */
> + xdl_free_env(&xe);
> + if (xdl_do_diff(mf1, mf2, xpp, &xe) < 0)
> + return -1;
If the external tool keeps sending bogus hunks, silently falling
back to what we would have done if there weren't any external stuff
may be necessary to pleasantly keep using Git, but two and a half
short comments here.
(1) "What we would have done" is exactly the same as what appears
in the corresponding "else" block. Can we make sure that we do
not have to keep updating both copies in the future with some
code rearrangement?
(2) The writer of the external tool may want to see some trace of
warning under certain flags when a failure of the tool forces
the receiving end to fallback.
(3) If the tool throws too many broken replies, perhaps we want to
disable it automatically?
> + }
> + } else {
> + if (xdl_do_diff(mf1, mf2, xpp, &xe) < 0)
> + return -1;
> }
> +
> if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
> xdl_change_compact(&xe.xdf2, &xe.xdf1, xpp->flags) < 0 ||
> xdl_build_script(&xe, &xscr) < 0) {
next prev parent reply other threads:[~2026-05-22 5:29 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-22 2:11 [PATCH 0/5] [RFC] diff: add diff.<driver>.process for external hunk providers Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 1/5] xdiff: support external hunks via xpparam_t Michael Montalbo via GitGitGadget
2026-05-22 5:29 ` Junio C Hamano [this message]
2026-05-22 2:11 ` [PATCH 2/5] userdiff: add diff.<driver>.process config Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 3/5] diff: add long-running diff process via diff.<driver>.process Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 4/5] blame: consult diff process for zero-hunk detection Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 5/5] diff-process-normalize: add built-in whitespace normalizer Michael Montalbo via GitGitGadget
2026-05-22 5:29 ` [PATCH 0/5] [RFC] diff: add diff.<driver>.process for external hunk providers Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq33zkui4q.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=mmontalbo@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox