public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Ezekiel Newren <ezekielnewren@gmail.com>
Subject: Re: [PATCH 00/10] Xdiff cleanup part 3
Date: Wed, 28 Jan 2026 14:40:17 +0000	[thread overview]
Message-ID: <93942e80-207d-4fbd-9965-7c072693b6e1@gmail.com> (raw)
In-Reply-To: <pull.2156.git.git.1767379944.gitgitgadget@gmail.com>

The discussion of this series has got rather spread out so I thought it 
might be helpful to write a summary of my thoughts here.

On 02/01/2026 18:52, Ezekiel Newren via GitGitGadget wrote:
> Patch series summary:
> 
>   * patch 1: Introduce the ivec type

I agree this is a good idea to allow rust and C code to operate on the 
some data structure. The implementation needs a bit of work to avoid 
undefined behavior.

>   * patch 2: Create the function xdl_do_classic_diff()

This is sensible

>   * patches 3-4: generic cleanup

Patch 3 claims to "stop wasting time" but it introduces an extra pass 
over the input records without any explanation of why that is more 
efficient.

Patch 4 removes the common lines from the beginning and end of the input 
files before passing them on to the patience or histogram algorithms. 
That should speed things up (though we should measure by how much). It 
changes the output because excluding the common lines at beginning and 
end of the file changes the longest sequence of unique context lines in 
the lines that remain. If the different output is easier to read then 
that's clearly a good thing but you would need to do some analysis to 
show that.

>   * patches 5-8: convert from dstart/dend (in xdfile_t) to
>     delta_start/delta_end (in xdfenv_t)

dstart a dend in xdfile_t are the index of the first and last line after 
removing any common lines from the beginning and end. The proposal is to 
store the offset from the beginning and end instead in xdfenv_t. Looking 
at where dstart and dend are used I think storing the indices is more 
convenient - if we store offsets we end up calculating the indices from 
them which is a pain and introduces an opportunity to make an error.

>   * patches 9-10: move xdl_cleanup_records(), and related, from xprepare.c to
>     xdiffi.c

Here we finally get to use the ivec data structre introduced in patch 1. 
However it is just replacing a fixed size array and so does not 
demonstrate the more interesting parts of the API which concern growing 
the array as we push more elements to it. I'm also not convinced by the 
claim that this change saves time as it introduces an extra pass over 
the input records.

Overall I struggled to see how the cleanups proposed here linked to the 
introduction of the ivec data structure.

Thanks

Phillip

> Things that will be addressed in future patch series:
> 
>   * Make xdl_cleanup_records() easier to read
>   * convert recs/nrec into an ivec
>   * convert changed to an ivec
>   * remove reference_index/nreff from xdfile_t and turn it into an ivec
>   * splitting minimal_perfect_hash out as its own ivec
>   * improve the performance of the classifier and parsing/hashing lines
> 
> === before this patch series typedef struct s_xdfile { xrecord_t *recs;
> size_t nrec; ptrdiff_t dstart, dend; bool *changed; size_t *reference_index;
> size_t nreff; } xdfile_t;
> 
> typedef struct s_xdfenv { xdfile_t xdf1, xdf2; } xdfenv_t;
> 
> === after this patch series typedef struct s_xdfile { xrecord_t *recs;
> size_t nrec; bool *changed; size_t *reference_index; size_t nreff; }
> xdfile_t;
> 
> typedef struct s_xdfenv { xdfile_t xdf1, xdf2; size_t delta_start,
> delta_end; size_t mph_size; } xdfenv_t;
> 
> Ezekiel Newren (10):
>    ivec: introduce the C side of ivec
>    xdiff: make classic diff explicit by creating xdl_do_classic_diff()
>    xdiff: don't waste time guessing the number of lines
>    xdiff: let patience and histogram benefit from xdl_trim_ends()
>    xdiff: use xdfenv_t in xdl_trim_ends() and xdl_cleanup_records()
>    xdiff: cleanup xdl_trim_ends()
>    xdiff: replace xdfile_t.dstart with xdfenv_t.delta_start
>    xdiff: replace xdfile_t.dend with xdfenv_t.delta_end
>    xdiff: remove dependence on xdlclassifier from xdl_cleanup_records()
>    xdiff: move xdl_cleanup_records() from xprepare.c to xdiffi.c
> 
>   Makefile           |   1 +
>   compat/ivec.c      | 113 ++++++++++++++++++
>   compat/ivec.h      |  52 +++++++++
>   meson.build        |   1 +
>   xdiff/xdiffi.c     | 221 +++++++++++++++++++++++++++++++++---
>   xdiff/xdiffi.h     |   1 +
>   xdiff/xhistogram.c |   7 +-
>   xdiff/xpatience.c  |   7 +-
>   xdiff/xprepare.c   | 277 ++++++++-------------------------------------
>   xdiff/xtypes.h     |   3 +-
>   xdiff/xutils.c     |  20 ----
>   xdiff/xutils.h     |   1 -
>   12 files changed, 432 insertions(+), 272 deletions(-)
>   create mode 100644 compat/ivec.c
>   create mode 100644 compat/ivec.h
> 
> 
> base-commit: 66ce5f8e8872f0183bb137911c52b07f1f242d13
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2156%2Fezekielnewren%2Fxdiff-cleanup-3-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2156/ezekielnewren/xdiff-cleanup-3-v1
> Pull-Request: https://github.com/git/git/pull/2156


  parent reply	other threads:[~2026-01-28 14:40 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-02 18:52 [PATCH 00/10] Xdiff cleanup part 3 Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 01/10] ivec: introduce the C side of ivec Ezekiel Newren via GitGitGadget
2026-01-04  5:32   ` Junio C Hamano
2026-01-17 16:06     ` Ezekiel Newren
2026-01-08 14:34   ` Phillip Wood
2026-01-15 15:55     ` Ezekiel Newren
2026-01-16 10:39       ` Phillip Wood
2026-01-16 20:19         ` René Scharfe
2026-01-17 13:55           ` Phillip Wood
2026-01-17 16:04             ` Ezekiel Newren
2026-01-18 14:58               ` René Scharfe
2026-01-17 16:14         ` Ezekiel Newren
2026-01-17 16:16           ` Ezekiel Newren
2026-01-17 17:40           ` Phillip Wood
2026-01-19  5:59             ` Jeff King
2026-01-19 20:21               ` Ezekiel Newren
2026-01-19 20:40                 ` Jeff King
2026-01-20  2:36                   ` D. Ben Knoble
2026-01-21 21:00                   ` Ezekiel Newren
2026-01-21 21:20                     ` Jeff King
2026-01-21 21:31                       ` Junio C Hamano
2026-01-21 21:45                         ` Ezekiel Newren
2026-01-20 13:46               ` Phillip Wood
2026-01-20 14:06       ` Phillip Wood
2026-01-21 21:39         ` Ezekiel Newren
2026-01-28 11:15           ` Phillip Wood
2026-01-16 20:19   ` René Scharfe
2026-01-17 15:58     ` Ezekiel Newren
2026-01-18 14:55       ` René Scharfe
2026-01-02 18:52 ` [PATCH 02/10] xdiff: make classic diff explicit by creating xdl_do_classic_diff() Ezekiel Newren via GitGitGadget
2026-01-20 15:01   ` Phillip Wood
2026-01-21 21:05     ` Ezekiel Newren
2026-01-02 18:52 ` [PATCH 03/10] xdiff: don't waste time guessing the number of lines Ezekiel Newren via GitGitGadget
2026-01-20 15:02   ` Phillip Wood
2026-01-21 21:12     ` Ezekiel Newren
2026-01-22 10:16       ` Phillip Wood
2026-01-02 18:52 ` [PATCH 04/10] xdiff: let patience and histogram benefit from xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 15:02   ` Phillip Wood
2026-01-21 14:49     ` Phillip Wood
2026-01-02 18:52 ` [PATCH 05/10] xdiff: use xdfenv_t in xdl_trim_ends() and xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 06/10] xdiff: cleanup xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 07/10] xdiff: replace xdfile_t.dstart with xdfenv_t.delta_start Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-28 10:51     ` Phillip Wood
2026-01-02 18:52 ` [PATCH 08/10] xdiff: replace xdfile_t.dend with xdfenv_t.delta_end Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 09/10] xdiff: remove dependence on xdlclassifier from xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-16 20:19   ` René Scharfe
2026-01-17 16:34     ` Ezekiel Newren
2026-01-18 18:23       ` René Scharfe
2026-01-21 15:01   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 10/10] xdiff: move xdl_cleanup_records() from xprepare.c to xdiffi.c Ezekiel Newren via GitGitGadget
2026-01-21 15:01   ` Phillip Wood
2026-01-28 10:56     ` Phillip Wood
2026-01-04  2:44 ` [PATCH 00/10] Xdiff cleanup part 3 Junio C Hamano
2026-01-04  6:01 ` Yee Cheng Chin
2026-01-28 14:40 ` Phillip Wood [this message]
2026-03-06 23:03 ` Junio C Hamano
2026-03-09 19:06   ` Ezekiel Newren
2026-03-09 23:31     ` Junio C Hamano
2026-03-25 21:11 ` [PATCH v2 0/5] " Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 1/5] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 2/5] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 3/5] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 4/5] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 5/5] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-25 21:58     ` Junio C Hamano
2026-03-26  6:26   ` [PATCH v2 0/5] Xdiff cleanup part 3 SZEDER Gábor
2026-03-27 19:23   ` [PATCH v3 0/6] " Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-27 21:09       ` Junio C Hamano
2026-03-27 23:01         ` Junio C Hamano
2026-03-27 19:23     ` [PATCH v3 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 6/6] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=93942e80-207d-4fbd-9965-7c072693b6e1@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=ezekielnewren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox