* [PATCH 00/17] Use rust types in xdiff.
@ 2025-09-07 19:45 Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 01/17] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (18 more replies)
0 siblings, 19 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren
This patch series involves ZERO Rust code and toolchains, which avoids the
debate about Rust's portability and timeline. Instead, it shows how Git can
immediately benefit from Rust's design choices without using it at all. The
rationale for using Rust types on both the C and Rust sides is given in the
commit that creates compat/rust_types.h.
This patch series has 2 parts:
* Patches 1-9: Clean up xdiff. This can be merged without part 2.
* Patches 10-17: Define Rust types in compat/rust_types.h and then start
refactoring xdiff with Rust types. This depends on part 1.
The cleanup in this patch series makes the structs xrecord_t and xdfile_t
Rust FFI friendly. My opinion is that part 1 should be merged soon, while
part 2 can be discussed further.
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
char *rchg;
long *rindex;
long nreff;
} xdfile_t;
After using Rust types:
typedef struct s_xrecord {
u8 const *ptr;
usize size;
u64 line_hash;
usize minimal_perfect_hash;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
usize nrec;
i32 dstart, dend;
u8 *rchg;
usize *rindex;
usize nreff;
} xdfile_t;
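For readers who have not seen Rust-style names in C before, here is a minimal sketch of how they could be mapped onto fixed-width C types. This is not the contents of compat/rust_types.h (that header is added later in the series); the typedef set, the choice of size_t for usize, and the helper total_size() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; NOT the contents of compat/rust_types.h. */
typedef uint8_t  u8;
typedef int32_t  i32;
typedef uint64_t u64;
typedef size_t   usize; /* assumption: usize is as wide as size_t */

/* Same field layout as the final xrecord_t shown in the cover letter. */
typedef struct s_xrecord {
	u8 const *ptr;
	usize size;
	u64 line_hash;
	usize minimal_perfect_hash;
} xrecord_t;

/*
 * Hypothetical helper: sum the byte lengths of n records.
 * Unlike "unsigned long", u64 is 64 bits wide on every platform,
 * including 64-bit Windows where long is 32 bits.
 */
static usize total_size(xrecord_t const *recs, usize n)
{
	usize total = 0;

	assert(recs != NULL || n == 0);
	for (usize i = 0; i < n; i++)
		total += recs[i].size;
	return total;
}
```

The point of the fixed-width names is that the C declaration and a Rust #[repr(C)] struct can be compared field by field without consulting platform ABI tables.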
Ezekiel Newren (17):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete xdl_get_rec() in xemit
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t, view with --color-words
xdiff: treat xdfile_t.rchg like an enum
compat/rust_types.h: define rust primitive types
xdiff: include compat/rust_types.h
xdiff: make xrecord_t.ptr a u8 instead of char
xdiff: make xrecord_t.size a usize instead of long
xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
xdiff: make xdfile_t.nrec a usize instead of long
xdiff: make xdfile_t.nreff a usize instead of long
xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t
compat/rust_types.h | 28 +++++
xdiff/xdiff.h | 4 +
xdiff/xdiffi.c | 118 ++++++++----------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 52 +++-----
xdiff/xhistogram.c | 14 +--
xdiff/xinclude.h | 1 +
xdiff/xmacros.h | 2 +-
xdiff/xmerge.c | 66 +++++-----
xdiff/xpatience.c | 28 ++---
xdiff/xprepare.c | 289 +++++++++++++++++---------------------------
xdiff/xtypes.h | 26 ++--
xdiff/xutils.c | 12 +-
13 files changed, 293 insertions(+), 358 deletions(-)
create mode 100644 compat/rust_types.h
base-commit: 16bd9f20a403117f2e0d9bcda6c6e621d3763e77
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v1
Pull-Request: https://github.com/git/git/pull/2048
--
gitgitgadget
^ permalink raw reply [flat|nested] 158+ messages in thread
* [PATCH 01/17] xdiff: delete static forward declarations in xprepare
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:55 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (17 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
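A small illustration of the reordering described in the commit message, using hypothetical helper names rather than the xprepare.c functions: when each static helper is defined before its first caller, no separate prototype is needed.

```c
#include <assert.h>

/* Hypothetical helpers; these are not functions from xdiff. */

/*
 * Defined before its caller, so no
 * "static long count_newlines(char const *, long);" prototype is required.
 */
static long count_newlines(char const *buf, long size)
{
	long n = 0;

	assert(size >= 0);
	for (long i = 0; i < size; i++)
		if (buf[i] == '\n')
			n++;
	return n;
}

/* The public entry point comes last, after everything it calls. */
long estimate_records(char const *buf, long size)
{
	return count_newlines(buf, size) + 1;
}
```

Ordering the file this way means a signature change touches one place instead of two.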
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..a45c5ee208 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 01/17] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
` (16 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdl_prepare_ctx() uses local variables and assigns them to the
corresponding xdfile_t fields if there are no errors. Delete them and
use the fields of xdfile_t directly.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
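The pattern described in the commit message can be sketched in isolation as follows. The struct and function names are hypothetical stand-ins for xdfile_t, xdl_prepare_ctx() and xdl_free_ctx(), and plain malloc/calloc/free replace the XDL_* allocation macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-array context; stands in for xdfile_t. */
struct ctx {
	long *index;
	char *flags;
	long n;
};

/* Safe on a partially initialized ctx because free(NULL) is a no-op. */
static void ctx_free(struct ctx *c)
{
	assert(c != NULL);
	free(c->index);
	free(c->flags);
	c->index = NULL;
	c->flags = NULL;
}

/* Writes straight into the struct; every failure path shares ctx_free(). */
static int ctx_prepare(struct ctx *c, long n)
{
	c->index = NULL;
	c->flags = NULL;
	c->n = 0;

	if (n < 0)
		goto abort;
	if (!(c->index = malloc((size_t)(n + 1) * sizeof(*c->index))))
		goto abort;
	if (!(c->flags = calloc((size_t)n + 2, sizeof(*c->flags))))
		goto abort;
	c->n = n;
	return 0;

abort:
	ctx_free(c);
	return -1;
}
```

The sketch keeps every pointer unshifted, so the shared cleanup is valid from any failure point. A cleanup that frees an offset pointer, such as rchg - 1, is only valid once that offset has been applied, which is worth checking on the abort path.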
xdiff/xprepare.c | 79 +++++++++++++++++++-----------------------------
1 file changed, 31 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208..2ed1785b09 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,82 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 01/17] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
` (15 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, and xdfile_t.rhash are initialized but
never read by any consumer. Remove them, along with the code that fills
them in.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 2ed1785b09..91b0ed54e0 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,10 +125,6 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
@@ -137,7 +132,6 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -156,7 +150,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -164,10 +157,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -181,7 +170,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (2 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 05/17] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
` (14 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
This function aliases the fields of xrecord_t, which makes it harder
to track the usages of those fields. Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xemit.c | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..2161ac3cd0 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,23 +22,13 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
-
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
- }
return 0;
}
@@ -120,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -160,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH 05/17] xdiff: delete struct diffdata_t
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (3 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
` (13 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (4 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 05/17] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:57 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
` (12 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
so the ha array only caches values that can be looked up through rindex.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
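The invariant from the commit message can be checked on a toy example. The types below are simplified stand-ins for xrecord_t and xdfile_t (using the pointer-array layout that recs still has at this point in the series); only get_hash() mirrors a helper that the patch itself adds.

```c
#include <assert.h>

/* Simplified stand-ins; the real structs carry more fields. */
typedef struct {
	unsigned long ha;
} rec_t;

typedef struct {
	rec_t **recs;  /* all records of the file */
	long *rindex;  /* positions of the records kept for the diff */
	long nreff;    /* number of valid entries in rindex */
} file_t;

/* Same lookup as the get_hash() helper introduced by this patch. */
static unsigned long get_hash(file_t const *f, long i)
{
	assert(0 <= i && i < f->nreff);
	return f->recs[f->rindex[i]]->ha;
}
```

With three records hashing to 10, 20, 30 and rindex = {0, 2}, the removed cache would have held {10, 30}, which is exactly what get_hash(f, 0) and get_hash(f, 1) return. The trade-off is one extra indirection per comparison in exchange for one fewer array to allocate and keep in sync.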
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 91b0ed54e0..59730989a3 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,7 +134,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -147,7 +146,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -182,8 +180,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -301,9 +297,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -313,9 +307,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (5 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:57 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words Ezekiel Newren via GitGitGadget
` (11 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 59730989a3..6f1d4b4725 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec.ptr = rec->ptr;
+ rcrec->rec.size = rec->size;
+ rcrec->rec.ha = rec->ha;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (6 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:58 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
` (10 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The chastore_t type is very unfriendly to Rust FFI. It is also
redundant: 'recs' is already a growable array that is extended every
time an xrecord_t is added, so the records can be stored in it directly
instead of as pointers into a chastore.
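
As a rough illustration of the layout change (not the patch itself), the
sketch below stores records by value in one contiguous array. The names
record, record_vec, record_vec_push and record_vec_hash are hypothetical
stand-ins, and the realloc-based doubling is an assumption made only for
this example; the patch itself keeps using XDL_ALLOC_GROW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for xrecord_t after the cleanup (hypothetical). */
struct record {
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Simplified stand-in for the recs/nrec part of xdfile_t (hypothetical). */
struct record_vec {
	struct record *recs; /* one contiguous allocation, records stored by value */
	size_t nrec;
	size_t alloc;
};

/*
 * Append a record by value, growing the array when it is full.
 * Returns the index of the new record, or -1 on allocation failure.
 * Callers should hold on to the index rather than a pointer, because
 * realloc() may move the whole array.
 */
static long record_vec_push(struct record_vec *v, char const *ptr, long size,
			    unsigned long ha)
{
	if (v->nrec == v->alloc) {
		size_t n = v->alloc ? v->alloc * 2 : 16;
		struct record *p = realloc(v->recs, n * sizeof(*p));
		if (!p)
			return -1;
		v->recs = p;
		v->alloc = n;
	}
	v->recs[v->nrec].ptr = ptr;
	v->recs[v->nrec].size = size;
	v->recs[v->nrec].ha = ha;
	return (long)v->nrec++;
}

/* Element access is now recs[i].ha rather than recs[i]->ha. */
static unsigned long record_vec_hash(struct record_vec const *v, size_t i)
{
	assert(i < v->nrec);
	return v->recs[i].ha;
}
```

With this layout a Rust caller could, in principle, view the records as a
single slice described by a pointer and a length, which is harder to do
with an array of pointers into separately allocated chunks.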
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 2161ac3cd0..b2f1f30cd3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -25,7 +25,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 6f1d4b4725..92f9845003 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -131,7 +131,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -146,8 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -158,12 +155,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -263,7 +258,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -276,7 +271,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -284,7 +279,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -320,13 +315,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -334,7 +329,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (7 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-09 8:58 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 10/17] compat/rust_types.h: define rust primitive types Ezekiel Newren via GitGitGadget
` (9 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Define the macros NO (0), YES (1), and MAYBE (2) as enum-like values
for rchg to make the code easier to follow. Perhaps 'rchg' should be
renamed to 'changed'?
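
To show how the named values read in practice, here is a small sketch that
mirrors the ternary expression used in xdl_cleanup_records(). The function
name classify_line is hypothetical and exists only for this example; the
macro values are the ones this patch defines.

```c
#include <assert.h>

#define NO 0
#define YES 1
#define MAYBE 2

/*
 * Classify a line by its match count in the other file (nm):
 *   no matches                                   -> NO
 *   at least mlim matches, minimal diff not asked -> MAYBE
 *   anything else                                -> YES
 * This is the same decision as
 *   (nm == 0) ? NO : (nm >= mlim && !need_min) ? MAYBE : YES
 */
static char classify_line(long nm, long mlim, int need_min)
{
	assert(nm >= 0);
	if (nm == 0)
		return NO;
	if (nm >= mlim && !need_min)
		return MAYBE;
	return YES;
}
```

Comparisons such as `== YES` or `== MAYBE` then state which of the three
states a branch is handling, instead of relying on bare 0, 1 and 2.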
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiff.h | 4 ++++
xdiff/xdiffi.c | 29 ++++++++++++++---------------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 24 ++++++++++++------------
5 files changed, 38 insertions(+), 35 deletions(-)
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..7092879829 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -27,6 +27,10 @@
extern "C" {
#endif /* #ifdef __cplusplus */
+#define NO 0
+#define YES 1
+#define MAYBE 2
+
/* xpparm_t.flags */
#define XDF_NEED_MINIMAL (1 << 0)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index a66125d44a..44fd27823a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = YES;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = YES;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->rchg[g->end] == YES; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->rchg[g->start - 1] == YES; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->rchg[g->start++] = NO;
+ xdf->rchg[g->end++] = YES;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->rchg[--g->start] = YES;
+ xdf->rchg[--g->end] = NO;
- while (xdf->rchg[g->start - 1])
+ while (xdf->rchg[g->start - 1] == YES)
g->start--;
return 0;
@@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..c2e85b8ab9 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..20cda5e258 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 92f9845003..36437f91bb 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -215,9 +215,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* current line (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (dis[i - r] == NO)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (dis[i - r] == MAYBE)
rpdis0++;
else
break;
@@ -231,9 +231,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (dis[i + r] == NO)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (dis[i + r] == MAYBE)
rpdis1++;
else
break;
@@ -273,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -281,26 +281,26 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (dis1[i] == YES ||
+ (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->rchg[i] = YES;
}
xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (dis2[i] == YES ||
+ (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->rchg[i] = YES;
}
xdf2->nreff = nreff;
--
gitgitgadget
* [PATCH 10/17] compat/rust_types.h: define rust primitive types
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-08 15:08 ` Junio C Hamano
2025-09-07 19:45 ` [PATCH 11/17] xdiff: include compat/rust_types.h Ezekiel Newren via GitGitGadget
` (8 subsequent siblings)
18 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Why Rust primitive types should be used in C:
* Consistency across languages: Sharing the same type names makes it
easier to translate and refactor code across the language boundary,
and to search history.
* Clarity and ergonomics: The types f32 and f64 are clearer than
float and double. Types such as u64 and isize are easier to write
than uint64_t and ptrdiff_t.
* Explicit intent: Inclusion of compat/rust_types.h signals to
readers that the code is designed, or being cleaned up, for Rust
interop.
* Character types: Rust's char is a 32-bit Unicode scalar value.
In contrast, C's char is typically 8 bits wide and its signedness is
implementation-defined. The u8 type should be used instead of C's
char when referring to bytes in memory.
* Keep the FFI boundary precise: When Rust calls into C, the C
interface should use Rust types exclusively in both functions and
structs. If a broad refactor would cause too much churn, C stub
functions may be used as an interim step.
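
The point about character types can be sketched in a few lines of C. The
helper names first_byte_as_u8 and first_byte_as_char are hypothetical and
only serve this illustration; the u8 typedef matches the one in
compat/rust_types.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Read the first byte of buf as a u8: the result is always 0..255. */
static int first_byte_as_u8(const void *buf)
{
	assert(buf != NULL);
	return *(const u8 *)buf;
}

/*
 * Read the same byte through plain char: whether a byte >= 0x80 comes
 * out positive or negative depends on whether char is signed on the
 * target platform.
 */
static int first_byte_as_char(const void *buf)
{
	assert(buf != NULL);
	return *(const char *)buf;
}
```

For a buffer holding the single byte 0xE9, first_byte_as_u8() yields 233
everywhere, while first_byte_as_char() yields 233 where char is unsigned
and a negative value where it is signed.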
Reasons to avoid c_* types (e.g. c_char, c_long) in Rust:
* Rust remains precise: Bringing c_* into Rust reintroduces the very
ambiguity Rust was designed to eliminate. Using only Rust
primitives keeps our code portable and predictable.
* One clear contract: Rust should define the interface with precise
types. C adapts through compat/rust_types.h, ensuring the boundary
is consistent and easy to audit.
* Future-proof interop: Other runtimes (Python, Go, Java, Wasm, etc.)
map cleanly onto Rust's primitives, but not onto c_*. Sticking with
Rust types makes bindings straightforward and avoids locking Git's
ABI to C's historical quirks.
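
One way to make the "one clear contract" idea checkable on the C side is
to assert the widths at compile time. The sketch below is an assumption
made for illustration, not part of this series: it uses C11
_Static_assert and the standard headers rather than compat/posix.h, and
the helper record_total_size is hypothetical. The struct is the
xrecord_t layout shown in the cover letter. Note that the C standard
does not promise that size_t is pointer-sized, so that check documents
an assumption about the supported platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint64_t u64;
typedef int32_t  i32;
typedef size_t   usize;

/* Compile-time checks that the C types have the widths Rust guarantees. */
_Static_assert(sizeof(u8) == 1, "u8 must be 1 byte");
_Static_assert(sizeof(i32) == 4, "i32 must be 4 bytes");
_Static_assert(sizeof(u64) == 8, "u64 must be 8 bytes");
/* Assumption: holds on mainstream platforms, not guaranteed by ISO C. */
_Static_assert(sizeof(usize) == sizeof(void *), "usize must be pointer-sized");

/* xrecord_t as shown in the cover letter after the switch to Rust types. */
typedef struct s_xrecord {
	u8 const *ptr;
	usize size;
	u64 line_hash;
	usize minimal_perfect_hash;
} xrecord_t;

/* Hypothetical helper: total number of bytes covered by nrec records. */
static usize record_total_size(xrecord_t const *recs, usize nrec)
{
	usize total = 0;
	for (usize i = 0; i < nrec; i++)
		total += recs[i].size;
	return total;
}
```

If a platform ever violates one of these assumptions, the build fails at
the assertion instead of producing a silently mismatched ABI.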
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
compat/rust_types.h | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
create mode 100644 compat/rust_types.h
diff --git a/compat/rust_types.h b/compat/rust_types.h
new file mode 100644
index 0000000000..af93d0a116
--- /dev/null
+++ b/compat/rust_types.h
@@ -0,0 +1,28 @@
+#ifndef COMPAT_RUST_TYPES_H
+#define COMPAT_RUST_TYPES_H
+
+#include <compat/posix.h>
+
+/*
+ * A typedef for bool is not needed because C bool and Rust bool are
+ * the same if #include <stdbool.h> is used.
+ */
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+typedef int8_t i8;
+typedef int16_t i16;
+typedef int32_t i32;
+typedef int64_t i64;
+
+typedef float f32;
+typedef double f64;
+
+typedef size_t usize;
+typedef ptrdiff_t isize;
+typedef uint32_t rust_char;
+
+#endif /* COMPAT_RUST_TYPES_H */
--
gitgitgadget
* [PATCH 11/17] xdiff: include compat/rust_types.h
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 10/17] compat/rust_types.h: define rust primitive types Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 12/17] xdiff: make xrecord_t.ptr a u8 instead of char Ezekiel Newren via GitGitGadget
` (7 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xinclude.h | 1 +
xdiff/xmacros.h | 2 +-
xdiff/xtypes.h | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/xdiff/xinclude.h b/xdiff/xinclude.h
index a4285ac0eb..6733d752a4 100644
--- a/xdiff/xinclude.h
+++ b/xdiff/xinclude.h
@@ -24,6 +24,7 @@
#define XINCLUDE_H
#include "git-compat-util.h"
+#include <compat/rust_types.h>
#include "xmacros.h"
#include "xdiff.h"
#include "xtypes.h"
diff --git a/xdiff/xmacros.h b/xdiff/xmacros.h
index 8487bb396f..ef663af3b8 100644
--- a/xdiff/xmacros.h
+++ b/xdiff/xmacros.h
@@ -23,7 +23,7 @@
#if !defined(XMACROS_H)
#define XMACROS_H
-
+#include <compat/rust_types.h>
#define XDL_MIN(a, b) ((a) < (b) ? (a): (b))
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 3d26cbf1ec..80afb98bf4 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -23,7 +23,7 @@
#if !defined(XTYPES_H)
#define XTYPES_H
-
+#include <compat/rust_types.h>
typedef struct s_chanode {
struct s_chanode *next;
--
gitgitgadget
* [PATCH 12/17] xdiff: make xrecord_t.ptr a u8 instead of char
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (10 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 11/17] xdiff: include compat/rust_types.h Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 13/17] xdiff: make xrecord_t.size a usize instead of long Ezekiel Newren via GitGitGadget
` (6 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 8 ++++----
xdiff/xemit.c | 6 +++---
xdiff/xmerge.c | 14 +++++++-------
xdiff/xpatience.c | 2 +-
xdiff/xprepare.c | 6 +++---
xdiff/xtypes.h | 2 +-
xdiff/xutils.c | 4 ++--
7 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 44fd27823a..370813d2cf 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -407,7 +407,7 @@ static int get_indent(xrecord_t *rec)
int ret = 0;
for (i = 0; i < rec->size; i++) {
- char c = rec->ptr[i];
+ u8 c = rec->ptr[i];
if (!XDL_ISSPACE(c))
return ret;
@@ -992,11 +992,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
+ ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
+ ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1007,7 +1007,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
size_t i;
for (i = 0; i < xpp->ignore_regex_nr; i++)
- if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+ if (!regexec_buf(xpp->ignore_regex[i], (const char *)rec->ptr, rec->size, 1,
&regmatch, 0))
return 1;
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index b2f1f30cd3..ead930088a 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
{
xrecord_t *rec = &xdf->recs[ri];
- if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
+ if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
return 0;
@@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
- return def_ff(rec->ptr, rec->size, buf, sz);
- return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
+ return def_ff((const char *)rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index fd600cbb5d..75cb3e76a2 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
- rec2[i].ptr, rec2[i].size, flags);
+ int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
+ (const char *)rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
{
- return xdl_recmatch(rec1->ptr, rec1->size,
- rec2->ptr, rec2->size, flags);
+ return xdl_recmatch((const char *)rec1->ptr, rec1->size,
+ (const char *)rec2->ptr, rec2->size, flags);
}
/*
@@ -382,10 +382,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* we have a very simple mmfile structure.
*/
t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ t1.size = (char *)xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ t2.size = (char *)xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
@@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
xe->xdf2.recs[i].size))
return 1;
return 0;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 20cda5e258..9181815fd4 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
+ map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 36437f91bb..f5c04afe50 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -96,8 +96,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
if (rcrec->rec.ha == rec->ha &&
- xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
- rec->ptr, rec->size, cf->flags))
+ xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
+ (const char *)rec->ptr, rec->size, cf->flags))
break;
if (!rcrec) {
@@ -156,7 +156,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
crec = &xdf->recs[xdf->nrec++];
- crec->ptr = prev;
+ crec->ptr = (u8 const *)prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
if (xdl_classify_record(pass, cf, crec) < 0)
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 80afb98bf4..a1a9a61840 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,7 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- char const *ptr;
+ u8 const *ptr;
long size;
unsigned long ha;
} xrecord_t;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 332982b509..530addf1c6 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -417,10 +417,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
xdfenv_t env;
subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ subfile1.size = (char *)diff_env->xdf1.recs[line1 + count1 - 2].ptr +
diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ subfile2.size = (char *)diff_env->xdf2.recs[line2 + count2 - 2].ptr +
diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH 13/17] xdiff: make xrecord_t.size a usize instead of long
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (11 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 12/17] xdiff: make xrecord_t.ptr a u8 instead of char Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 14/17] xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash Ezekiel Newren via GitGitGadget
` (5 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 7 +++----
xdiff/xemit.c | 8 ++++----
xdiff/xmerge.c | 16 ++++++++--------
xdiff/xprepare.c | 6 +++---
xdiff/xtypes.h | 2 +-
5 files changed, 19 insertions(+), 20 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 370813d2cf..45cc9ce116 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -403,10 +403,9 @@ static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
*/
static int get_indent(xrecord_t *rec)
{
- long i;
int ret = 0;
- for (i = 0; i < rec->size; i++) {
+ for (usize i = 0; i < rec->size; i++) {
u8 c = rec->ptr[i];
if (!XDL_ISSPACE(c))
@@ -992,11 +991,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
+ ignore = xdl_blankline((const char *)rec[i].ptr, (long)rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
+ ignore = xdl_blankline((const char *)rec[i].ptr, (long)rec[i].size, flags);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index ead930088a..ad3e859c57 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
{
xrecord_t *rec = &xdf->recs[ri];
- if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
+ if (xdl_emit_diffrec((char const *)rec->ptr, (long)rec->size, pre, strlen(pre), ecb) < 0)
return -1;
return 0;
@@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
- return def_ff((const char *)rec->ptr, rec->size, buf, sz);
- return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
+ return def_ff((const char *)rec->ptr, (long)rec->size, buf, sz);
+ return xecfg->find_func((const char *)rec->ptr, (long)rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -151,7 +151,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
xrecord_t *rec = &xdf->recs[ri];
- long i = 0;
+ usize i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 75cb3e76a2..c1a003326a 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
- (const char *)rec2[i].ptr, rec2[i].size, flags);
+ int result = xdl_recmatch((const char *)rec1[i].ptr, (long)rec1[i].size,
+ (const char *)rec2[i].ptr, (long)rec2[i].size, flags);
if (!result)
return -1;
}
@@ -119,11 +119,11 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++].size)
+ for (i = 0; i < count; size += (int)recs[i++].size)
if (dest)
memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1].size;
+ i = (int)recs[count - 1].size;
if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
@@ -156,7 +156,7 @@ static int xdl_orig_copy(xdfenv_t *xe, int i, int count, int needs_cr, int add_n
*/
static int is_eol_crlf(xdfile_t *file, int i)
{
- long size;
+ usize size;
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
{
- return xdl_recmatch((const char *)rec1->ptr, rec1->size,
- (const char *)rec2->ptr, rec2->size, flags);
+ return xdl_recmatch((const char *)rec1->ptr, (long)rec1->size,
+ (const char *)rec2->ptr, (long)rec2->size, flags);
}
/*
@@ -441,7 +441,7 @@ static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
- xe->xdf2.recs[i].size))
+ (long)xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index f5c04afe50..d62a329d0c 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -96,8 +96,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
if (rcrec->rec.ha == rec->ha &&
- xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
- (const char *)rec->ptr, rec->size, cf->flags))
+ xdl_recmatch((const char *)rcrec->rec.ptr, (long)rcrec->rec.size,
+ (const char *)rec->ptr, (long)rec->size, cf->flags))
break;
if (!rcrec) {
@@ -157,7 +157,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
goto abort;
crec = &xdf->recs[xdf->nrec++];
crec->ptr = (u8 const *)prev;
- crec->size = (long) (cur - prev);
+ crec->size = cur - prev;
crec->ha = hav;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index a1a9a61840..6f83a9f4ff 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -40,7 +40,7 @@ typedef struct s_chastore {
typedef struct s_xrecord {
u8 const *ptr;
- long size;
+ usize size;
unsigned long ha;
} xrecord_t;
--
gitgitgadget
* [PATCH 14/17] xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (12 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 13/17] xdiff: make xrecord_t.size a usize instead of long Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 15/17] xdiff: make xdfile_t.nrec a usize instead of long Ezekiel Newren via GitGitGadget
` (4 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 4 ++--
xdiff/xhistogram.c | 4 ++--
xdiff/xpatience.c | 10 +++++-----
xdiff/xprepare.c | 18 +++++++++---------
xdiff/xtypes.h | 3 ++-
5 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 45cc9ce116..c8d351705c 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]].ha;
+ return xdf->recs[xdf->rindex[index]].minimal_perfect_hash;
}
#define XDL_MAX_COST_MIN 256
@@ -385,7 +385,7 @@ static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1,
static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
{
- return (rec1->ha == rec2->ha);
+ return (rec1->minimal_perfect_hash == rec2->minimal_perfect_hash);
}
/*
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index c2e85b8ab9..4c827b0cba 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -90,7 +90,7 @@ struct region {
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
- return r1->ha == r2->ha;
+ return r1->minimal_perfect_hash == r2->minimal_perfect_hash;
}
@@ -98,7 +98,7 @@ static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
(cmp_recs(REC(i->env, s1, l1), REC(i->env, s2, l2)))
#define TABLE_HASH(index, side, line) \
- XDL_HASHLONG((REC(index->env, side, line))->ha, index->table_bits)
+ XDL_HASHLONG((REC(index->env, side, line))->minimal_perfect_hash, index->table_bits)
static int scanA(struct histindex *index, int line1, int count1)
{
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 9181815fd4..e400d85072 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -48,7 +48,7 @@
struct hashmap {
int nr, alloc;
struct entry {
- unsigned long hash;
+ usize minimal_perfect_hash;
/*
* 0 = unused entry, 1 = first line, 2 = second, etc.
* line2 is NON_UNIQUE if the line is not unique
@@ -101,10 +101,10 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
* So we multiply ha by 2 in the hope that the hashing was
* "unique enough".
*/
- int index = (int)((record->ha << 1) % map->alloc);
+ int index = (int)((record->minimal_perfect_hash << 1) % map->alloc);
while (map->entries[index].line1) {
- if (map->entries[index].hash != record->ha) {
+ if (map->entries[index].minimal_perfect_hash != record->minimal_perfect_hash) {
if (++index >= map->alloc)
index = 0;
continue;
@@ -120,7 +120,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
if (pass == 2)
return;
map->entries[index].line1 = line;
- map->entries[index].hash = record->ha;
+ map->entries[index].minimal_perfect_hash = record->minimal_perfect_hash;
map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
@@ -248,7 +248,7 @@ static int match(struct hashmap *map, int line1, int line2)
{
xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
- return record1->ha == record2->ha;
+ return record1->minimal_perfect_hash == record2->minimal_perfect_hash;
}
static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index d62a329d0c..9ec2a5d078 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -93,9 +93,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
long hi;
xdlclass_t *rcrec;
- hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
+ hi = (long) XDL_HASHLONG(rec->line_hash, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->rec.ha == rec->ha &&
+ if (rcrec->rec.line_hash == rec->line_hash &&
xdl_recmatch((const char *)rcrec->rec.ptr, (long)rcrec->rec.size,
(const char *)rec->ptr, (long)rec->size, cf->flags))
break;
@@ -111,7 +111,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
cf->rcrecs[rcrec->idx] = rcrec;
rcrec->rec.ptr = rec->ptr;
rcrec->rec.size = rec->size;
- rcrec->rec.ha = rec->ha;
+ rcrec->rec.line_hash = rec->line_hash;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
@@ -119,7 +119,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
(pass == 1) ? rcrec->len1++ : rcrec->len2++;
- rec->ha = (unsigned long) rcrec->idx;
+ rec->minimal_perfect_hash = (unsigned long) rcrec->idx;
return 0;
}
@@ -158,7 +158,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec = &xdf->recs[xdf->nrec++];
crec->ptr = (u8 const *)prev;
crec->size = cur - prev;
- crec->ha = hav;
+ crec->line_hash = hav;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -271,7 +271,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[recs->ha];
+ rcrec = cf->rcrecs[recs->minimal_perfect_hash];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
@@ -279,7 +279,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[recs->ha];
+ rcrec = cf->rcrecs[recs->minimal_perfect_hash];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
@@ -321,7 +321,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if (recs1->ha != recs2->ha)
+ if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -329,7 +329,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if (recs1->ha != recs2->ha)
+ if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 6f83a9f4ff..f2b53a6553 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -41,7 +41,8 @@ typedef struct s_chastore {
typedef struct s_xrecord {
u8 const *ptr;
usize size;
- unsigned long ha;
+ u64 line_hash;
+ usize minimal_perfect_hash;
} xrecord_t;
typedef struct s_xdfile {
--
gitgitgadget
* [PATCH 15/17] xdiff: make xdfile_t.nrec a usize instead of long
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (13 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 14/17] xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 16/17] xdiff: make xdfile_t.nreff " Ezekiel Newren via GitGitGadget
` (3 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 8 ++++----
xdiff/xemit.c | 14 +++++++-------
xdiff/xmerge.c | 4 ++--
xdiff/xprepare.c | 6 +++---
xdiff/xtypes.h | 2 +-
5 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index c8d351705c..ee72f5ea3b 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -483,7 +483,7 @@ static void measure_split(const xdfile_t *xdf, long split,
{
long i;
- if (split >= xdf->nrec) {
+ if (split >= (long)xdf->nrec) {
m->end_of_file = 1;
m->indent = -1;
} else {
@@ -506,7 +506,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
- for (i = split + 1; i < xdf->nrec; i++) {
+ for (i = split + 1; i < (long)xdf->nrec; i++) {
m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
@@ -717,7 +717,7 @@ static void group_init(xdfile_t *xdf, struct xdlgroup *g)
*/
static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
{
- if (g->end == xdf->nrec)
+ if (g->end == (long)xdf->nrec)
return -1;
g->start = g->end + 1;
@@ -750,7 +750,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
*/
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
- if (g->end < xdf->nrec &&
+ if (g->end < (long)xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = NO;
xdf->rchg[g->end++] = YES;
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index ad3e859c57..aa63aab749 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -137,7 +137,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
buf = func_line ? func_line->buf : dummy;
size = func_line ? sizeof(func_line->buf) : sizeof(dummy);
- for (l = start; l != limit && 0 <= l && l < xe->xdf1.nrec; l += step) {
+ for (l = start; l != limit && 0 <= l && l < (long)xe->xdf1.nrec; l += step) {
long len = match_func_rec(&xe->xdf1, xecfg, l, buf, size);
if (len >= 0) {
if (func_line)
@@ -179,14 +179,14 @@ pre_context_calculation:
long fs1, i1 = xch->i1;
/* Appended chunk? */
- if (i1 >= xe->xdf1.nrec) {
+ if (i1 >= (long)xe->xdf1.nrec) {
long i2 = xch->i2;
/*
* We don't need additional context if
* a whole function was added.
*/
- while (i2 < xe->xdf2.nrec) {
+ while (i2 < (long)xe->xdf2.nrec) {
if (is_func_rec(&xe->xdf2, xecfg, i2))
goto post_context_calculation;
i2++;
@@ -228,8 +228,8 @@ pre_context_calculation:
post_context_calculation:
lctx = xecfg->ctxlen;
- lctx = XDL_MIN(lctx, xe->xdf1.nrec - (xche->i1 + xche->chg1));
- lctx = XDL_MIN(lctx, xe->xdf2.nrec - (xche->i2 + xche->chg2));
+ lctx = XDL_MIN(lctx, (long)xe->xdf1.nrec - (xche->i1 + xche->chg1));
+ lctx = XDL_MIN(lctx, (long)xe->xdf2.nrec - (xche->i2 + xche->chg2));
e1 = xche->i1 + xche->chg1 + lctx;
e2 = xche->i2 + xche->chg2 + lctx;
@@ -243,7 +243,7 @@ pre_context_calculation:
if (fe1 < 0)
fe1 = xe->xdf1.nrec;
if (fe1 > e1) {
- e2 = XDL_MIN(e2 + (fe1 - e1), xe->xdf2.nrec);
+ e2 = XDL_MIN(e2 + (fe1 - e1), (long)xe->xdf2.nrec);
e1 = fe1;
}
@@ -254,7 +254,7 @@ pre_context_calculation:
*/
if (xche->next) {
long l = XDL_MIN(xche->next->i1,
- xe->xdf1.nrec - 1);
+ (long)xe->xdf1.nrec - 1);
if (l - xecfg->ctxlen <= e1 ||
get_func_line(xe, xecfg, NULL, l, e1) < 0) {
xche = xche->next;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index c1a003326a..1ebcbb4e3a 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -158,7 +158,7 @@ static int is_eol_crlf(xdfile_t *file, int i)
{
usize size;
- if (i < file->nrec - 1)
+ if (i < (long)file->nrec - 1)
/* All lines before the last *must* end in LF */
return (size = file->recs[i].size) > 1 &&
file->recs[i].ptr[size - 2] == '\r';
@@ -622,7 +622,7 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
changes = c;
i0 = xscr1->i1;
i1 = xscr1->i2;
- i2 = xscr1->i1 + xe2->xdf2.nrec - xe2->xdf1.nrec;
+ i2 = xscr1->i1 + (long)xe2->xdf2.nrec - (long)xe2->xdf1.nrec;
chg0 = xscr1->chg1;
chg1 = xscr1->chg2;
chg2 = xscr1->chg1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 9ec2a5d078..d990fe1c9e 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -153,7 +153,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, (long)xdf->nrec + 1, narec))
goto abort;
crec = &xdf->recs[xdf->nrec++];
crec->ptr = (u8 const *)prev;
@@ -332,8 +332,8 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
break;
- xdf1->dend = xdf1->nrec - i - 1;
- xdf2->dend = xdf2->nrec - i - 1;
+ xdf1->dend = (long)xdf1->nrec - i - 1;
+ xdf2->dend = (long)xdf2->nrec - i - 1;
return 0;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index f2b53a6553..41986c6603 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -47,7 +47,7 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
xrecord_t *recs;
- long nrec;
+ usize nrec;
long dstart, dend;
char *rchg;
long *rindex;
--
gitgitgadget
* [PATCH 16/17] xdiff: make xdfile_t.nreff a usize instead of long
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (14 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 15/17] xdiff: make xdfile_t.nrec a usize instead of long Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 17/17] xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 14 +++++++-------
xdiff/xtypes.h | 2 +-
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index d990fe1c9e..83355f036e 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -257,7 +257,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* might be potentially discarded if they happear in a run of discardable.
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
- long i, nm, nreff, mlim;
+ long i, nm, mlim;
xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
@@ -284,25 +284,25 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
- for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
+ xdf1->nreff = 0;
+ for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == YES ||
(dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff++] = i;
+ xdf1->rindex[xdf1->nreff++] = i;
} else
xdf1->rchg[i] = YES;
}
- xdf1->nreff = nreff;
- for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
+ xdf2->nreff = 0;
+ for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == YES ||
(dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff++] = i;
+ xdf2->rindex[xdf2->nreff++] = i;
} else
xdf2->rchg[i] = YES;
}
- xdf2->nreff = nreff;
xdl_free(dis);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 41986c6603..070674d7c4 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -51,7 +51,7 @@ typedef struct s_xdfile {
long dstart, dend;
char *rchg;
long *rindex;
- long nreff;
+ usize nreff;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH 17/17] xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (15 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 16/17] xdiff: make xdfile_t.nreff " Ezekiel Newren via GitGitGadget
@ 2025-09-07 19:45 ` Ezekiel Newren via GitGitGadget
2025-09-16 21:56 ` [PATCH 00/17] Use rust types in xdiff Junio C Hamano
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
18 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-07 19:45 UTC (permalink / raw)
To: git; +Cc: Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xtypes.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 070674d7c4..08301bf932 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,9 +48,9 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
xrecord_t *recs;
usize nrec;
- long dstart, dend;
- char *rchg;
- long *rindex;
+ i32 dstart, dend;
+ u8 *rchg;
+ usize *rindex;
usize nreff;
} xdfile_t;
--
gitgitgadget
* Re: [PATCH 10/17] compat/rust_types.h: define rust primitive types
2025-09-07 19:45 ` [PATCH 10/17] compat/rust_types.h: define rust primitive types Ezekiel Newren via GitGitGadget
@ 2025-09-08 15:08 ` Junio C Hamano
2025-09-08 16:15 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-08 15:08 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +/*
> + * A typedef for bool is not needed because C bool and Rust bool are
> + * the same if #include <stdbool.h> is used.
> + */
> +
> +typedef uint8_t u8;
> +typedef uint16_t u16;
> +typedef uint32_t u32;
> +typedef uint64_t u64;
> +
> +typedef int8_t i8;
> +typedef int16_t i16;
> +typedef int32_t i32;
> +typedef int64_t i64;
The standard guarantees that all of the above are exactly
N-bits wide, so I can buy the above types. But before I can buy the
above typedefs, don't we need to rename existing variables that
squat on these names?
$ git grep -n -E -e '\<[ui](8|16|32|64)\>'
gives some hits, like
reftable/record.c:678: uint8_t i64[8];
t/helper/test-parse-options.c:123: uint16_t u16 = 0;
t/helper/test-parse-options.c:148: OPT_UNSIGNED(0, "u16", &u16, "get a 16 bit unsigned integer"),
to avoid confusion? There are a handful of other hits.
> +typedef float f32;
> +typedef double f64;
It may be that they can be used interchangeably in practice on
popular platforms, but are these guaranteed to be equivalent by some
standard? C only cares about the minimum required range and
precision, so you may have allocated enough bytes thinking you can
fit a f32 but your float may not fit there.
Or does Rust care only about platforms with IEEE 754 and would
refuse to port to other exotic architectures so the above worries
would not apply?
> +typedef size_t usize;
> +typedef ptrdiff_t isize;
Interesting. I would have expected "isize", the signed variant of
"usize", to be aliased to ssize_t (simply because "usize" is
declared to correspond to "size_t" on the previous line), not to
ptrdiff_t.
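For illustration only (this is not part of the posted patch): a minimal sketch of the property at stake, assuming a C11 compiler for _Static_assert. ptrdiff_t comes from ISO C's <stddef.h>, while ssize_t is a POSIX type, so the sketch only checks that the patch's chosen pair has matching width and the expected signedness. The helper name isize_pairs_with_usize is hypothetical.

```c
#include <stddef.h>

/* Same typedefs as in the quoted compat/rust_types.h hunk. */
typedef size_t usize;
typedef ptrdiff_t isize;

/* Fail the build if the pair does not behave like Rust's usize/isize. */
_Static_assert(sizeof(isize) == sizeof(usize),
	       "isize and usize must have the same width");
_Static_assert((isize)-1 < 0, "isize must be signed");
_Static_assert((usize)-1 > 0, "usize must be unsigned");

/* Hypothetical runtime mirror of the checks above, handy in a unit test. */
static int isize_pairs_with_usize(void)
{
	return sizeof(isize) == sizeof(usize) &&
	       (isize)-1 < 0 && (usize)-1 > 0;
}
```

On a platform where ptrdiff_t and size_t differ in width, the first assertion would stop the build, which is where a different alias would need to be chosen.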
> +typedef uint32_t rust_char;
> +
> +#endif /* COMPAT_RUST_TYPES_H */
* Re: [PATCH 10/17] compat/rust_types.h: define rust primitive types
2025-09-08 15:08 ` Junio C Hamano
@ 2025-09-08 16:15 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-08 16:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ezekiel Newren via GitGitGadget, git
On Mon, Sep 8, 2025 at 9:08 AM Junio C Hamano <gitster@pobox.com> wrote:
> The standard guarantees that all of the above are exactly
> N-bits wide, so I can buy the above types. But before I can buy the
> above typedefs, don't we need to rename existing variables that
> squat on these names?
>
> $ git grep -n -E -e '\<[ui](8|16|32|64)\>'
>
> gives some hits, like
>
> reftable/record.c:678: uint8_t i64[8];
> t/helper/test-parse-options.c:123: uint16_t u16 = 0;
> t/helper/test-parse-options.c:148: OPT_UNSIGNED(0, "u16", &u16, "get a 16 bit unsigned integer"),
>
> to avoid confusion? There are a handful of other hits.
Those places should be cleaned up, but it's not an immediate problem
because compat/rust_types.h is not auto included anywhere. These
typedefs live in git-compat-util.h in my "Introduce Rust" patch series
and I never had a problem with compilation or testing. One reason I
included compat/posix.h instead of git-compat-util.h is because it
includes stdbool.h where git-compat-util.h doesn't. I'll be happy to
do that cleanup.
> > +typedef float f32;
> > +typedef double f64;
>
> It may be that they can be used interchangeably in practice on
> popular platforms, but are these guaranteed to be equivalent by some
> standard? C only cares about the minimum required range and
> precision, so you may have allocated enough bytes thinking you can
> fit an f32 but your float may not fit there.
>
> Or does Rust care only about platforms with IEEE 754 and would
> refuse to port to other exotic architectures so the above worries
> would not apply?
If the typedefs in compat/rust_types.h are incorrect for a
platform/target then compat/posix.h or compat/rust_types.h should be
updated rather than relying on Rust's core::ffi's guess as to what it is.
On the Rust side: core::ffi assumes that a C 'float' is the same as
f32 and that a C 'double' is an f64[1,2,3]. So I am making the same
assumption that the Rust maintainers are _and_ I'm keeping ambiguity
on the C side rather than eroding Rust's type precision. If a platform
does not follow these assumptions then compat/rust_types.h should
warn those using that platform or fail to build entirely.
[1] https://doc.rust-lang.org/1.89.0/core/ffi/type.c_float.html
[2] https://doc.rust-lang.org/1.63.0/src/core/ffi/mod.rs.html#80-81
[3] https://doc.rust-lang.org/1.89.0/src/core/ffi/primitives.rs.html#34-35
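A sketch of what "fail to build entirely" could look like, assuming a
C11 compiler (for _Static_assert) and using only standard <float.h>
and <limits.h> macros. This is an illustration, not part of the
posted patch:

```c
#include <float.h>    /* FLT_RADIX, FLT_MANT_DIG, DBL_MANT_DIG, ... */
#include <limits.h>   /* CHAR_BIT */

typedef float  f32;
typedef double f64;

/* Storage width must match the Rust types. */
_Static_assert(sizeof(f32) * CHAR_BIT == 32, "f32 must be 32 bits wide");
_Static_assert(sizeof(f64) * CHAR_BIT == 64, "f64 must be 64 bits wide");

/* Precision and exponent range of IEEE 754 binary32 and binary64. */
_Static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128,
	       "float does not look like IEEE 754 binary32");
_Static_assert(DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024,
	       "double does not look like IEEE 754 binary64");
```

On a platform whose float or double differs, the build would stop
with one of the messages above.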
* Re: [PATCH 01/17] xdiff: delete static forward declarations in xprepare
2025-09-07 19:45 ` [PATCH 01/17] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:55 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:55 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:45 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Move xdl_prepare_env() later in the file to avoid the need
> for static forward declarations.
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
> 1 file changed, 50 insertions(+), 66 deletions(-)
>
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index e1d4017b2d..a45c5ee208 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
>
>
>
> -static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
> -static void xdl_free_classifier(xdlclassifier_t *cf);
> -static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
> - unsigned int hbits, xrecord_t *rec);
> -static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
> - xdlclassifier_t *cf, xdfile_t *xdf);
> -static void xdl_free_ctx(xdfile_t *xdf);
> -static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
> -static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
> -static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
> -static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
> -
> -
> -
> -
> static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
> cf->flags = flags;
>
> @@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
> }
>
>
> -int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> - xdfenv_t *xe) {
> - long enl1, enl2, sample;
> - xdlclassifier_t cf;
> -
> - memset(&cf, 0, sizeof(cf));
> -
> - /*
> - * For histogram diff, we can afford a smaller sample size and
> - * thus a poorer estimate of the number of lines, as the hash
> - * table (rhash) won't be filled up/grown. The number of lines
> - * (nrecs) will be updated correctly anyway by
> - * xdl_prepare_ctx().
> - */
> - sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
> - ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
> -
> - enl1 = xdl_guess_lines(mf1, sample) + 1;
> - enl2 = xdl_guess_lines(mf2, sample) + 1;
> -
> - if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
> - return -1;
> -
> - if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
> -
> - xdl_free_classifier(&cf);
> - return -1;
> - }
> - if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
> -
> - xdl_free_ctx(&xe->xdf1);
> - xdl_free_classifier(&cf);
> - return -1;
> - }
> -
> - if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
> - (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
> - xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
> -
> - xdl_free_ctx(&xe->xdf2);
> - xdl_free_ctx(&xe->xdf1);
> - xdl_free_classifier(&cf);
> - return -1;
> - }
> -
> - xdl_free_classifier(&cf);
> -
> - return 0;
> -}
> -
> -
> void xdl_free_env(xdfenv_t *xe) {
>
> xdl_free_ctx(&xe->xdf2);
> @@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
>
> return 0;
> }
> +
> +int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> + xdfenv_t *xe) {
> + long enl1, enl2, sample;
> + xdlclassifier_t cf;
> +
> + memset(&cf, 0, sizeof(cf));
> +
> + /*
> + * For histogram diff, we can afford a smaller sample size and
> + * thus a poorer estimate of the number of lines, as the hash
> + * table (rhash) won't be filled up/grown. The number of lines
> + * (nrecs) will be updated correctly anyway by
> + * xdl_prepare_ctx().
> + */
> + sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
> + ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
> +
> + enl1 = xdl_guess_lines(mf1, sample) + 1;
> + enl2 = xdl_guess_lines(mf2, sample) + 1;
> +
> + if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
> + return -1;
> +
> + if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
> +
> + xdl_free_classifier(&cf);
> + return -1;
> + }
> + if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
> +
> + xdl_free_ctx(&xe->xdf1);
> + xdl_free_classifier(&cf);
> + return -1;
> + }
> +
> + if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
> + (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
> + xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
> +
> + xdl_free_ctx(&xe->xdf2);
> + xdl_free_ctx(&xe->xdf1);
> + xdl_free_classifier(&cf);
> + return -1;
> + }
> +
> + xdl_free_classifier(&cf);
> +
> + return 0;
> +}
> --
> gitgitgadget
Viewing this with --color-moved makes it clear that the changes
exactly match what you summarize in the commit message.
* Re: [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-07 19:45 ` [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:56 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:56 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:48 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> xdl_prepare_ctx() uses local variables and assigns them to the
> corresponding xdfile_t fields if there are no errors. Delete them and
> use the fields of xdfile_t directly.
In particular, those local variables are essentially a hand-rolled
additional implementation of xdl_free_ctx() inlined into
xdl_prepare_ctx(). You're just modifying the code to use the existing
xdl_free_ctx() function so we don't have two ways to free such
variables (especially since one of those two was an ugly inlining).
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xprepare.c | 79 +++++++++++++++++++-----------------------------
> 1 file changed, 31 insertions(+), 48 deletions(-)
>
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index a45c5ee208..2ed1785b09 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -134,99 +134,82 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
> }
>
>
> +static void xdl_free_ctx(xdfile_t *xdf)
> +{
> +
unnecessary blank line here
> + xdl_free(xdf->rhash);
> + xdl_free(xdf->rindex);
> + xdl_free(xdf->rchg - 1);
> + xdl_free(xdf->ha);
> + xdl_free(xdf->recs);
> + xdl_cha_free(&xdf->rcha);
> +}
> +
> +
> static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
> xdlclassifier_t *cf, xdfile_t *xdf) {
> - unsigned int hbits;
> - long nrec, hsize, bsize;
> + long bsize;
> unsigned long hav;
> char const *blk, *cur, *top, *prev;
> xrecord_t *crec;
> - xrecord_t **recs;
> - xrecord_t **rhash;
> - unsigned long *ha;
> - char *rchg;
> - long *rindex;
>
> - ha = NULL;
> - rindex = NULL;
> - rchg = NULL;
> - rhash = NULL;
> - recs = NULL;
> + xdf->ha = NULL;
> + xdf->rindex = NULL;
> + xdf->rchg = NULL;
> + xdf->rhash = NULL;
> + xdf->recs = NULL;
>
> if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
> goto abort;
> - if (!XDL_ALLOC_ARRAY(recs, narec))
> + if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
> goto abort;
>
> - hbits = xdl_hashbits((unsigned int) narec);
> - hsize = 1 << hbits;
> - if (!XDL_CALLOC_ARRAY(rhash, hsize))
> + xdf->hbits = xdl_hashbits((unsigned int) narec);
> + if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
> goto abort;
>
> - nrec = 0;
> + xdf->nrec = 0;
> if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
> for (top = blk + bsize; cur < top; ) {
> prev = cur;
> hav = xdl_hash_record(&cur, top, xpp->flags);
> - if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
> + if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
> goto abort;
> if (!(crec = xdl_cha_alloc(&xdf->rcha)))
> goto abort;
> crec->ptr = prev;
> crec->size = (long) (cur - prev);
> crec->ha = hav;
> - recs[nrec++] = crec;
> - if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
> + xdf->recs[xdf->nrec++] = crec;
> + if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
> goto abort;
> }
> }
>
> - if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
> + if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
> goto abort;
>
> if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
> (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
> - if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
> + if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
> goto abort;
> - if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
> + if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
> goto abort;
> }
>
> - xdf->nrec = nrec;
> - xdf->recs = recs;
> - xdf->hbits = hbits;
> - xdf->rhash = rhash;
> - xdf->rchg = rchg + 1;
> - xdf->rindex = rindex;
> + xdf->rchg += 1;
> xdf->nreff = 0;
> - xdf->ha = ha;
> xdf->dstart = 0;
> - xdf->dend = nrec - 1;
> + xdf->dend = xdf->nrec - 1;
>
> return 0;
>
> abort:
> - xdl_free(ha);
> - xdl_free(rindex);
> - xdl_free(rchg);
> - xdl_free(rhash);
> - xdl_free(recs);
> - xdl_cha_free(&xdf->rcha);
> + xdl_free_ctx(xdf);
> return -1;
> }
>
>
> -static void xdl_free_ctx(xdfile_t *xdf) {
> -
> - xdl_free(xdf->rhash);
> - xdl_free(xdf->rindex);
> - xdl_free(xdf->rchg - 1);
> - xdl_free(xdf->ha);
> - xdl_free(xdf->recs);
> - xdl_cha_free(&xdf->rcha);
> -}
> -
> -
> void xdl_free_env(xdfenv_t *xe) {
>
> xdl_free_ctx(&xe->xdf2);
> --
> gitgitgadget
Looks good.
* Re: [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-07 19:45 ` [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:56 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:56 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:45 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
> but never used for anything by the code. Remove them.
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xprepare.c | 15 ++-------------
> xdiff/xtypes.h | 3 ---
> 2 files changed, 2 insertions(+), 16 deletions(-)
>
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 2ed1785b09..91b0ed54e0 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
> }
>
>
> -static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
> - unsigned int hbits, xrecord_t *rec) {
> +static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
> long hi;
> char const *line;
> xdlclass_t *rcrec;
> @@ -126,10 +125,6 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>
> rec->ha = (unsigned long) rcrec->idx;
>
> - hi = (long) XDL_HASHLONG(rec->ha, hbits);
> - rec->next = rhash[hi];
> - rhash[hi] = rec;
> -
> return 0;
> }
>
> @@ -137,7 +132,6 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
> static void xdl_free_ctx(xdfile_t *xdf)
> {
>
> - xdl_free(xdf->rhash);
> xdl_free(xdf->rindex);
> xdl_free(xdf->rchg - 1);
> xdl_free(xdf->ha);
> @@ -156,7 +150,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> xdf->ha = NULL;
> xdf->rindex = NULL;
> xdf->rchg = NULL;
> - xdf->rhash = NULL;
> xdf->recs = NULL;
>
> if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
> @@ -164,10 +157,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
> goto abort;
>
> - xdf->hbits = xdl_hashbits((unsigned int) narec);
> - if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
> - goto abort;
> -
> xdf->nrec = 0;
> if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
> for (top = blk + bsize; cur < top; ) {
> @@ -181,7 +170,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> crec->size = (long) (cur - prev);
> crec->ha = hav;
> xdf->recs[xdf->nrec++] = crec;
> - if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
> + if (xdl_classify_record(pass, cf, crec) < 0)
> goto abort;
> }
> }
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 8442bd436e..8b8467360e 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -39,7 +39,6 @@ typedef struct s_chastore {
> } chastore_t;
>
> typedef struct s_xrecord {
> - struct s_xrecord *next;
> char const *ptr;
> long size;
> unsigned long ha;
> @@ -48,8 +47,6 @@ typedef struct s_xrecord {
> typedef struct s_xdfile {
> chastore_t rcha;
> long nrec;
> - unsigned int hbits;
> - xrecord_t **rhash;
> long dstart, dend;
> xrecord_t **recs;
> char *rchg;
> --
> gitgitgadget
Always nice to see unused fields get removed.
* Re: [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit
2025-09-07 19:45 ` [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:56 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:56 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:45 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> This function aliases the fields of xrecord_t, which makes it harder
> to track the usages of those fields. Delete it.
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xemit.c | 38 +++++++++++++-------------------------
> 1 file changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 1d40c9cb40..2161ac3cd0 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -22,23 +22,13 @@
>
> #include "xinclude.h"
>
> -static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
>
Can we remove this line too, to simplify the diff (i.e. leave only
one blank line between the include of xinclude.h and
xdl_emit_record)?
> - *rec = xdf->recs[ri]->ptr;
> -
> - return xdf->recs[ri]->size;
> -}
> -
> -
> -static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
> - long size, psize = strlen(pre);
> - char const *rec;
> -
> - size = xdl_get_rec(xdf, ri, &rec);
> - if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
> +static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
> +{
The change of the opening curly brace to be on a new line does match
our general coding guidelines, but cleanups like this should be in a
separate patch.
> + xrecord_t *rec = xdf->recs[ri];
>
> + if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
> return -1;
> - }
While in this case you were modifying the line in question and thus
fixing the code to also not use curly braces around a single
statement, which is more justified, it still makes the patch slightly
harder for reviewers to read since you are doing multiple things (what
you said in the commit message, plus cleaning up style "violations").
It'd be better to leave the existing style violations in place, or fix
them in a separate patch.
>
> return 0;
> }
> @@ -120,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
> static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
> char *buf, long sz)
> {
> - const char *rec;
> - long len = xdl_get_rec(xdf, ri, &rec);
> + xrecord_t *rec = xdf->recs[ri];
> +
> if (!xecfg->find_func)
> - return def_ff(rec, len, buf, sz);
> - return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
> + return def_ff(rec->ptr, rec->size, buf, sz);
> + return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
> }
>
> static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
> @@ -160,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
>
> static int is_empty_rec(xdfile_t *xdf, long ri)
> {
> - const char *rec;
> - long len = xdl_get_rec(xdf, ri, &rec);
> + xrecord_t *rec = xdf->recs[ri];
> + long i = 0;
>
> - while (len > 0 && XDL_ISSPACE(*rec)) {
> - rec++;
> - len--;
> - }
> - return !len;
> + for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
> +
> + return i == rec->size;
> }
I agree that the code is easier to follow without the aliasing.
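As a self-contained illustration of the two styles (struct rec is a
simplified stand-in for xrecord_t, and isspace() stands in for
XDL_ISSPACE; both are assumptions made for this sketch):

```c
#include <ctype.h>

/* Simplified stand-in for xrecord_t; not the real definition. */
struct rec {
	const char *ptr;
	long size;
};

/* Old style: walk local copies of the record's fields. */
static int is_empty_aliased(const struct rec *r)
{
	const char *p = r->ptr;
	long len = r->size;

	while (len > 0 && isspace((unsigned char)*p)) {
		p++;
		len--;
	}
	return !len;
}

/* New style: index through the record itself. */
static int is_empty_direct(const struct rec *r)
{
	long i = 0;

	while (i < r->size && isspace((unsigned char)r->ptr[i]))
		i++;
	return i == r->size;
}
```

Both return the same answer; the second keeps every use of ptr and
size visible as a field access, which is what makes a later type
change easier to audit.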
* Re: [PATCH 05/17] xdiff: delete struct diffdata_t
2025-09-07 19:45 ` [PATCH 05/17] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:56 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:56 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:45 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Every field in this struct is an alias for a certain field in xdfile_t.
>
> diffdata_t.nrec -> xdfile_t.nreff
> diffdata_t.ha -> xdfile_t.ha
> diffdata_t.rindex -> xdfile_t.rindex
> diffdata_t.rchg -> xdfile_t.rchg
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 32 ++++++++------------------------
> xdiff/xdiffi.h | 11 ++---------
> 2 files changed, 10 insertions(+), 33 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 5a96e36dfb..bbf0161f84 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> * sub-boxes by calling the box splitting function. Note that the real job
> * (marking changed lines) is done in the two boundary reaching checks.
> */
> -int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> - diffdata_t *dd2, long off2, long lim2,
> +int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> + xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
> - unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
> + unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
>
> /*
> * Shrink the box by walking through each diagonal snake (SW and NE).
> @@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> * be obviously changed.
> */
> if (off1 == lim1) {
> - char *rchg2 = dd2->rchg;
> - long *rindex2 = dd2->rindex;
> -
> for (; off2 < lim2; off2++)
> - rchg2[rindex2[off2]] = 1;
> + xdf2->rchg[xdf2->rindex[off2]] = 1;
> } else if (off2 == lim2) {
> - char *rchg1 = dd1->rchg;
> - long *rindex1 = dd1->rindex;
> -
> for (; off1 < lim1; off1++)
> - rchg1[rindex1[off1]] = 1;
> + xdf1->rchg[xdf1->rindex[off1]] = 1;
> } else {
> xdpsplit_t spl;
> spl.i1 = spl.i2 = 0;
> @@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> /*
> * ... et Impera.
> */
> - if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
> + if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
> kvdf, kvdb, spl.min_lo, xenv) < 0 ||
> - xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
> + xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
> kvdf, kvdb, spl.min_hi, xenv) < 0) {
>
> return -1;
> @@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> long ndiags;
> long *kvd, *kvdf, *kvdb;
> xdalgoenv_t xenv;
> - diffdata_t dd1, dd2;
> int res;
>
> if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
> @@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> xenv.snake_cnt = XDL_SNAKE_CNT;
> xenv.heur_min = XDL_HEUR_MIN_COST;
>
> - dd1.nrec = xe->xdf1.nreff;
> - dd1.ha = xe->xdf1.ha;
> - dd1.rchg = xe->xdf1.rchg;
> - dd1.rindex = xe->xdf1.rindex;
> - dd2.nrec = xe->xdf2.nreff;
> - dd2.ha = xe->xdf2.ha;
> - dd2.rchg = xe->xdf2.rchg;
> - dd2.rindex = xe->xdf2.rindex;
> -
> - res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
> + res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
> kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
> &xenv);
> xdl_free(kvd);
> diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
> index 126c9d8ff4..49e52c67f9 100644
> --- a/xdiff/xdiffi.h
> +++ b/xdiff/xdiffi.h
> @@ -24,13 +24,6 @@
> #define XDIFFI_H
>
>
> -typedef struct s_diffdata {
> - long nrec;
> - unsigned long const *ha;
> - long *rindex;
> - char *rchg;
> -} diffdata_t;
> -
> typedef struct s_xdalgoenv {
> long mxcost;
> long snake_cnt;
> @@ -46,8 +39,8 @@ typedef struct s_xdchange {
>
>
>
> -int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> - diffdata_t *dd2, long off2, long lim2,
> +int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> + xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
> int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> xdfenv_t *xe);
> --
> gitgitgadget
Viewing this commit with --color-moved helps highlight in the code
what you say in the commit message. Makes sense.
* Re: [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha
2025-09-07 19:45 ` [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:57 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:57 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> When 0 <= i < xdfile_t.nreff the following is true:
> xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
I like getting rid of redundant stuff. One thing to note here is that
you're replacing a single indirection with two...
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 24 ++++++++++++++----------
> xdiff/xprepare.c | 12 ++----------
> xdiff/xtypes.h | 1 -
> 3 files changed, 16 insertions(+), 21 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index bbf0161f84..11cd090b53 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -22,6 +22,11 @@
>
> #include "xinclude.h"
>
> +static unsigned long get_hash(xdfile_t *xdf, long index)
> +{
> + return xdf->recs[xdf->rindex[index]]->ha;
> +}
> +
> #define XDL_MAX_COST_MIN 256
> #define XDL_HEUR_MIN_COST 256
> #define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
> @@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
> * using this algorithm, so a little bit of heuristic is needed to cut the
> * search and to return a suboptimal point.
> */
> -static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> - unsigned long const *ha2, long off2, long lim2,
> +static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
> + xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
> xdalgoenv_t *xenv) {
> long dmin = off1 - lim2, dmax = lim1 - off2;
> @@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> i1 = kvdf[d + 1];
> prev1 = i1;
> i2 = i1 - d;
> - for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
> + for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
You're not going to be happy with me asking, so sorry in advance, but
I'm really curious...we are now replacing a single indirection with a
double-indirection inside a for loop, which is nested within two other
for loops. Three levels of for-loops to me suggests it might be a hot
codepath. Does this double indirection in these codepaths affect
performance?
> if (i1 - prev1 > xenv->snake_cnt)
> got_snake = 1;
> kvdf[d] = i1;
> @@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> i1 = kvdb[d + 1] - 1;
> prev1 = i1;
> i2 = i1 - d;
> - for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
> + for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
> if (prev1 - i1 > xenv->snake_cnt)
> got_snake = 1;
> kvdb[d] = i1;
> @@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> if (v > XDL_K_HEUR * ec && v > best &&
> off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
> off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
> - for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
> + for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
> if (k == xenv->snake_cnt) {
> best = v;
> spl->i1 = i1;
> @@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> if (v > XDL_K_HEUR * ec && v > best &&
> off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
> off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
> - for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
> + for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
> if (k == xenv->snake_cnt - 1) {
> best = v;
> spl->i1 = i1;
> @@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
> - unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
>
> /*
> * Shrink the box by walking through each diagonal snake (SW and NE).
> */
> - for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
> - for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
> + for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
> + for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
>
> /*
> * If one dimension is empty, then all records on the other one must
> @@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> /*
> * Divide ...
> */
> - if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
> + if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
> need_min, &spl, xenv) < 0) {
>
> return -1;
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 91b0ed54e0..59730989a3 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -134,7 +134,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
>
> xdl_free(xdf->rindex);
> xdl_free(xdf->rchg - 1);
> - xdl_free(xdf->ha);
> xdl_free(xdf->recs);
> xdl_cha_free(&xdf->rcha);
> }
> @@ -147,7 +146,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> char const *blk, *cur, *top, *prev;
> xrecord_t *crec;
>
> - xdf->ha = NULL;
> xdf->rindex = NULL;
> xdf->rchg = NULL;
> xdf->recs = NULL;
> @@ -182,8 +180,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
> if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
> goto abort;
> - if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
> - goto abort;
> }
>
> xdf->rchg += 1;
> @@ -301,9 +297,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> i <= xdf1->dend; i++, recs++) {
> if (dis1[i] == 1 ||
> (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> - xdf1->rindex[nreff] = i;
> - xdf1->ha[nreff] = (*recs)->ha;
> - nreff++;
> + xdf1->rindex[nreff++] = i;
> } else
> xdf1->rchg[i] = 1;
> }
> @@ -313,9 +307,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> i <= xdf2->dend; i++, recs++) {
> if (dis2[i] == 1 ||
> (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> - xdf2->rindex[nreff] = i;
> - xdf2->ha[nreff] = (*recs)->ha;
> - nreff++;
> + xdf2->rindex[nreff++] = i;
> } else
> xdf2->rchg[i] = 1;
> }
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 8b8467360e..85848f1685 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -52,7 +52,6 @@ typedef struct s_xdfile {
> char *rchg;
> long *rindex;
> long nreff;
> - unsigned long *ha;
> } xdfile_t;
>
> typedef struct s_xdfenv {
> --
> gitgitgadget
Other than the performance question, it looks like you've made a
straightforward mechanical change as highlighted in the commit
message.
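To make the invariant and the extra loads concrete, here is a small
sketch with simplified stand-ins (struct rec, struct file, hash_at()
and cache_matches() are hypothetical names, not xdiff code). It makes
no claim about the measured cost:

```c
/* Simplified stand-ins; the real xrecord_t/xdfile_t have more fields. */
struct rec {
	unsigned long ha;
};

struct file {
	struct rec **recs;   /* all records */
	long *rindex;        /* indices of the records kept for diffing */
	long nreff;          /* number of valid entries in rindex */
};

/* The lookup after the patch: rindex[i], then recs[...], then ->ha. */
static unsigned long hash_at(const struct file *f, long i)
{
	return f->recs[f->rindex[i]]->ha;
}

/* Returns 1 if a cached ha[] array agrees with the chained lookup. */
static int cache_matches(const struct file *f, const unsigned long *ha)
{
	long i;

	for (i = 0; i < f->nreff; i++)
		if (ha[i] != hash_at(f, i))
			return 0;
	return 1;
}
```

The old ha[i] read is one load from a dense array, while hash_at()
is a chain of dependent loads, so timing a large diff before and
after the patch would be the way to settle the question.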
* Re: [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-07 19:45 ` [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:57 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:57 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xprepare.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 59730989a3..6f1d4b4725 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -32,9 +32,7 @@
>
> typedef struct s_xdlclass {
> struct s_xdlclass *next;
> - unsigned long ha;
> - char const *line;
> - long size;
> + xrecord_t rec;
> long idx;
> long len1, len2;
> } xdlclass_t;
> @@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
>
> static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
> long hi;
> - char const *line;
> xdlclass_t *rcrec;
>
> - line = rec->ptr;
> hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
> for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
> - if (rcrec->ha == rec->ha &&
> - xdl_recmatch(rcrec->line, rcrec->size,
> + if (rcrec->rec.ha == rec->ha &&
> + xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
> rec->ptr, rec->size, cf->flags))
> break;
>
> @@ -113,9 +109,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
> if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
> return -1;
> cf->rcrecs[rcrec->idx] = rcrec;
> - rcrec->line = line;
> - rcrec->size = rec->size;
> - rcrec->ha = rec->ha;
> + rcrec->rec.ptr = rec->ptr;
> + rcrec->rec.size = rec->size;
> + rcrec->rec.ha = rec->ha;
> rcrec->len1 = rcrec->len2 = 0;
> rcrec->next = cf->rchash[hi];
> cf->rchash[hi] = rcrec;
> --
> gitgitgadget
I can see the changes match the one-line summary. And I think the
point is simplification or reducing redundancy or something...but
could a single-sentence motivation (stating which of these purposes is
at play) be added to the commit message?
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words
2025-09-07 19:45 ` [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:58 ` Elijah Newren
2025-09-09 13:50 ` Phillip Wood
` (2 more replies)
0 siblings, 3 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:58 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
My personal bias is that things like "view with --color-words" make
more sense to include near the end of the commit message, just before
the sign-offs. Not sure if others agree on that.
> The chastore_t type is very unfriendly to Rust FFI. It's also redundant
> since 'recs' is a vector type that grows every time an xrecord_t is
> added.
The second sentence seems to presume the reader knows what the
chastore_t type is for, and about the confusing dual layering between
it and recs. I liked your more extended
explanation in https://lore.kernel.org/git/7ea2dccd71fc502f20614ce217fc9885d1b17413.1756496539.git.gitgitgadget@gmail.com/;
could some of that be used here?
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 24 ++++++++++----------
> xdiff/xemit.c | 6 ++---
> xdiff/xhistogram.c | 2 +-
> xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
> xdiff/xpatience.c | 10 ++++-----
> xdiff/xprepare.c | 19 ++++++----------
> xdiff/xtypes.h | 3 +--
> xdiff/xutils.c | 12 +++++-----
> 8 files changed, 63 insertions(+), 69 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 11cd090b53..a66125d44a 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -24,7 +24,7 @@
>
> static unsigned long get_hash(xdfile_t *xdf, long index)
> {
> - return xdf->recs[xdf->rindex[index]]->ha;
> + return xdf->recs[xdf->rindex[index]].ha;
> }
>
> #define XDL_MAX_COST_MIN 256
> @@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
> m->indent = -1;
> } else {
> m->end_of_file = 0;
> - m->indent = get_indent(xdf->recs[split]);
> + m->indent = get_indent(&xdf->recs[split]);
> }
>
> m->pre_blank = 0;
> m->pre_indent = -1;
> for (i = split - 1; i >= 0; i--) {
> - m->pre_indent = get_indent(xdf->recs[i]);
> + m->pre_indent = get_indent(&xdf->recs[i]);
> if (m->pre_indent != -1)
> break;
> m->pre_blank += 1;
> @@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
> m->post_blank = 0;
> m->post_indent = -1;
> for (i = split + 1; i < xdf->nrec; i++) {
> - m->post_indent = get_indent(xdf->recs[i]);
> + m->post_indent = get_indent(&xdf->recs[i]);
> if (m->post_indent != -1)
> break;
> m->post_blank += 1;
> @@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
> static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->end < xdf->nrec &&
> - recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
> + recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
> xdf->rchg[g->start++] = 0;
> xdf->rchg[g->end++] = 1;
>
> @@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
> static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->start > 0 &&
> - recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
> + recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
> xdf->rchg[--g->start] = 1;
> xdf->rchg[--g->end] = 0;
>
> @@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>
> for (xch = xscr; xch; xch = xch->next) {
> int ignore = 1;
> - xrecord_t **rec;
> + xrecord_t *rec;
> long i;
>
> rec = &xe->xdf1.recs[xch->i1];
> for (i = 0; i < xch->chg1 && ignore; i++)
> - ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
> + ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
>
> rec = &xe->xdf2.recs[xch->i2];
> for (i = 0; i < xch->chg2 && ignore; i++)
> - ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
> + ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
>
> xch->ignore = ignore;
> }
> @@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
> xdchange_t *xch;
>
> for (xch = xscr; xch; xch = xch->next) {
> - xrecord_t **rec;
> + xrecord_t *rec;
> int ignore = 1;
> long i;
>
> @@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
>
> rec = &xe->xdf1.recs[xch->i1];
> for (i = 0; i < xch->chg1 && ignore; i++)
> - ignore = record_matches_regex(rec[i], xpp);
> + ignore = record_matches_regex(&rec[i], xpp);
>
> rec = &xe->xdf2.recs[xch->i2];
> for (i = 0; i < xch->chg2 && ignore; i++)
> - ignore = record_matches_regex(rec[i], xpp);
> + ignore = record_matches_regex(&rec[i], xpp);
>
> xch->ignore = ignore;
> }
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 2161ac3cd0..b2f1f30cd3 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -25,7 +25,7 @@
>
> static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
> {
> - xrecord_t *rec = xdf->recs[ri];
> + xrecord_t *rec = &xdf->recs[ri];
>
> if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
> return -1;
> @@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
> static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
> char *buf, long sz)
> {
> - xrecord_t *rec = xdf->recs[ri];
> + xrecord_t *rec = &xdf->recs[ri];
>
> if (!xecfg->find_func)
> return def_ff(rec->ptr, rec->size, buf, sz);
> @@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
>
> static int is_empty_rec(xdfile_t *xdf, long ri)
> {
> - xrecord_t *rec = xdf->recs[ri];
> + xrecord_t *rec = &xdf->recs[ri];
> long i = 0;
>
> for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
> diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
> index 040d81e0bc..4d857e8ae2 100644
> --- a/xdiff/xhistogram.c
> +++ b/xdiff/xhistogram.c
> @@ -86,7 +86,7 @@ struct region {
> ((LINE_MAP(index, ptr))->cnt)
>
> #define REC(env, s, l) \
> - (env->xdf##s.recs[l - 1])
> + (&env->xdf##s.recs[l - 1])
>
> static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
> {
> diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
> index af40c88a5b..fd600cbb5d 100644
> --- a/xdiff/xmerge.c
> +++ b/xdiff/xmerge.c
> @@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
> int line_count, long flags)
> {
> int i;
> - xrecord_t **rec1 = xe1->xdf2.recs + i1;
> - xrecord_t **rec2 = xe2->xdf2.recs + i2;
> + xrecord_t *rec1 = xe1->xdf2.recs + i1;
> + xrecord_t *rec2 = xe2->xdf2.recs + i2;
>
> for (i = 0; i < line_count; i++) {
> - int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
> - rec2[i]->ptr, rec2[i]->size, flags);
> + int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
> + rec2[i].ptr, rec2[i].size, flags);
> if (!result)
> return -1;
> }
> @@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>
> static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
> {
> - xrecord_t **recs;
> + xrecord_t *recs;
> int size = 0;
>
> recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
> @@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
> if (count < 1)
> return 0;
>
> - for (i = 0; i < count; size += recs[i++]->size)
> + for (i = 0; i < count; size += recs[i++].size)
> if (dest)
> - memcpy(dest + size, recs[i]->ptr, recs[i]->size);
> + memcpy(dest + size, recs[i].ptr, recs[i].size);
> if (add_nl) {
> - i = recs[count - 1]->size;
> - if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
> + i = recs[count - 1].size;
> + if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
> if (needs_cr) {
> if (dest)
> dest[size] = '\r';
> @@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
>
> if (i < file->nrec - 1)
> /* All lines before the last *must* end in LF */
> - return (size = file->recs[i]->size) > 1 &&
> - file->recs[i]->ptr[size - 2] == '\r';
> + return (size = file->recs[i].size) > 1 &&
> + file->recs[i].ptr[size - 2] == '\r';
> if (!file->nrec)
> /* Cannot determine eol style from empty file */
> return -1;
> - if ((size = file->recs[i]->size) &&
> - file->recs[i]->ptr[size - 1] == '\n')
> + if ((size = file->recs[i].size) &&
> + file->recs[i].ptr[size - 1] == '\n')
> /* Last line; ends in LF; Is it CR/LF? */
> return size > 1 &&
> - file->recs[i]->ptr[size - 2] == '\r';
> + file->recs[i].ptr[size - 2] == '\r';
> if (!i)
> /* The only line has no eol */
> return -1;
> /* Determine eol from second-to-last line */
> - return (size = file->recs[i - 1]->size) > 1 &&
> - file->recs[i - 1]->ptr[size - 2] == '\r';
> + return (size = file->recs[i - 1].size) > 1 &&
> + file->recs[i - 1].ptr[size - 2] == '\r';
> }
>
> static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
> @@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
> static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
> xpparam_t const *xpp)
> {
> - xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
> + xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
> for (; m; m = m->next) {
> /* let's handle just the conflicts */
> if (m->mode)
> continue;
>
> while(m->chg1 && m->chg2 &&
> - recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
> + recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
> m->chg1--;
> m->chg2--;
> m->i1++;
> m->i2++;
> }
> while (m->chg1 && m->chg2 &&
> - recmatch(rec1[m->i1 + m->chg1 - 1],
> - rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
> + recmatch(&rec1[m->i1 + m->chg1 - 1],
> + &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
> m->chg1--;
> m->chg2--;
> }
> @@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
> * This probably does not work outside git, since
> * we have a very simple mmfile structure.
> */
> - t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
> - t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
> - + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
> - t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
> - t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
> - + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
> + t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
> + t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
> + + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
> + t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
> + t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
> + + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
> if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
> return -1;
> if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
> @@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
> static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
> {
> for (; chg; chg--, i++)
> - if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
> - xe->xdf2.recs[i]->size))
> + if (line_contains_alnum(xe->xdf2.recs[i].ptr,
> + xe->xdf2.recs[i].size))
> return 1;
> return 0;
> }
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index 77dc411d19..bf69a58527 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
> static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
> int pass)
> {
> - xrecord_t **records = pass == 1 ?
> + xrecord_t *records = pass == 1 ?
> map->env->xdf1.recs : map->env->xdf2.recs;
> - xrecord_t *record = records[line - 1];
> + xrecord_t *record = &records[line - 1];
> /*
> * After xdl_prepare_env() (or more precisely, due to
> * xdl_classify_record()), the "ha" member of the records (AKA lines)
> @@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
> return;
> map->entries[index].line1 = line;
> map->entries[index].hash = record->ha;
> - map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
> + map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
> if (!map->first)
> map->first = map->entries + index;
> if (map->last) {
> @@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
>
> static int match(struct hashmap *map, int line1, int line2)
> {
> - xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
> - xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
> + xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
> + xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
> return record1->ha == record2->ha;
> }
>
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 6f1d4b4725..92f9845003 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -131,7 +131,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
> xdl_free(xdf->rindex);
> xdl_free(xdf->rchg - 1);
> xdl_free(xdf->recs);
> - xdl_cha_free(&xdf->rcha);
> }
>
>
> @@ -146,8 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> xdf->rchg = NULL;
> xdf->recs = NULL;
>
> - if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
> - goto abort;
> if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
> goto abort;
>
> @@ -158,12 +155,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> hav = xdl_hash_record(&cur, top, xpp->flags);
> if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
> goto abort;
> - if (!(crec = xdl_cha_alloc(&xdf->rcha)))
> - goto abort;
> + crec = &xdf->recs[xdf->nrec++];
> crec->ptr = prev;
> crec->size = (long) (cur - prev);
> crec->ha = hav;
> - xdf->recs[xdf->nrec++] = crec;
> if (xdl_classify_record(pass, cf, crec) < 0)
> goto abort;
> }
> @@ -263,7 +258,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
> */
> static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
> long i, nm, nreff, mlim;
> - xrecord_t **recs;
> + xrecord_t *recs;
> xdlclass_t *rcrec;
> char *dis, *dis1, *dis2;
> int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
> @@ -276,7 +271,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
> mlim = XDL_MAX_EQLIMIT;
> for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
> - rcrec = cf->rcrecs[(*recs)->ha];
> + rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len2 : 0;
> dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> }
> @@ -284,7 +279,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
> mlim = XDL_MAX_EQLIMIT;
> for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
> - rcrec = cf->rcrecs[(*recs)->ha];
> + rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len1 : 0;
> dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> }
> @@ -320,13 +315,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> */
> static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
> long i, lim;
> - xrecord_t **recs1, **recs2;
> + xrecord_t *recs1, *recs2;
>
> recs1 = xdf1->recs;
> recs2 = xdf2->recs;
> for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
> i++, recs1++, recs2++)
> - if ((*recs1)->ha != (*recs2)->ha)
> + if (recs1->ha != recs2->ha)
> break;
>
> xdf1->dstart = xdf2->dstart = i;
> @@ -334,7 +329,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
> recs1 = xdf1->recs + xdf1->nrec - 1;
> recs2 = xdf2->recs + xdf2->nrec - 1;
> for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
> - if ((*recs1)->ha != (*recs2)->ha)
> + if (recs1->ha != recs2->ha)
> break;
>
> xdf1->dend = xdf1->nrec - i - 1;
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 85848f1685..3d26cbf1ec 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -45,10 +45,9 @@ typedef struct s_xrecord {
> } xrecord_t;
>
> typedef struct s_xdfile {
> - chastore_t rcha;
> + xrecord_t *recs;
> long nrec;
> long dstart, dend;
> - xrecord_t **recs;
> char *rchg;
> long *rindex;
> long nreff;
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 444a108f87..332982b509 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
> mmfile_t subfile1, subfile2;
> xdfenv_t env;
>
> - subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
> - subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
> - diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
> - subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
> - subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
> - diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
> + subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
> + subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
> + diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
> + subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
> + subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
> + diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
> if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
> return -1;
>
> --
> gitgitgadget
You weren't kidding with the --color-words callout; there's an awful
lot of places where you only change one or two characters (e.g. '->'
becoming '.'); that's much easier to see when viewing the diff with
that flag.
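To illustrate the kind of one- or two-character change being described, here is a minimal sketch rather than code from the patch. The type record_t is a simplified stand-in for xrecord_t, and the three helper functions are hypothetical names invented for this example. It shows why '->' becomes '.' at indexing sites and why callers that pass a record to a function gain a '&'.

```c
#include <assert.h>

/* Simplified stand-in for xrecord_t (assumption, not the real header). */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} record_t;

/* Before: recs is an array of pointers, so indexing yields a pointer. */
static unsigned long hash_via_pointers(record_t **recs, long i)
{
	return recs[i]->ha;
}

/* After: recs is an array of records, so indexing yields the record. */
static unsigned long hash_via_values(record_t *recs, long i)
{
	return recs[i].ha;
}

/* A callee that takes a pointer keeps its signature in both layouts. */
static long record_size(const record_t *rec)
{
	assert(rec != 0);
	return rec->size;
}
```

With the pointer layout a caller writes record_size(recs[i]); with the value layout it writes record_size(&recs[i]), which is the other recurring edit in the diff.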
Anyway, looks good.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum
2025-09-07 19:45 ` [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
@ 2025-09-09 8:58 ` Elijah Newren
0 siblings, 0 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-09 8:58 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Define macros NO(0), YES(1), MAYBE(2) as the enum values for rchg to
> make the code easier to follow. Perhaps 'rchg' should be renamed to
> 'changed'?
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiff.h | 4 ++++
> xdiff/xdiffi.c | 29 ++++++++++++++---------------
> xdiff/xhistogram.c | 8 ++++----
> xdiff/xpatience.c | 8 ++++----
> xdiff/xprepare.c | 24 ++++++++++++------------
> 5 files changed, 38 insertions(+), 35 deletions(-)
>
> diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
> index 2cecde5afe..7092879829 100644
> --- a/xdiff/xdiff.h
> +++ b/xdiff/xdiff.h
> @@ -27,6 +27,10 @@
> extern "C" {
> #endif /* #ifdef __cplusplus */
>
> +#define NO 0
> +#define YES 1
> +#define MAYBE 2
> +
> /* xpparm_t.flags */
> #define XDF_NEED_MINIMAL (1 << 0)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index a66125d44a..44fd27823a 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> */
> if (off1 == lim1) {
> for (; off2 < lim2; off2++)
> - xdf2->rchg[xdf2->rindex[off2]] = 1;
> + xdf2->rchg[xdf2->rindex[off2]] = YES;
> } else if (off2 == lim2) {
> for (; off1 < lim1; off1++)
> - xdf1->rchg[xdf1->rindex[off1]] = 1;
> + xdf1->rchg[xdf1->rindex[off1]] = YES;
> } else {
> xdpsplit_t spl;
> spl.i1 = spl.i2 = 0;
> @@ -708,7 +708,7 @@ struct xdlgroup {
> static void group_init(xdfile_t *xdf, struct xdlgroup *g)
> {
> g->start = g->end = 0;
> - while (xdf->rchg[g->end])
> + while (xdf->rchg[g->end] == YES)
You've got a few places like this where the old code would have
behaved differently if there were some MAYBE values. I presume you've
carefully vetted that those can't happen at these points in the code,
but it might be worth calling that out in the commit message for
reviewers who'll otherwise wonder if there's a behavior change that
has occurred.
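To make the concern concrete, here is a small sketch, not code from xdiff. The functions scan_truthy and scan_yes are hypothetical names that mimic the old and new loop conditions. Whether a MAYBE value can actually be present in rchg at these call sites is exactly the open question; the sketch only shows that the two conditions diverge if one is.

```c
#include <assert.h>
#include <stddef.h>

/* Values as defined by the patch in xdiff/xdiff.h. */
#define NO 0
#define YES 1
#define MAYBE 2

/* Old-style condition: any nonzero entry (YES or MAYBE) extends the run. */
static size_t scan_truthy(const char *rchg, size_t start)
{
	size_t end = start;

	assert(rchg != NULL);
	while (rchg[end])
		end++;
	return end;
}

/* New-style condition: only an exact YES extends the run. */
static size_t scan_yes(const char *rchg, size_t start)
{
	size_t end = start;

	assert(rchg != NULL);
	while (rchg[end] == YES)
		end++;
	return end;
}
```

For an array holding YES, MAYBE, YES, NO, the first scan stops at index 3 and the second at index 1. For arrays that contain only NO and YES the two scans agree, which is presumably the invariant the patch relies on.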
> g->end++;
> }
>
> @@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
> return -1;
>
> g->start = g->end + 1;
> - for (g->end = g->start; xdf->rchg[g->end]; g->end++)
> + for (g->end = g->start; xdf->rchg[g->end] == YES; g->end++)
Here's another of those where you assume MAYBE isn't possible.
> ;
>
> return 0;
> @@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
> return -1;
>
> g->end = g->start - 1;
> - for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
> + for (g->start = g->end; xdf->rchg[g->start - 1] == YES; g->start--)
...and another.
> ;
>
> return 0;
> @@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->end < xdf->nrec &&
> recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
> - xdf->rchg[g->start++] = 0;
> - xdf->rchg[g->end++] = 1;
> + xdf->rchg[g->start++] = NO;
> + xdf->rchg[g->end++] = YES;
>
> - while (xdf->rchg[g->end])
> + while (xdf->rchg[g->end] == YES)
...and another.
> g->end++;
>
> return 0;
> @@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->start > 0 &&
> recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
> - xdf->rchg[--g->start] = 1;
> - xdf->rchg[--g->end] = 0;
> + xdf->rchg[--g->start] = YES;
> + xdf->rchg[--g->end] = NO;
>
> - while (xdf->rchg[g->start - 1])
> + while (xdf->rchg[g->start - 1] == YES)
...and another.
> g->start--;
>
> return 0;
> @@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
>
> int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
> xdchange_t *cscr = NULL, *xch;
> - char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
> long i1, i2, l1, l2;
>
> /*
> * Trivial. Collects "groups" of changes and creates an edit script.
> */
> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
> - if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
> - for (l1 = i1; rchg1[i1 - 1]; i1--);
> - for (l2 = i2; rchg2[i2 - 1]; i2--);
> + if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
> + for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
> + for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
The changes in this xdl_build_script() function appear to be
orthogonal to what was described in the commit message. If it's a
separate cleanup, perhaps justify it in a separate patch?
>
> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
> xdl_free_script(cscr);
> diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
> index 4d857e8ae2..c2e85b8ab9 100644
> --- a/xdiff/xhistogram.c
> +++ b/xdiff/xhistogram.c
> @@ -318,11 +318,11 @@ redo:
>
> if (!count1) {
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.rchg[line2++ - 1] = YES;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.rchg[line1++ - 1] = YES;
> return 0;
> }
>
> @@ -335,9 +335,9 @@ redo:
> else {
> if (lcs.begin1 == 0 && lcs.begin2 == 0) {
> while (count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.rchg[line1++ - 1] = YES;
> while (count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.rchg[line2++ - 1] = YES;
> result = 0;
> } else {
> result = histogram_diff(xpp, env,
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index bf69a58527..20cda5e258 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* trivial case: one side is empty */
> if (!count1) {
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.rchg[line2++ - 1] = YES;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.rchg[line1++ - 1] = YES;
> return 0;
> }
>
> @@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* are there any matching lines at all? */
> if (!map.has_matches) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.rchg[line1++ - 1] = YES;
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.rchg[line2++ - 1] = YES;
> xdl_free(map.entries);
> return 0;
> }
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 92f9845003..36437f91bb 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -215,9 +215,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
> * current line (i) is already a multimatch line.
> */
> for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
> - if (!dis[i - r])
> + if (dis[i - r] == NO)
> rdis0++;
> - else if (dis[i - r] == 2)
> + else if (dis[i - r] == MAYBE)
> rpdis0++;
> else
> break;
> @@ -231,9 +231,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
> if (rdis0 == 0)
> return 0;
> for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
> - if (!dis[i + r])
> + if (dis[i + r] == NO)
> rdis1++;
> - else if (dis[i + r] == 2)
> + else if (dis[i + r] == MAYBE)
> rpdis1++;
> else
> break;
> @@ -273,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len2 : 0;
> - dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> + dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
> }
>
> if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
> @@ -281,26 +281,26 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len1 : 0;
> - dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> + dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
> }
>
> for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
> i <= xdf1->dend; i++, recs++) {
> - if (dis1[i] == 1 ||
> - (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> + if (dis1[i] == YES ||
> + (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> } else
> - xdf1->rchg[i] = 1;
> + xdf1->rchg[i] = YES;
> }
> xdf1->nreff = nreff;
>
> for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
> i <= xdf2->dend; i++, recs++) {
> - if (dis2[i] == 1 ||
> - (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> + if (dis2[i] == YES ||
> + (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> } else
> - xdf2->rchg[i] = 1;
> + xdf2->rchg[i] = YES;
> }
> xdf2->nreff = nreff;
>
> --
> gitgitgadget
Everything else looks like the straightforward translation you called
out in your commit message.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words
2025-09-09 8:58 ` Elijah Newren
@ 2025-09-09 13:50 ` Phillip Wood
2025-09-09 20:33 ` Junio C Hamano
2025-09-10 22:02 ` Ben Knoble
2 siblings, 0 replies; 158+ messages in thread
From: Phillip Wood @ 2025-09-09 13:50 UTC (permalink / raw)
To: Elijah Newren, Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
On 09/09/2025 09:58, Elijah Newren wrote:
> On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>> The chastore_t type is very unfriendly to Rust FFI. It's also redundant
>> since 'recs' is a vector type that grows every time an xrecord_t is
>> added.
>
> The second sentence seems to presume the reader knows what the
> chastore_t type is for, and about the confusing dual layering between
> it and recs. I liked your more extended
> explanation in https://lore.kernel.org/git/7ea2dccd71fc502f20614ce217fc9885d1b17413.1756496539.git.gitgitgadget@gmail.com/;
> could some of that be used here?
I agree that's a better explanation. I also think it would be helpful
to spell out the implications of this change. If I understand the change
correctly, we now store all the records in a contiguous array, rather
than having the records in an arena and storing a separate array of
pointers to those records. As sizeof(xrecord_t) is pretty small, the
change to contiguous storage hopefully won't cause any allocation issues,
though I guess it does mean we end up copying more data as we grow the
array compared to using an arena.
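As a rough illustration of the contiguous layout, here is a sketch under stated assumptions. The names record_t, record_vec_t, record_vec_push and record_vec_free are hypothetical, and plain realloc() with doubling stands in for whatever growth policy XDL_ALLOC_GROW actually applies.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for xrecord_t (assumption, not the real header). */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} record_t;

/* Hypothetical container: one contiguous, growable array of records. */
typedef struct {
	record_t *recs;
	long nrec;
	long alloc;
} record_vec_t;

/*
 * Append one record by value. Returns 0 on success, -1 on allocation
 * failure. Growing may move the whole array, so element pointers taken
 * earlier must not be kept across calls.
 */
static int record_vec_push(record_vec_t *v, char const *ptr, long size,
			   unsigned long ha)
{
	assert(v != NULL);
	if (v->nrec == v->alloc) {
		long n = v->alloc ? v->alloc * 2 : 4;
		record_t *p = realloc(v->recs, (size_t)n * sizeof(*p));

		if (!p)
			return -1;
		v->recs = p;
		v->alloc = n;
	}
	v->recs[v->nrec].ptr = ptr;
	v->recs[v->nrec].size = size;
	v->recs[v->nrec].ha = ha;
	v->nrec++;
	return 0;
}

/* Release the array and reset the container to its empty state. */
static void record_vec_free(record_vec_t *v)
{
	assert(v != NULL);
	free(v->recs);
	v->recs = NULL;
	v->nrec = v->alloc = 0;
}
```

The extra copying mentioned above happens inside realloc() whenever the array has to move. In exchange there is a single allocation to free, and the records sit next to each other in memory instead of behind one pointer each.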
Overall these first few patches look like a really nice cleanup.
Thanks
Phillip
>>
>> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
>> ---
>> xdiff/xdiffi.c | 24 ++++++++++----------
>> xdiff/xemit.c | 6 ++---
>> xdiff/xhistogram.c | 2 +-
>> xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
>> xdiff/xpatience.c | 10 ++++-----
>> xdiff/xprepare.c | 19 ++++++----------
>> xdiff/xtypes.h | 3 +--
>> xdiff/xutils.c | 12 +++++-----
>> 8 files changed, 63 insertions(+), 69 deletions(-)
>>
>> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
>> index 11cd090b53..a66125d44a 100644
>> --- a/xdiff/xdiffi.c
>> +++ b/xdiff/xdiffi.c
>> @@ -24,7 +24,7 @@
>>
>> static unsigned long get_hash(xdfile_t *xdf, long index)
>> {
>> - return xdf->recs[xdf->rindex[index]]->ha;
>> + return xdf->recs[xdf->rindex[index]].ha;
>> }
>>
>> #define XDL_MAX_COST_MIN 256
>> @@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
>> m->indent = -1;
>> } else {
>> m->end_of_file = 0;
>> - m->indent = get_indent(xdf->recs[split]);
>> + m->indent = get_indent(&xdf->recs[split]);
>> }
>>
>> m->pre_blank = 0;
>> m->pre_indent = -1;
>> for (i = split - 1; i >= 0; i--) {
>> - m->pre_indent = get_indent(xdf->recs[i]);
>> + m->pre_indent = get_indent(&xdf->recs[i]);
>> if (m->pre_indent != -1)
>> break;
>> m->pre_blank += 1;
>> @@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
>> m->post_blank = 0;
>> m->post_indent = -1;
>> for (i = split + 1; i < xdf->nrec; i++) {
>> - m->post_indent = get_indent(xdf->recs[i]);
>> + m->post_indent = get_indent(&xdf->recs[i]);
>> if (m->post_indent != -1)
>> break;
>> m->post_blank += 1;
>> @@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
>> static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
>> {
>> if (g->end < xdf->nrec &&
>> - recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
>> + recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
>> xdf->rchg[g->start++] = 0;
>> xdf->rchg[g->end++] = 1;
>>
>> @@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
>> static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
>> {
>> if (g->start > 0 &&
>> - recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
>> + recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
>> xdf->rchg[--g->start] = 1;
>> xdf->rchg[--g->end] = 0;
>>
>> @@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>>
>> for (xch = xscr; xch; xch = xch->next) {
>> int ignore = 1;
>> - xrecord_t **rec;
>> + xrecord_t *rec;
>> long i;
>>
>> rec = &xe->xdf1.recs[xch->i1];
>> for (i = 0; i < xch->chg1 && ignore; i++)
>> - ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
>> + ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
>>
>> rec = &xe->xdf2.recs[xch->i2];
>> for (i = 0; i < xch->chg2 && ignore; i++)
>> - ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
>> + ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
>>
>> xch->ignore = ignore;
>> }
>> @@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
>> xdchange_t *xch;
>>
>> for (xch = xscr; xch; xch = xch->next) {
>> - xrecord_t **rec;
>> + xrecord_t *rec;
>> int ignore = 1;
>> long i;
>>
>> @@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
>>
>> rec = &xe->xdf1.recs[xch->i1];
>> for (i = 0; i < xch->chg1 && ignore; i++)
>> - ignore = record_matches_regex(rec[i], xpp);
>> + ignore = record_matches_regex(&rec[i], xpp);
>>
>> rec = &xe->xdf2.recs[xch->i2];
>> for (i = 0; i < xch->chg2 && ignore; i++)
>> - ignore = record_matches_regex(rec[i], xpp);
>> + ignore = record_matches_regex(&rec[i], xpp);
>>
>> xch->ignore = ignore;
>> }
>> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
>> index 2161ac3cd0..b2f1f30cd3 100644
>> --- a/xdiff/xemit.c
>> +++ b/xdiff/xemit.c
>> @@ -25,7 +25,7 @@
>>
>> static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
>> {
>> - xrecord_t *rec = xdf->recs[ri];
>> + xrecord_t *rec = &xdf->recs[ri];
>>
>> if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
>> return -1;
>> @@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
>> static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
>> char *buf, long sz)
>> {
>> - xrecord_t *rec = xdf->recs[ri];
>> + xrecord_t *rec = &xdf->recs[ri];
>>
>> if (!xecfg->find_func)
>> return def_ff(rec->ptr, rec->size, buf, sz);
>> @@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
>>
>> static int is_empty_rec(xdfile_t *xdf, long ri)
>> {
>> - xrecord_t *rec = xdf->recs[ri];
>> + xrecord_t *rec = &xdf->recs[ri];
>> long i = 0;
>>
>> for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
>> diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
>> index 040d81e0bc..4d857e8ae2 100644
>> --- a/xdiff/xhistogram.c
>> +++ b/xdiff/xhistogram.c
>> @@ -86,7 +86,7 @@ struct region {
>> ((LINE_MAP(index, ptr))->cnt)
>>
>> #define REC(env, s, l) \
>> - (env->xdf##s.recs[l - 1])
>> + (&env->xdf##s.recs[l - 1])
>>
>> static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
>> {
>> diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
>> index af40c88a5b..fd600cbb5d 100644
>> --- a/xdiff/xmerge.c
>> +++ b/xdiff/xmerge.c
>> @@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>> int line_count, long flags)
>> {
>> int i;
>> - xrecord_t **rec1 = xe1->xdf2.recs + i1;
>> - xrecord_t **rec2 = xe2->xdf2.recs + i2;
>> + xrecord_t *rec1 = xe1->xdf2.recs + i1;
>> + xrecord_t *rec2 = xe2->xdf2.recs + i2;
>>
>> for (i = 0; i < line_count; i++) {
>> - int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
>> - rec2[i]->ptr, rec2[i]->size, flags);
>> + int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
>> + rec2[i].ptr, rec2[i].size, flags);
>> if (!result)
>> return -1;
>> }
>> @@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>>
>> static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
>> {
>> - xrecord_t **recs;
>> + xrecord_t *recs;
>> int size = 0;
>>
>> recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
>> @@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
>> if (count < 1)
>> return 0;
>>
>> - for (i = 0; i < count; size += recs[i++]->size)
>> + for (i = 0; i < count; size += recs[i++].size)
>> if (dest)
>> - memcpy(dest + size, recs[i]->ptr, recs[i]->size);
>> + memcpy(dest + size, recs[i].ptr, recs[i].size);
>> if (add_nl) {
>> - i = recs[count - 1]->size;
>> - if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
>> + i = recs[count - 1].size;
>> + if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
>> if (needs_cr) {
>> if (dest)
>> dest[size] = '\r';
>> @@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
>>
>> if (i < file->nrec - 1)
>> /* All lines before the last *must* end in LF */
>> - return (size = file->recs[i]->size) > 1 &&
>> - file->recs[i]->ptr[size - 2] == '\r';
>> + return (size = file->recs[i].size) > 1 &&
>> + file->recs[i].ptr[size - 2] == '\r';
>> if (!file->nrec)
>> /* Cannot determine eol style from empty file */
>> return -1;
>> - if ((size = file->recs[i]->size) &&
>> - file->recs[i]->ptr[size - 1] == '\n')
>> + if ((size = file->recs[i].size) &&
>> + file->recs[i].ptr[size - 1] == '\n')
>> /* Last line; ends in LF; Is it CR/LF? */
>> return size > 1 &&
>> - file->recs[i]->ptr[size - 2] == '\r';
>> + file->recs[i].ptr[size - 2] == '\r';
>> if (!i)
>> /* The only line has no eol */
>> return -1;
>> /* Determine eol from second-to-last line */
>> - return (size = file->recs[i - 1]->size) > 1 &&
>> - file->recs[i - 1]->ptr[size - 2] == '\r';
>> + return (size = file->recs[i - 1].size) > 1 &&
>> + file->recs[i - 1].ptr[size - 2] == '\r';
>> }
>>
>> static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
>> @@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
>> static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>> xpparam_t const *xpp)
>> {
>> - xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
>> + xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
>> for (; m; m = m->next) {
>> /* let's handle just the conflicts */
>> if (m->mode)
>> continue;
>>
>> while(m->chg1 && m->chg2 &&
>> - recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
>> + recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
>> m->chg1--;
>> m->chg2--;
>> m->i1++;
>> m->i2++;
>> }
>> while (m->chg1 && m->chg2 &&
>> - recmatch(rec1[m->i1 + m->chg1 - 1],
>> - rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
>> + recmatch(&rec1[m->i1 + m->chg1 - 1],
>> + &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
>> m->chg1--;
>> m->chg2--;
>> }
>> @@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>> * This probably does not work outside git, since
>> * we have a very simple mmfile structure.
>> */
>> - t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
>> - t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
>> - + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
>> - t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
>> - t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
>> - + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
>> + t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
>> + t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
>> + + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
>> + t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
>> + t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
>> + + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
>> if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
>> return -1;
>> if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
>> @@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
>> static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
>> {
>> for (; chg; chg--, i++)
>> - if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
>> - xe->xdf2.recs[i]->size))
>> + if (line_contains_alnum(xe->xdf2.recs[i].ptr,
>> + xe->xdf2.recs[i].size))
>> return 1;
>> return 0;
>> }
>> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
>> index 77dc411d19..bf69a58527 100644
>> --- a/xdiff/xpatience.c
>> +++ b/xdiff/xpatience.c
>> @@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
>> static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
>> int pass)
>> {
>> - xrecord_t **records = pass == 1 ?
>> + xrecord_t *records = pass == 1 ?
>> map->env->xdf1.recs : map->env->xdf2.recs;
>> - xrecord_t *record = records[line - 1];
>> + xrecord_t *record = &records[line - 1];
>> /*
>> * After xdl_prepare_env() (or more precisely, due to
>> * xdl_classify_record()), the "ha" member of the records (AKA lines)
>> @@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
>> return;
>> map->entries[index].line1 = line;
>> map->entries[index].hash = record->ha;
>> - map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
>> + map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
>> if (!map->first)
>> map->first = map->entries + index;
>> if (map->last) {
>> @@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
>>
>> static int match(struct hashmap *map, int line1, int line2)
>> {
>> - xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
>> - xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
>> + xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
>> + xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
>> return record1->ha == record2->ha;
>> }
>>
>> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
>> index 6f1d4b4725..92f9845003 100644
>> --- a/xdiff/xprepare.c
>> +++ b/xdiff/xprepare.c
>> @@ -131,7 +131,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
>> xdl_free(xdf->rindex);
>> xdl_free(xdf->rchg - 1);
>> xdl_free(xdf->recs);
>> - xdl_cha_free(&xdf->rcha);
>> }
>>
>>
>> @@ -146,8 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>> xdf->rchg = NULL;
>> xdf->recs = NULL;
>>
>> - if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
>> - goto abort;
>> if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
>> goto abort;
>>
>> @@ -158,12 +155,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>> hav = xdl_hash_record(&cur, top, xpp->flags);
>> if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
>> goto abort;
>> - if (!(crec = xdl_cha_alloc(&xdf->rcha)))
>> - goto abort;
>> + crec = &xdf->recs[xdf->nrec++];
>> crec->ptr = prev;
>> crec->size = (long) (cur - prev);
>> crec->ha = hav;
>> - xdf->recs[xdf->nrec++] = crec;
>> if (xdl_classify_record(pass, cf, crec) < 0)
>> goto abort;
>> }
>> @@ -263,7 +258,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
>> */
>> static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
>> long i, nm, nreff, mlim;
>> - xrecord_t **recs;
>> + xrecord_t *recs;
>> xdlclass_t *rcrec;
>> char *dis, *dis1, *dis2;
>> int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
>> @@ -276,7 +271,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>> if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
>> mlim = XDL_MAX_EQLIMIT;
>> for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
>> - rcrec = cf->rcrecs[(*recs)->ha];
>> + rcrec = cf->rcrecs[recs->ha];
>> nm = rcrec ? rcrec->len2 : 0;
>> dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
>> }
>> @@ -284,7 +279,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>> if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
>> mlim = XDL_MAX_EQLIMIT;
>> for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
>> - rcrec = cf->rcrecs[(*recs)->ha];
>> + rcrec = cf->rcrecs[recs->ha];
>> nm = rcrec ? rcrec->len1 : 0;
>> dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
>> }
>> @@ -320,13 +315,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>> */
>> static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
>> long i, lim;
>> - xrecord_t **recs1, **recs2;
>> + xrecord_t *recs1, *recs2;
>>
>> recs1 = xdf1->recs;
>> recs2 = xdf2->recs;
>> for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
>> i++, recs1++, recs2++)
>> - if ((*recs1)->ha != (*recs2)->ha)
>> + if (recs1->ha != recs2->ha)
>> break;
>>
>> xdf1->dstart = xdf2->dstart = i;
>> @@ -334,7 +329,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
>> recs1 = xdf1->recs + xdf1->nrec - 1;
>> recs2 = xdf2->recs + xdf2->nrec - 1;
>> for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
>> - if ((*recs1)->ha != (*recs2)->ha)
>> + if (recs1->ha != recs2->ha)
>> break;
>>
>> xdf1->dend = xdf1->nrec - i - 1;
>> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
>> index 85848f1685..3d26cbf1ec 100644
>> --- a/xdiff/xtypes.h
>> +++ b/xdiff/xtypes.h
>> @@ -45,10 +45,9 @@ typedef struct s_xrecord {
>> } xrecord_t;
>>
>> typedef struct s_xdfile {
>> - chastore_t rcha;
>> + xrecord_t *recs;
>> long nrec;
>> long dstart, dend;
>> - xrecord_t **recs;
>> char *rchg;
>> long *rindex;
>> long nreff;
>> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
>> index 444a108f87..332982b509 100644
>> --- a/xdiff/xutils.c
>> +++ b/xdiff/xutils.c
>> @@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
>> mmfile_t subfile1, subfile2;
>> xdfenv_t env;
>>
>> - subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
>> - subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
>> - diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
>> - subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
>> - subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
>> - diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
>> + subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
>> + subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
>> + diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
>> + subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
>> + subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
>> + diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
>> if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
>> return -1;
>>
>> --
>> gitgitgadget
>
> You weren't kidding with the --color-words callout; there's an awful
> lot of places where you only change one or two characters (e.g. '->'
> becoming '.'); that's much easier to see when viewing the diff with
> that flag.
>
> Anyway, looks good.
>
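For readers following the '->' to '.' churn in the quoted diff, here is a
minimal sketch of the layout the patch moves to: records stored by value
in one growable array instead of pointers into an arena. The names rec_t,
file_t, file_push and file_free are hypothetical stand-ins, not xdiff
identifiers, and the doubling policy is an assumption rather than what
XDL_ALLOC_GROW necessarily does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xrecord_t; not the real xdiff type. */
typedef struct {
	const char *ptr;
	long size;
	unsigned long ha;
} rec_t;

/* Hypothetical stand-in for xdfile_t: records are stored by value. */
typedef struct {
	rec_t *recs;
	long nrec;
	long alloc;
} file_t;

/* Append one record, growing the array when it is full. NULL on OOM. */
static rec_t *file_push(file_t *f, const char *ptr, long size,
			unsigned long ha)
{
	rec_t *rec;

	if (f->nrec == f->alloc) {
		/* Assumed growth policy: start at 16, then double. */
		long n = f->alloc ? f->alloc * 2 : 16;
		rec_t *tmp = realloc(f->recs, n * sizeof(*tmp));
		if (!tmp)
			return NULL;
		f->recs = tmp;
		f->alloc = n;
	}
	/* The slot itself is the record; no separate per-record allocation. */
	rec = &f->recs[f->nrec++];
	rec->ptr = ptr;
	rec->size = size;
	rec->ha = ha;
	return rec;
}

/* One free() releases every record, since they share one allocation. */
static void file_free(file_t *f)
{
	free(f->recs);
	memset(f, 0, sizeof(*f));
}
```

One consequence of this layout is that a pointer returned by file_push()
is only valid until the next call that grows the array, so callers should
index into recs rather than hold on to record pointers across appends.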
* Re: [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words
2025-09-09 8:58 ` Elijah Newren
2025-09-09 13:50 ` Phillip Wood
@ 2025-09-09 20:33 ` Junio C Hamano
2025-09-10 22:02 ` Ben Knoble
2 siblings, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-09 20:33 UTC (permalink / raw)
To: Elijah Newren; +Cc: Ezekiel Newren via GitGitGadget, git, Ezekiel Newren
Elijah Newren <newren@gmail.com> writes:
> On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> My personal bias is that things like "view with --color-words" makes
> more sense to include near the end of the commit message, just before
> the sign-offs. Not sure if others agree on that.
FWIW, I am with you. Certainly not on the title. It is even fine
immediately below the three-dash line before the diffstat.
Thanks. I am enjoying following along with the series by reading your
reviews.
* Re: [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words
2025-09-09 8:58 ` Elijah Newren
2025-09-09 13:50 ` Phillip Wood
2025-09-09 20:33 ` Junio C Hamano
@ 2025-09-10 22:02 ` Ben Knoble
2 siblings, 0 replies; 158+ messages in thread
From: Ben Knoble @ 2025-09-10 22:02 UTC (permalink / raw)
To: Elijah Newren; +Cc: Ezekiel Newren via GitGitGadget, git, Ezekiel Newren
> On Sep 9, 2025, at 04:58, Elijah Newren <newren@gmail.com> wrote:
>
> On Sun, Sep 7, 2025 at 12:46 PM Ezekiel Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> My personal bias is that things like "view with --color-words" makes
> more sense to include near the end of the commit message, just before
> the sign-offs. Not sure if others agree on that.
I’ve been using a Best-viewed-with trailer; I saw Peff do something similar once, I think.
* Re: [PATCH 00/17] Use rust types in xdiff.
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (16 preceding siblings ...)
2025-09-07 19:45 ` [PATCH 17/17] xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-16 21:56 ` Junio C Hamano
2025-09-16 22:01 ` Ezekiel Newren
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
18 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-16 21:56 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget; +Cc: git, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> This patch series has 2 parts:
>
> * Patches 1-9: Clean up xdiff, this can be merged without part 2.
> * Patches 10-17: Define Rust types in compat/rust_types.h and then start
> refactoring xdiff with Rust types. This depends on part 1.
>
> The cleanup in this patch series makes the structs xrecord_t and xdfile_t
> Rust FFI friendly. My opinion is that part 1 should be merged soon, while
> part 2 can be discussed further.
I think we saw that the earlier part was read carefully by Elijah
(and others may have read without finding anything worth commenting
on), so should we split this into two parts and start merging the
early 9 down to 'next' and then to 'master'?
Thanks.
* Re: [PATCH 00/17] Use rust types in xdiff.
2025-09-16 21:56 ` [PATCH 00/17] Use rust types in xdiff Junio C Hamano
@ 2025-09-16 22:01 ` Ezekiel Newren
2025-09-17 2:16 ` Elijah Newren
2025-09-17 6:22 ` Junio C Hamano
0 siblings, 2 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-16 22:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ezekiel Newren via GitGitGadget, git
On Tue, Sep 16, 2025 at 3:56 PM Junio C Hamano <gitster@pobox.com> wrote:
> I think we saw that the earlier part was read carefully by Elijah
> (and others may have read without finding anything worth commenting
> on), so should we split this into two parts and start merging the
> early 9 down to 'next' and then to 'master'?
I agree. 1-9 are ready to go. Do I need to create a new version of
this patch series, let it stand as is until it's been merged into
master, or something else?
* Re: [PATCH 00/17] Use rust types in xdiff.
2025-09-16 22:01 ` Ezekiel Newren
@ 2025-09-17 2:16 ` Elijah Newren
2025-09-17 13:53 ` Junio C Hamano
2025-09-17 6:22 ` Junio C Hamano
1 sibling, 1 reply; 158+ messages in thread
From: Elijah Newren @ 2025-09-17 2:16 UTC (permalink / raw)
To: Ezekiel Newren; +Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git
On Tue, Sep 16, 2025 at 3:01 PM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
>
> On Tue, Sep 16, 2025 at 3:56 PM Junio C Hamano <gitster@pobox.com> wrote:
> > I think we saw that the earlier part was read carefully by Elijah
> > (and others may have read without finding anything worth commenting
> > on), so should we split this into two parts and start merging the
> > early 9 down to 'next' and then to 'master'?
>
> I agree. 1-9 are ready to go. Do I need to create a new version of
> this patch series, let it stand as is until it's been merged into
> master, or something else?
I think 1-9 are close to ready to go, but there are several small
cleanups that would be nice to have in a v2 on patches 2, 4, 7, 8, 9.
See my comments on the patches, but it's things like adding detail to
commit messages or otherwise touching those up, removing orthogonal
style cleanups (or making them a separate patch), and removing extra
blank lines. Could we get a re-roll of just the first 9 patches with
these addressed? Then I think it'd be ready to merge down to 'next'.
* Re: [PATCH 00/17] Use rust types in xdiff.
2025-09-16 22:01 ` Ezekiel Newren
2025-09-17 2:16 ` Elijah Newren
@ 2025-09-17 6:22 ` Junio C Hamano
1 sibling, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-17 6:22 UTC (permalink / raw)
To: Ezekiel Newren; +Cc: Ezekiel Newren via GitGitGadget, git
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> On Tue, Sep 16, 2025 at 3:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>> I think we saw that the earlier part was read carefully by Elijah
>> (and others may have read without finding anything worth commenting
>> on), so should we split this into two parts and start merging the
>> early 9 down to 'next' and then to 'master'?
>
> I agree. 1-9 are ready to go. Do I need to create a new version of
> this patch series, let it stand as is until it's been merged into
> master, or something else?
I have
- en/xdiff-cleanup topic that ends with cb1c89e5 (xdiff: treat
xdfile_t.rchg like an enum, 2025-09-07);
- en/xdiff-rust topic that
- forks from v2.51.0
- merges (with --no-ff) en/xdiff-cleanup
- has the remaining patches rebased on the above merge
and 'seen' will have the first one and then the other one merged.
Later, let's merge the former to 'next' and 'master', with the
understanding that everybody is already happy with its contents and
we won't need to touch anything in that first-half of the topic
until we merge it down to 'master'.
The remainder hasn't been commented on very much (yet). If you have
improvements, please do update these topmost 8 patches so that we
can replace the part of en/xdiff-rust above the merge.
Thanks.
* Re: [PATCH 00/17] Use rust types in xdiff.
2025-09-17 2:16 ` Elijah Newren
@ 2025-09-17 13:53 ` Junio C Hamano
0 siblings, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-17 13:53 UTC (permalink / raw)
To: Elijah Newren; +Cc: Ezekiel Newren, Ezekiel Newren via GitGitGadget, git
Elijah Newren <newren@gmail.com> writes:
> On Tue, Sep 16, 2025 at 3:01 PM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
>>
>> On Tue, Sep 16, 2025 at 3:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>> > I think we saw that the earlier part was read carefully by Elijah
>> > (and others may have read without finding anything worth commenting
>> > on), so should we split this into two parts and start merging the
>> > early 9 down to 'next' and then to 'master'?
>>
>> I agree. 1-9 are ready to go. Do I need to create a new version of
>> this patch series, let it stand as is until it's been merged into
>> master, or something else?
>
> I think 1-9 are close to ready to go, but there are several small
> cleanups that would be nice to have in a v2 on patches 2, 4, 7, 8, 9.
> See my comments on the patches, but it's things like adding detail to
> commit messages or otherwise touching those up, removing orthogonal
> style cleanups (or making them a separate patch), and removing extra
> blank lines. Could we get a re-roll of just the first 9 patches with
> these addressed? Then I think it'd be ready to merge down to 'next'.
Thanks for being careful. I went back to the review thread for
additional and/or unaddressed comments, and I agree that we may want
a bit of final polish.
Thanks.
* [PATCH v2 00/10] Use rust types in xdiff.
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
` (17 preceding siblings ...)
2025-09-16 21:56 ` [PATCH 00/17] Use rust types in xdiff Junio C Hamano
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (10 more replies)
18 siblings, 11 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
Changes since v1, to address review feedback:
* Only include the cleanup patches; the remaining patches will be split
into a separate series.
* Commit message clarifications.
* Minor style cleanups.
* Performance impacts included in commit message of patch 8.
Relevant part of the original cover letter follows:
===================================================
This patch series involves ZERO Rust code and toolchains, which avoids the
debate about Rust's portability and timeline. Instead, it shows how Git can
immediately benefit from Rust's design choices without using it at all. The
rationale for using Rust types on the C and Rust side is addressed in the
commit that creates compat/rust_types.h.
This patch series has 2 parts:
* Patches 1-9: Clean up xdiff, this can be merged without part 2.
* Patches 10-17: Define Rust types in compat/rust_types.h and then start
refactoring xdiff with Rust types. This depends on part 1.
The cleanup in this patch series makes the structs xrecord_t and xdfile_t
Rust FFI friendly. My opinion is that part 1 should be merged soon, while
part 2 can be discussed further.
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
char *rchg;
long *rindex;
long nreff;
} xdfile_t;
===
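As a rough illustration of what the "after cleanup" layout means for
callers: with the separate xdfile_t.ha array gone, a line hash is reached
through rindex and then read from the record itself, which is the shape
get_hash() takes in xdiffi.c once patch 8 is applied. The sketch below
uses hypothetical stand-in types (rec_t, file_t) rather than the real
xdiff headers, so treat it as a simplified model only.

```c
#include <assert.h>

/* Hypothetical stand-in for xrecord_t after cleanup. */
typedef struct {
	const char *ptr;
	long size;
	unsigned long ha;
} rec_t;

/* Hypothetical stand-in for the fields of xdfile_t used here. */
typedef struct {
	rec_t *recs;	/* all records, stored by value */
	long nrec;
	long *rindex;	/* indices into recs of the records still in play */
	long nreff;	/* number of valid entries in rindex */
} file_t;

/*
 * Look up the hash of the index-th effective record. This replaces a
 * direct read from a parallel ha[] array with one extra indirection.
 */
static unsigned long get_hash(const file_t *f, long index)
{
	return f->recs[f->rindex[index]].ha;
}
```

The extra indirection is the likely source of the slowdown reported for
deleting ha; the benchmark numbers in the patch 8 commit message are the
place to look for the measured effect.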
Ezekiel Newren (10):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete xdl_get_rec() in xemit
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t
xdiff: delete rchg aliasing
xdiff: treat xdfile_t.rchg like an enum
xdiff/xdiff.h | 4 +
xdiff/xdiffi.c | 101 ++++++++---------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 38 +++----
xdiff/xhistogram.c | 10 +-
xdiff/xmerge.c | 56 +++++-----
xdiff/xpatience.c | 18 ++--
xdiff/xprepare.c | 262 +++++++++++++++++----------------------------
xdiff/xtypes.h | 7 +-
xdiff/xutils.c | 12 +--
10 files changed, 212 insertions(+), 307 deletions(-)
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v2
Pull-Request: https://github.com/git/git/pull/2048
Range-diff vs v1:
1: 9cf9d09c07 ! 1: 784cffcef5 xdiff: delete static forward declarations in xprepare
@@ Commit message
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
+ Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xprepare.c ##
2: 15832ad271 ! 2: b79157e64f xdiff: delete local variables and initialize/free xdfile_t directly
@@ Metadata
## Commit message ##
xdiff: delete local variables and initialize/free xdfile_t directly
- xdl_prepare_ctx() uses local variables and assigns them to the
- corresponding xdfile_t fields if there are no errors. Delete them and
- use the fields of xdfile_t directly.
+ These local variables are essentially a hand-rolled additional
+ implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
+ the code to use the existing xdl_free_ctx() function so there aren't
+ two ways to free such variables.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
@@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifie
+static void xdl_free_ctx(xdfile_t *xdf)
+{
-+
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
3: 7d5e387916 ! 3: 2e8de5be03 xdiff: delete unnecessary fields from xrecord_t and xdfile_t
@@ Commit message
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
+ Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xprepare.c ##
@@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifie
return 0;
}
-@@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
+
static void xdl_free_ctx(xdfile_t *xdf)
{
-
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
4: ab17d8c23f ! 4: ddfee67e06 xdiff: delete xdl_get_rec() in xemit
@@ xdiff/xemit.c
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
+-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
@@ xdiff/xemit.c
+{
+ xrecord_t *rec = xdf->recs[ri];
-+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
++ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
-- }
+ }
- return 0;
- }
@@ xdiff/xemit.c: static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
5: 6cf371ec13 = 5: 807ce3e5aa xdiff: delete struct diffdata_t
6: bff4568602 ! 6: 0bacb1191d xdiff: delete redundant array xdfile_t.ha
@@ Commit message
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
+ This makes the code about 5% slower. The fields rindex and ha are
+ specific to the classic diff (myers and minimal). I plan on creating a
+ struct for classic diff, but there's a lot of cleanup that needs to be
+ done before that can happen and leaving ha in would make those cleanups
+ harder to follow.
+
+ A subsequent commit will delete the chastore cha from xdfile_t. That
+ later commit will investigate deleting ha and cha independently and
+ together.
+
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xdiffi.c ##
@@ xdiff/xdiffi.c: int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
## xdiff/xprepare.c ##
@@ xdiff/xprepare.c: static void xdl_free_ctx(xdfile_t *xdf)
-
+ {
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
7: db3d4e9a89 ! 7: e1e94107c9 xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
@@ Metadata
## Commit message ##
xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
+ The fields from xdlclass_t are aliases of xrecord_t:
+ xdlclass_t.line -> xrecord_t.ptr
+ xdlclass_t.size -> xrecord_t.size
+ xdlclass_t.ha -> xrecord_t.ha
+
+ Remove aliasing from xdlclass_t, to reduce future refactoring mistakes.
+
+ Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xprepare.c ##
8: e7d1933d1c ! 8: fae26d2a04 xdiff: delete chastore from xdfile_t, view with --color-words
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: delete chastore from xdfile_t, view with --color-words
+ xdiff: delete chastore from xdfile_t
- The chastore_t type is very unfriendly to Rust FFI. It's also redundant
- since 'recs' is a vector type that grows every time an xrecord_t is
- added.
+ xdfile_t currently uses chastore_t which is an arena allocator. I
+ think that xrecord_t used to be a linked list and recs didn't exist
+ originally. When recs was added, I think removing xrecord_t.next was
+ overlooked. This dual data structure setup makes the code somewhat
+ confusing.
+ Additionally the C type chastore_t isn't FFI friendly, and provides
+ little to no performance benefit over using realloc to grow an array.
+
+ Performance impact of deleting fields from xdfile_t:
+ Deleting ha is about 5% slower.
+ Deleting cha is about 5% faster.
+
+ Delete ha, but keep cha
+ time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
+ Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
+ Range (min … max): 1.249 s … 1.286 s 10 runs
+
+ Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
+ Range (min … max): 1.320 s … 1.358 s 10 runs
+
+ Summary
+ build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
+ 1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+
+ Delete cha, but keep ha
+ time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
+ Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
+ Range (min … max): 1.288 s … 1.292 s 10 runs
+
+ Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
+ Range (min … max): 1.205 s … 1.249 s 10 runs
+
+ Summary
+ build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
+ 1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+
+ Delete ha AND chastore
+ time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
+ Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
+ Range (min … max): 1.287 s … 1.295 s 10 runs
+
+ Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+ Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
+ Range (min … max): 1.305 s … 1.308 s 10 runs
+
+ Summary
+ build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
+ 1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
+
+ Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xdiffi.c ##
@@ xdiff/xemit.c
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
- if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
@@ xdiff/xemit.c: static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
-: ---------- > 9: fd54135560 xdiff: delete rchg aliasing
9: d1657f5101 ! 10: 9a5ac3c488 xdiff: treat xdfile_t.rchg like an enum
@@ Commit message
make the code easier to follow. Perhaps 'rchg' should be renamed to
'changed'?
+ A few of the code changes might appear to change behavior, such as:
+ - while (xdf->rchg[g->start - 1])
+ + while (xdf->rchg[g->start - 1] == YES)
+ because it appears the value of MAYBE is being ignored. However, MAYBE
+ is only ever assigned to the temporary arrays (dis1 & dis2); as a last
+ step, those arrays are used to decide whether to change
+ xdfile_t.rchg[i] to YES or leave it as NO. As such, rchg will never
+ have a value of MAYBE and thus there is no behavioral change.
+
+ Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xdiff.h ##
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
g->start--;
return 0;
-@@ xdiff/xdiffi.c: int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
-
- int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
- xdchange_t *cscr = NULL, *xch;
-- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
- long i1, i2, l1, l2;
-
- /*
- * Trivial. Collects "groups" of changes and creates an edit script.
- */
- for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
-- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
-- for (l1 = i1; rchg1[i1 - 1]; i1--);
-- for (l2 = i2; rchg2[i2 - 1]; i2--);
-+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
-+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
-+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
-
- if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
- xdl_free_script(cscr);
## xdiff/xhistogram.c ##
@@ xdiff/xhistogram.c: redo:
10: 2a7d5b05c1 < -: ---------- compat/rust_types.h: define rust primitive types
11: ec54380ed3 < -: ---------- xdiff: include compat/rust_types.h
12: 182f93b60b < -: ---------- xdiff: make xrecord_t.ptr a u8 instead of char
13: f7aaef8f36 < -: ---------- xdiff: make xrecord_t.size a usize instead of long
14: af96763036 < -: ---------- xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
15: 0a180f69ff < -: ---------- xdiff: make xdfile_t.nrec a usize instead of long
16: f4eda35e24 < -: ---------- xdiff: make xdfile_t.nreff a usize instead of long
17: 00401e775a < -: ---------- xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t
--
gitgitgadget
* [PATCH v2 01/10] xdiff: delete static forward declarations in xprepare
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (9 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..a45c5ee208 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH v2 02/10] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
` (8 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.
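The single-cleanup-path idea can be sketched outside of xdiff as follows. This is a minimal illustration, not the patch itself: struct demo_ctx, demo_ctx_init(), demo_ctx_free() and the simulate_failure parameter are hypothetical names invented for the example, and plain malloc/calloc/free stand in for the xdl_* allocation helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for xdfile_t; only the ownership pattern matters. */
struct demo_ctx {
	long *index;
	char *flags;
	long n;
};

/* The one place that knows how to release a demo_ctx. */
static void demo_ctx_free(struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->index);	/* free(NULL) is a no-op */
	free(ctx->flags);
	ctx->index = NULL;
	ctx->flags = NULL;
	ctx->n = 0;
}

/*
 * Initialize the struct's fields directly instead of going through
 * local temporaries, so the error path can reuse demo_ctx_free().
 * simulate_failure (an assumption of this sketch) forces the abort
 * path after the first allocation. Returns 0 on success, -1 on failure.
 */
static int demo_ctx_init(struct demo_ctx *ctx, long n, int simulate_failure)
{
	assert(ctx != NULL);
	ctx->index = NULL;
	ctx->flags = NULL;
	ctx->n = 0;

	if (n <= 0)
		goto abort;
	ctx->index = malloc((size_t)n * sizeof(*ctx->index));
	if (!ctx->index)
		goto abort;
	if (simulate_failure)
		goto abort;
	ctx->flags = calloc((size_t)n, sizeof(*ctx->flags));
	if (!ctx->flags)
		goto abort;

	ctx->n = n;
	return 0;

abort:
	demo_ctx_free(ctx);
	return -1;
}
```

Because every pointer is set to NULL before the first allocation, the shared free function is safe to call from any point on the abort path.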
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 78 +++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208..fe02fd7925 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,81 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH v2 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
` (7 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index fe02fd7925..7acca1cb38 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,17 +125,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -155,7 +149,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -163,10 +156,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -180,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH v2 04/10] xdiff: delete xdl_get_rec() in xemit
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (2 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
` (6 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
This function aliases the fields of xrecord_t, which makes it harder
to track the usages of those fields. Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xemit.c | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..b3793e81e2 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,21 +22,11 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
-
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
}
@@ -120,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -160,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH v2 05/10] xdiff: delete struct diffdata_t
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (3 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
` (5 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH v2 06/10] xdiff: delete redundant array xdfile_t.ha
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (4 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
` (4 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
This makes the code about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
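The redundancy described above can be shown with a small self-contained sketch. The names struct demo_rec, struct demo_file, demo_get_hash() and demo_fill_ha() are hypothetical and only mirror the shape of xrecord_t, xdfile_t and the get_hash() helper; the bounds assertion is an addition of this sketch, not something the patch does.

```c
#include <assert.h>

/* Hypothetical stand-in for xrecord_t; only the hash matters here. */
struct demo_rec {
	unsigned long ha;
};

/* Hypothetical stand-in for the relevant part of xdfile_t. */
struct demo_file {
	struct demo_rec **recs;	/* every record in the file */
	long *rindex;		/* indexes into recs of the records kept for diffing */
	long nreff;		/* number of valid entries in rindex */
};

/* Look the hash up through rindex instead of reading a cached copy. */
static unsigned long demo_get_hash(const struct demo_file *f, long i)
{
	assert(i >= 0 && i < f->nreff);
	return f->recs[f->rindex[i]]->ha;
}

/*
 * Rebuild what the deleted array used to hold. Every element is derived
 * from recs and rindex, which is why the array carried no extra
 * information.
 */
static void demo_fill_ha(const struct demo_file *f, unsigned long *ha)
{
	for (long i = 0; i < f->nreff; i++)
		ha[i] = demo_get_hash(f, i);
}
```

The cached array turns each comparison into a single array read, while the helper follows rindex and then the record pointer; the sketch shows the structural trade-off only and makes no claim about measured cost.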
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 7acca1cb38..c39b65fea9 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -133,7 +133,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -146,7 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -181,8 +179,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -300,9 +296,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -312,9 +306,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH v2 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (5 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
` (3 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The fields from xdlclass_t are aliases of xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
Remove aliasing from xdlclass_t, to reduce future refactoring mistakes.
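
The idea can be sketched in isolation as follows. This is only an illustration, not the xdiff code: struct record and struct class_entry are simplified stand-ins for xrecord_t and xdlclass_t, memcmp() stands in for xdl_recmatch(), and entry_set() uses a whole-struct assignment where the patch itself copies the three fields one by one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for xrecord_t; field names follow the patch. */
struct record {
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Hypothetical class entry that embeds a record instead of aliasing its fields. */
struct class_entry {
	struct class_entry *next;
	struct record rec;
	long idx;
};

/* Returns 1 when the entry describes the same line as rec, else 0. */
int entry_matches(struct class_entry const *e, struct record const *rec)
{
	return e->rec.ha == rec->ha &&
	       e->rec.size == rec->size &&
	       !memcmp(e->rec.ptr, rec->ptr, (size_t)rec->size);
}

/* Copies every record field at once, so a field added later cannot be missed. */
void entry_set(struct class_entry *e, struct record const *rec)
{
	assert(rec->size >= 0);
	e->rec = *rec;
}
```

With the record embedded, a later rename or retype of a field in the record struct is picked up by the class entry automatically, which is the kind of refactoring mistake the commit message is trying to avoid.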
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index c39b65fea9..43cebf6721 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec.ptr = rec->ptr;
+ rcrec->rec.size = rec->size;
+ rcrec->rec.ha = rec->ha;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH v2 08/10] xdiff: delete chastore from xdfile_t
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (6 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdfile_t currently uses chastore_t, which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added, I think removing xrecord_t.next was
overlooked. This dual data structure setup makes the code somewhat
confusing.
Additionally the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
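
As a rough sketch of the replacement layout, assuming a plain realloc-based growth policy rather than xdiff's XDL_ALLOC_GROW macro: the records live in one contiguous array and are addressed by index. The names struct record_list, record_list_push() and record_list_free() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for xrecord_t. */
struct record {
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Hypothetical container: one contiguous array instead of arena + pointer array. */
struct record_list {
	struct record *recs;
	long nrec;
	long alloc;
};

/*
 * Appends one record, doubling the capacity when the array is full.
 * Returns the new slot, or NULL if realloc fails (the list is unchanged).
 */
struct record *record_list_push(struct record_list *l, char const *ptr,
				long size, unsigned long ha)
{
	struct record *slot;

	assert(l->nrec <= l->alloc);
	if (l->nrec == l->alloc) {
		long n = l->alloc ? l->alloc * 2 : 16;
		struct record *grown = realloc(l->recs, (size_t)n * sizeof(*grown));
		if (!grown)
			return NULL;
		l->recs = grown;
		l->alloc = n;
	}
	slot = &l->recs[l->nrec++];
	slot->ptr = ptr;
	slot->size = size;
	slot->ha = ha;
	return slot;
}

/* Releases the array and resets the list to empty. */
void record_list_free(struct record_list *l)
{
	free(l->recs);
	l->recs = NULL;
	l->nrec = l->alloc = 0;
}
```

One trade-off of this layout is that a pointer returned by record_list_push() is only valid until the next push, because realloc may move the array. Code that needs a stable reference should keep the index instead.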
Performance impact of deleting fields from xdfile_t:
Deleting ha alone is about 5% slower.
Deleting cha alone is about 5% faster.
Deleting both is about 1% slower.
Delete ha, but keep cha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete cha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index b3793e81e2..79c14f8b4c 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,7 +24,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 43cebf6721..5c7e858b6b 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -130,7 +130,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -145,8 +144,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -157,12 +154,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -262,7 +257,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -275,7 +270,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -283,7 +278,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -319,13 +314,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -333,7 +328,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH v2 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (7 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-18 23:56 ` Ezekiel Newren via GitGitGadget
2025-09-19 0:33 ` [PATCH v2 00/10] Use rust types in xdiff Junio C Hamano
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
10 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-18 23:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Define macros NO(0), YES(1), MAYBE(2) as the enum values for rchg to
make the code easier to follow. Perhaps 'rchg' should be renamed to
'changed'?
A few of the code changes might appear to change behavior, such as:
- while (xdf->rchg[g->start - 1])
+ while (xdf->rchg[g->start - 1] == YES)
because it appears the value of MAYBE is being ignored. However, MAYBE
is only ever assigned to a temporary array (dis1 & dis2), and as a last
step that temporary array is used to decide whether to change
xdfile_t.rchg[i] to YES or leave it as NO. As such, rchg will never
have a value of MAYBE and thus there is no behavioral change.
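
The argument can be illustrated with a reduced sketch of the final loop in xdl_cleanup_records(). It is not the real function: split_records() and discard_all() are hypothetical names, and the callback stands in for xdl_clean_mmatch(). The three-valued dis array is only read, while rchg only ever receives YES or keeps its initial NO.

```c
#include <assert.h>

#define NO 0
#define YES 1
#define MAYBE 2

/* Hypothetical stand-in for xdl_clean_mmatch(): discard every MAYBE line. */
int discard_all(char const *dis, long i, long n)
{
	(void)dis;
	(void)i;
	(void)n;
	return 1;
}

/*
 * Reads the tri-state dis[] and produces the two-state rchg[].
 * Lines that are kept are recorded in rindex[]; discarded lines are
 * marked YES in rchg[]. Returns the number of kept lines.
 */
long split_records(char const *dis, long n,
		   int (*discard_maybe)(char const *dis, long i, long n),
		   char *rchg, long *rindex)
{
	long i, nreff = 0;

	for (i = 0; i < n; i++) {
		assert(dis[i] == NO || dis[i] == YES || dis[i] == MAYBE);
		if (dis[i] == YES ||
		    (dis[i] == MAYBE && !discard_maybe(dis, i, n)))
			rindex[nreff++] = i;	/* kept for the diff algorithm */
		else
			rchg[i] = YES;		/* discarded, so reported as changed */
	}
	return nreff;
}
```

Because MAYBE appears only on the right-hand side of comparisons against dis, comparing rchg against YES elsewhere is equivalent to the old truthiness test.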
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiff.h | 4 ++++
xdiff/xdiffi.c | 22 +++++++++++-----------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 24 ++++++++++++------------
5 files changed, 35 insertions(+), 31 deletions(-)
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..7092879829 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -27,6 +27,10 @@
extern "C" {
#endif /* #ifdef __cplusplus */
+#define NO 0
+#define YES 1
+#define MAYBE 2
+
/* xpparm_t.flags */
#define XDF_NEED_MINIMAL (1 << 0)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 83c4cff6f7..44fd27823a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = YES;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = YES;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->rchg[g->end] == YES; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->rchg[g->start - 1] == YES; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->rchg[g->start++] = NO;
+ xdf->rchg[g->end++] = YES;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->rchg[--g->start] = YES;
+ xdf->rchg[--g->end] = NO;
- while (xdf->rchg[g->start - 1])
+ while (xdf->rchg[g->start - 1] == YES)
g->start--;
return 0;
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..c2e85b8ab9 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..20cda5e258 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 5c7e858b6b..c11875d07f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -214,9 +214,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* current line (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (dis[i - r] == NO)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (dis[i - r] == MAYBE)
rpdis0++;
else
break;
@@ -230,9 +230,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (dis[i + r] == NO)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (dis[i + r] == MAYBE)
rpdis1++;
else
break;
@@ -272,7 +272,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -280,26 +280,26 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (dis1[i] == YES ||
+ (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->rchg[i] = YES;
}
xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (dis2[i] == YES ||
+ (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->rchg[i] = YES;
}
xdf2->nreff = nreff;
--
gitgitgadget
* Re: [PATCH v2 00/10] Use rust types in xdiff.
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-18 23:56 ` [PATCH v2 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
@ 2025-09-19 0:33 ` Junio C Hamano
2025-09-19 0:41 ` Ezekiel Newren
2025-09-19 15:15 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
10 siblings, 2 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-19 0:33 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Changes since v1, to address review feedback.
>
> * Only include the clean up patches; The remaining patches will be split
> into a separate series.
> * Commit message clarifications.
> * Minor style cleanups.
> * Performance impacts included in commit message of patch 8.
>
>
> Relevant part of the original cover letter follows:
> ===================================================
>
> This patch series involves ZERO Rust code and toolchains, which avoids the
> debate about Rust's portability and timeline. Instead, it shows how Git can
> immediately benefit from Rust's design choices without using it at all. The
> rationale for using Rust types on the C and Rust side is addressed in the
> commit that creates compat/rust_types.h.
>
> This patch series has 2 parts:
>
> * Patches 1-9: Clean up xdiff, this can be merged without part 2.
> * Patches 10-17: Define Rust types in compat/rust_types.h and then start
> refactoring xdiff with Rust types. This depends on part 1.
This is probably stale. If the patch numbering is to be trusted, we
are missing [09/10] (at least we haven't seen it in the list archive
30 minutes after the other messages in the series landed there), so
the "clean up xdiff" stage consists of 10 patches, and this cover
letter does not need to talk about "Patches 10-17" (yet).
Will see if lore.kernel.org catches up in the morning and process
them. Thanks for working on the topic.
* Re: [PATCH v2 00/10] Use rust types in xdiff.
2025-09-19 0:33 ` [PATCH v2 00/10] Use rust types in xdiff Junio C Hamano
@ 2025-09-19 0:41 ` Ezekiel Newren
2025-09-19 15:15 ` Ezekiel Newren
1 sibling, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-19 0:41 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble
On Thu, Sep 18, 2025 at 6:33 PM Junio C Hamano <gitster@pobox.com> wrote:
> This is probably stale. If the patch numbering is to be trusted, we
> are missing [09/10] (at least we haven't seen it in the list archive
> 30 minutes after the other messages in the series landed there), so
> the "clean up xdiff" stage consists of 10 patches, and this cover
> letter does not need to talk about "Patches 10-17" (yet).
>
> Will see if lore.kernel.org catches up in the morning and process
> them. Thanks for working on the topic.
You're welcome. There are exactly 10 patches. Once this gets merged
into 'next' I can post part 2 under a different patch series. Part 2
uses u?int(8|16|32|64)_t and s?size_t everywhere.
* Re: [PATCH v2 00/10] Use rust types in xdiff.
2025-09-19 0:33 ` [PATCH v2 00/10] Use rust types in xdiff Junio C Hamano
2025-09-19 0:41 ` Ezekiel Newren
@ 2025-09-19 15:15 ` Ezekiel Newren
1 sibling, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-19 15:15 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble
On Thu, Sep 18, 2025 at 6:33 PM Junio C Hamano <gitster@pobox.com> wrote:
> Will see if lore.kernel.org catches up in the morning and process
> them. Thanks for working on the topic.
Since patch 9 hasn't shown up, I'll resubmit with a slightly cleaned
up cover letter.
* [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-19 0:33 ` [PATCH v2 00/10] Use rust types in xdiff Junio C Hamano
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (11 more replies)
10 siblings, 12 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
Changes since v2.
* No patch changes, just resending to get patch 9 to show up on the mailing
list.
* A few tweaks to the cover letter.
Changes since v1, to address review feedback.
* Only include the cleanup patches; the remaining patches will be split
into a separate series.
* Commit message clarifications.
* Minor style cleanups.
* Performance impacts included in commit message of patch 8.
Relevant part of the original cover letter follows:
===================================================
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
char *rchg;
long *rindex;
long nreff;
} xdfile_t;
===
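As a rough illustration of the recs change in the structs above (from
"xrecord_t **recs" to "xrecord_t *recs"), here is a minimal sketch. The
names rec_t, total_size_ptrs() and total_size_flat() are hypothetical and
not part of xdiff; they only contrast an array of pointers to records
with a contiguous array of records.

```c
#include <assert.h>

/* Hypothetical stand-in for xrecord_t after the cleanup. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} rec_t;

/* "Before" layout: recs is an array of pointers to records. */
static long total_size_ptrs(rec_t **recs, long nrec)
{
	long i, total = 0;

	assert(nrec >= 0);
	for (i = 0; i < nrec; i++)
		total += recs[i]->size; /* one extra pointer hop per record */
	return total;
}

/* "After" layout: recs is a contiguous array of records. */
static long total_size_flat(rec_t const *recs, long nrec)
{
	long i, total = 0;

	assert(nrec >= 0);
	for (i = 0; i < nrec; i++)
		total += recs[i].size; /* records sit next to each other */
	return total;
}
```

Both functions compute the same sum; only the way a record is reached
differs. A flat array is also described by a single pointer and a length,
which is presumably part of what the cover letter means by the structs
becoming FFI friendly.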
Ezekiel Newren (10):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete xdl_get_rec() in xemit
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t
xdiff: delete rchg aliasing
xdiff: treat xdfile_t.rchg like an enum
xdiff/xdiff.h | 4 +
xdiff/xdiffi.c | 101 ++++++++---------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 38 +++----
xdiff/xhistogram.c | 10 +-
xdiff/xmerge.c | 56 +++++-----
xdiff/xpatience.c | 18 ++--
xdiff/xprepare.c | 262 +++++++++++++++++----------------------------
xdiff/xtypes.h | 7 +-
xdiff/xutils.c | 12 +--
10 files changed, 212 insertions(+), 307 deletions(-)
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v3
Pull-Request: https://github.com/git/git/pull/2048
Range-diff vs v2:
1: 784cffcef5 = 1: 784cffcef5 xdiff: delete static forward declarations in xprepare
2: b79157e64f = 2: b79157e64f xdiff: delete local variables and initialize/free xdfile_t directly
3: 2e8de5be03 = 3: 2e8de5be03 xdiff: delete unnecessary fields from xrecord_t and xdfile_t
4: ddfee67e06 = 4: ddfee67e06 xdiff: delete xdl_get_rec() in xemit
5: 807ce3e5aa = 5: 807ce3e5aa xdiff: delete struct diffdata_t
6: 0bacb1191d = 6: 0bacb1191d xdiff: delete redundant array xdfile_t.ha
7: e1e94107c9 = 7: e1e94107c9 xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
8: fae26d2a04 = 8: fae26d2a04 xdiff: delete chastore from xdfile_t
9: fd54135560 = 9: fd54135560 xdiff: delete rchg aliasing
10: 9a5ac3c488 = 10: 1e404c3290 xdiff: treat xdfile_t.rchg like an enum
--
gitgitgadget
* [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-20 17:16 ` Junio C Hamano
2025-09-19 15:16 ` [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (10 subsequent siblings)
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
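For readers less familiar with the idiom, a small sketch of why the move
makes the prototypes unnecessary: in C, a static function only needs a
forward declaration when it is called before its definition. The names
init_step(), work_step() and prepare_all() are hypothetical and only
mirror the shape of the file after this patch.

```c
#include <assert.h>

/*
 * Helpers are defined first, so no "static int helper(...);"
 * prototypes are needed at the top of the file.
 */
static int init_step(int *state)
{
	*state = 1;
	return 0;
}

static int work_step(int *state)
{
	assert(*state == 1);
	*state += 41;
	return 0;
}

/* The public entry point comes last and can see every helper. */
int prepare_all(int *state)
{
	if (init_step(state) < 0)
		return -1;
	return work_step(state);
}
```

With this ordering, changing a helper's signature means editing one place
instead of keeping a prototype and a definition in sync.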
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..a45c5ee208 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-20 17:36 ` Junio C Hamano
2025-09-19 15:16 ` [PATCH v3 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
` (9 subsequent siblings)
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdl_prepare_ctx() builds the file context in local variables and, on
failure, frees them with what is essentially a hand-rolled copy of
xdl_free_ctx(). Initialize the fields of xdfile_t directly and call the
existing xdl_free_ctx() on the abort path, so there aren't two ways to
free such variables.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
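The single-cleanup-path idea can be sketched as follows. This is an
illustration under stated assumptions, not the code in this patch: ctx_t,
ctx_prepare() and ctx_free() are hypothetical names, and plain
calloc()/malloc()/free() stand in for the XDL_* allocation helpers. The
sketch also keeps the allocation base of the offset array in its own
field, so the cleanup does not depend on whether the "+ 1" offset has
been applied yet; that is one possible design, not what the patch does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context, loosely modeled on xdfile_t. */
typedef struct {
	char *rchg_base; /* what calloc() returned, or NULL */
	char *rchg;      /* rchg_base + 1 once setup succeeds, else NULL */
	long *rindex;
	long nrec;
} ctx_t;

/* The only place that knows how to release a ctx_t. */
static void ctx_free(ctx_t *ctx)
{
	free(ctx->rindex);
	free(ctx->rchg_base); /* free(NULL) is a no-op */
	ctx->rindex = NULL;
	ctx->rchg_base = ctx->rchg = NULL;
}

/* Returns 0 on success, -1 on failure with everything released. */
static int ctx_prepare(ctx_t *ctx, long nrec, int want_rindex)
{
	assert(nrec >= 0);

	/* NULL every owned pointer first so ctx_free() is always safe. */
	ctx->rchg_base = ctx->rchg = NULL;
	ctx->rindex = NULL;
	ctx->nrec = nrec;

	ctx->rchg_base = calloc((size_t)nrec + 2, sizeof(*ctx->rchg_base));
	if (!ctx->rchg_base)
		goto abort;
	if (want_rindex) {
		ctx->rindex = malloc(((size_t)nrec + 1) * sizeof(*ctx->rindex));
		if (!ctx->rindex)
			goto abort;
	}
	ctx->rchg = ctx->rchg_base + 1; /* makes rchg[-1] and rchg[nrec] valid */
	return 0;

abort:
	ctx_free(ctx);
	return -1;
}
```

Because every owned pointer is NULL until its allocation succeeds, the
abort path and the normal teardown can share ctx_free() no matter how far
ctx_prepare() got.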
xdiff/xprepare.c | 78 +++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208..fe02fd7925 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,81 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH v3 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
` (8 subsequent siblings)
11 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, and xdfile_t.rhash are initialized
but never read by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index fe02fd7925..7acca1cb38 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,17 +125,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -155,7 +149,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -163,10 +156,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -180,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (2 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-20 17:48 ` Junio C Hamano
2025-09-21 13:06 ` Phillip Wood
2025-09-19 15:16 ` [PATCH v3 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
` (7 subsequent siblings)
11 siblings, 2 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
This function aliases the fields of xrecord_t, which makes it harder
to track the usages of those fields. Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
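A standalone sketch of the index-based style used for is_empty_rec() in
this patch: the record is read through its own fields rather than through
copies of ptr and size that are then mutated. rec_t and rec_is_blank()
are hypothetical names, and isspace() from <ctype.h> is used here as an
assumed stand-in for XDL_ISSPACE().

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for xrecord_t; only the fields used here. */
typedef struct {
	char const *ptr;
	long size;
} rec_t;

/*
 * Returns 1 if the record holds only whitespace (or is empty), else 0.
 * The record is walked with an index, so rec->ptr and rec->size are
 * never copied into locals and modified.
 */
static int rec_is_blank(rec_t const *rec)
{
	long i = 0;

	assert(rec->size >= 0);
	while (i < rec->size && isspace((unsigned char)rec->ptr[i]))
		i++;
	return i == rec->size;
}
```

Keeping every access spelled as rec->ptr and rec->size means a search for
those field names finds all users, which is the tracking benefit the
commit message refers to.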
xdiff/xemit.c | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..b3793e81e2 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,21 +22,11 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
-
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
}
@@ -120,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -160,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH v3 05/10] xdiff: delete struct diffdata_t
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (3 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-21 13:06 ` Phillip Wood
2025-09-19 15:16 ` [PATCH v3 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
` (6 subsequent siblings)
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH v3 06/10] xdiff: delete redundant array xdfile_t.ha
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (4 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
` (5 subsequent siblings)
11 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
Deleting ha and reading the hash through rindex makes the code about 5%
slower. The fields rindex and ha are specific to the classic diff (myers
and minimal). I plan on creating a struct for classic diff, but there's
a lot of cleanup that needs to be done before that can happen, and
leaving ha in would make those cleanups harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
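The identity from the commit message can be checked on a toy example.
This is only a sketch: rec_t, file_t and hash_at() are hypothetical
names that mirror the shape of get_hash() in the diff below, with recs
still an array of pointers as it is at this point in the series.

```c
#include <assert.h>

/* Hypothetical stand-in for xrecord_t; only the hash is needed. */
typedef struct {
	unsigned long ha;
} rec_t;

/* Hypothetical stand-in for the relevant part of xdfile_t. */
typedef struct {
	rec_t **recs;  /* all records of the file */
	long *rindex;  /* positions in recs of the records kept for the diff */
	long nreff;    /* number of valid entries in rindex */
} file_t;

/* Look the hash up through rindex instead of a separate ha[] array. */
static unsigned long hash_at(file_t const *f, long i)
{
	assert(0 <= i && i < f->nreff);
	return f->recs[f->rindex[i]]->ha;
}
```

For example, with three records whose hashes are 10, 20 and 30 and
rindex = {0, 2}, the deleted array would have held ha = {10, 30}, and
hash_at() yields the same values for i = 0 and i = 1. Each lookup now
follows rindex and then a record pointer, which may be where the reported
slowdown comes from.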
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 7acca1cb38..c39b65fea9 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -133,7 +133,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -146,7 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -181,8 +179,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -300,9 +296,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -312,9 +306,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (5 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-21 13:06 ` Phillip Wood
2025-09-19 15:16 ` [PATCH v3 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
` (4 subsequent siblings)
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
These fields of xdlclass_t are aliases of fields in xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
Remove the aliasing from xdlclass_t by embedding an xrecord_t, to reduce the risk of future refactoring mistakes.
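As a rough illustration of the idea, the sketch below embeds an xrecord_t in xdlclass_t so that the three values travel together. The struct definitions are simplified stand-ins, and class_matches() and class_set_record() are hypothetical helpers that are not part of xdiff. The exact byte comparison stands in for xdl_recmatch(), and the struct assignment is a variation on the field-by-field copy used in this patch.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the xdiff types; only what this sketch needs. */
typedef struct s_xrecord {
	char const *ptr;
	long size;
	unsigned long ha;
} xrecord_t;

typedef struct s_xdlclass {
	struct s_xdlclass *next;
	xrecord_t rec;	/* replaces the separate ha, line and size fields */
	long idx;
	long len1, len2;
} xdlclass_t;

/*
 * Hypothetical helper: compare hashes first, then the bytes.
 * The memcmp() is an assumption standing in for xdl_recmatch().
 */
static int class_matches(xdlclass_t const *rcrec, xrecord_t const *rec)
{
	return rcrec->rec.ha == rec->ha &&
	       rcrec->rec.size == rec->size &&
	       !memcmp(rcrec->rec.ptr, rec->ptr, (size_t)rec->size);
}

/* Hypothetical helper: one assignment copies ptr, size and ha together. */
static void class_set_record(xdlclass_t *rcrec, xrecord_t const *rec)
{
	rcrec->rec = *rec;
}
```

With the embedded struct, a field added to xrecord_t later is carried along automatically instead of needing a matching alias in xdlclass_t.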
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index c39b65fea9..43cebf6721 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec.ptr = rec->ptr;
+ rcrec->rec.size = rec->size;
+ rcrec->rec.ha = rec->ha;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH v3 08/10] xdiff: delete chastore from xdfile_t
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (6 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 09/10] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
` (3 subsequent siblings)
11 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdfile_t currently uses chastore_t which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added, I think removing xrecord_t.next was
overlooked. This dual data structure setup makes the code somewhat
confusing.
Additionally the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
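For illustration only, here is a minimal sketch of growing a contiguous array of records with realloc, which is the general approach the paragraph above refers to. The struct rec_array container and its functions are hypothetical names; the patch itself relies on the existing XDL_ALLOC_GROW macro rather than code like this.

```c
#include <stdlib.h>

/* Simplified stand-in for the xdiff record type. */
typedef struct s_xrecord {
	char const *ptr;
	long size;
	unsigned long ha;
} xrecord_t;

/* Hypothetical container: records stored by value in one array. */
struct rec_array {
	xrecord_t *recs;
	long nrec;
	long alloc;
};

/*
 * Reserve the next slot and return its address, or NULL on failure.
 * The returned pointer is only valid until the next push, because
 * realloc() may move the whole array.
 */
static xrecord_t *rec_array_push(struct rec_array *a)
{
	if (a->nrec == a->alloc) {
		long new_alloc = a->alloc ? a->alloc * 2 : 16;
		xrecord_t *p = realloc(a->recs, (size_t)new_alloc * sizeof(*p));
		if (!p)
			return NULL;
		a->recs = p;
		a->alloc = new_alloc;
	}
	return &a->recs[a->nrec++];
}

static void rec_array_free(struct rec_array *a)
{
	free(a->recs);
	a->recs = NULL;
	a->nrec = a->alloc = 0;
}
```

Storing records by value means one allocation to free and one pointer plus a length to pass across an FFI boundary. The cost is that callers should hold indexes rather than record pointers while the array is still growing.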
Performance impact of deleting fields from xdfile_t:
Deleting only ha makes it about 5% slower.
Deleting only rcha (the chastore) makes it about 5% faster.
Deleting both makes it about 1% slower.
Delete ha, but keep rcha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete rcha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index b3793e81e2..79c14f8b4c 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,7 +24,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 43cebf6721..5c7e858b6b 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -130,7 +130,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -145,8 +144,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -157,12 +154,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -262,7 +257,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -275,7 +270,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -283,7 +278,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -319,13 +314,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -333,7 +328,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH v3 09/10] xdiff: delete rchg aliasing
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (7 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-21 13:07 ` Phillip Wood
2025-09-19 15:16 ` [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index a66125d44a..83c4cff6f7 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
--
gitgitgadget
* [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 09/10] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
@ 2025-09-19 15:16 ` Ezekiel Newren via GitGitGadget
2025-09-21 0:00 ` Junio C Hamano
2025-09-19 23:30 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff Elijah Newren
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
11 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-19 15:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren,
Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Define macros NO (0), YES (1), and MAYBE (2) as the enum values for rchg
to make the code easier to follow. Perhaps 'rchg' should be renamed to
'changed'?
A few of the code changes might appear to change behavior, such as:
- while (xdf->rchg[g->start - 1])
+ while (xdf->rchg[g->start - 1] == YES)
because it appears that the value MAYBE is being ignored. However, MAYBE
is only ever assigned to the temporary arrays (dis1 & dis2), which are
then used as a last step to decide whether to change xdfile_t.rchg[i]
to YES or leave it as NO. As such, rchg will never have a value of
MAYBE and thus there is no behavioral change.
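The argument above can be sketched as follows. This is a simplified model rather than the code in xprepare.c: finalize_discards() is a hypothetical name, and should_discard_maybe() is a hypothetical stand-in for xdl_clean_mmatch() that, purely as an assumption for this example, always says to discard.

```c
#define NO 0
#define YES 1
#define MAYBE 2

/*
 * Hypothetical stand-in for xdl_clean_mmatch(): returns nonzero when a
 * MAYBE line should be discarded. Assumed to always discard here.
 */
static int should_discard_maybe(char const *dis, long i)
{
	(void)dis;
	(void)i;
	return 1;
}

/*
 * dis holds the temporary NO/YES/MAYBE classification for n lines.
 * Kept lines get their index appended to rindex; every other line is
 * marked YES in rchg. Only YES is ever written to rchg, so MAYBE
 * cannot leak out of the temporary array.
 * Returns the number of kept lines.
 */
static long finalize_discards(char const *dis, long n, char *rchg, long *rindex)
{
	long i, nreff = 0;

	for (i = 0; i < n; i++) {
		if (dis[i] == YES ||
		    (dis[i] == MAYBE && !should_discard_maybe(dis, i)))
			rindex[nreff++] = i;
		else
			rchg[i] = YES;
	}
	return nreff;
}
```

Because rchg starts out as NO and this loop only ever stores YES, comparing rchg entries against YES is equivalent to testing them for truth.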
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiff.h | 4 ++++
xdiff/xdiffi.c | 22 +++++++++++-----------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 24 ++++++++++++------------
5 files changed, 35 insertions(+), 31 deletions(-)
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..7092879829 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -27,6 +27,10 @@
extern "C" {
#endif /* #ifdef __cplusplus */
+#define NO 0
+#define YES 1
+#define MAYBE 2
+
/* xpparm_t.flags */
#define XDF_NEED_MINIMAL (1 << 0)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 83c4cff6f7..44fd27823a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = YES;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = YES;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->rchg[g->end] == YES; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->rchg[g->start - 1] == YES; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->rchg[g->start++] = NO;
+ xdf->rchg[g->end++] = YES;
- while (xdf->rchg[g->end])
+ while (xdf->rchg[g->end] == YES)
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->rchg[--g->start] = YES;
+ xdf->rchg[--g->end] = NO;
- while (xdf->rchg[g->start - 1])
+ while (xdf->rchg[g->start - 1] == YES)
g->start--;
return 0;
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..c2e85b8ab9 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..20cda5e258 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = YES;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = YES;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 5c7e858b6b..c11875d07f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -214,9 +214,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* current line (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (dis[i - r] == NO)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (dis[i - r] == MAYBE)
rpdis0++;
else
break;
@@ -230,9 +230,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (dis[i + r] == NO)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (dis[i + r] == MAYBE)
rpdis1++;
else
break;
@@ -272,7 +272,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -280,26 +280,26 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (dis1[i] == YES ||
+ (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->rchg[i] = YES;
}
xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (dis2[i] == YES ||
+ (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->rchg[i] = YES;
}
xdf2->nreff = nreff;
--
gitgitgadget
* Re: [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-19 15:16 ` [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
@ 2025-09-19 23:30 ` Elijah Newren
2025-09-19 23:37 ` Ezekiel Newren
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
11 siblings, 1 reply; 158+ messages in thread
From: Elijah Newren @ 2025-09-19 23:30 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Phillip Wood, Ben Knoble, Ezekiel Newren
On Fri, Sep 19, 2025 at 8:16 AM Ezekiel Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Changes since v2.
>
> * No patch changes, just resending to get patch 9 to show up on the mailing
> list.
> * A few tweaks to the cover letter.
>
> Changes since v1, to address review feedback.
>
> * Only include the clean up patches; The remaining patches will be split
> into a separate series.
> * Commit message clarifications.
> * Minor style cleanups.
> * Performance impacts included in commit message of patch 8.
I read over this latest round and it addresses all my feedback from
v1. On top of all the nice code cleanups that this series provides, I
appreciate the new detailed performance comparisons in the commit
message in patch 8; while this series as a whole doesn't make the code
appreciably faster yet, it's really cool that you've highlighted
another potential performance optimization (beyond the hashing one you
already highlighted elsewhere on the list) that we'll likely be able
to realize once you get some further preparatory refactoring done.
Looking forward to it.
I think this round is ready to merge to next.
* Re: [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-19 23:30 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff Elijah Newren
@ 2025-09-19 23:37 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-19 23:37 UTC (permalink / raw)
To: Elijah Newren
Cc: Ezekiel Newren via GitGitGadget, git, Phillip Wood, Ben Knoble
On Fri, Sep 19, 2025 at 5:31 PM Elijah Newren <newren@gmail.com> wrote:
> I read over this latest round and it addresses all my feedback from
> v1. On top of all the nice code cleanups that this series provides, I
> appreciate the new detailed performance comparisons in the commit
> message in patch 8; while this series as a whole doesn't make the code
> appreciably faster yet, it's really cool that you've highlighted
> another potential performance optimization (beyond the hashing one you
> already highlighted elsewhere on the list) that we'll likely be able
> to realize once you get some further preparatory refactoring done.
> Looking forward to it.
Thanks for the positive feedback.
> I think this round is ready to merge to next.
I agree.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-19 15:16 ` [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-20 17:16 ` Junio C Hamano
2025-09-20 17:41 ` Ezekiel Newren
2025-09-20 17:46 ` Ben Knoble
0 siblings, 2 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-20 17:16 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Move xdl_prepare_env() later in the file to avoid the need
> for static forward declarations.
>
> Best-viewed-with: --color-moved
Two comments.
- This is a bit unusual to see in the trailer.
- It turned out that it was a very effective way to spot a typo for
me. You should try it yourself before you send out your patches
;-).
> -int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> - xdfenv_t *xe) {
> - long enl1, enl2, sample;
> - xdlclassifier_t cf;
> -
> - memset(&cf, 0, sizeof(cf));
> ...
> - xdl_free_ctx(&xe->xdf1);
> - xdl_free_classifier(&cf);
> - return -1;
> - }
The "--color-moved" painted the line above, with a single closing
brace, as removed, which stood out. It turns out that ...
> @@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
>
> return 0;
> }
> +
> +int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> + xdfenv_t *xe) {
> + long enl1, enl2, sample;
> + xdlclassifier_t cf;
> +
> + memset(&cf, 0, sizeof(cf));
> ...
> + xdl_free_ctx(&xe->xdf2);
> + xdl_free_ctx(&xe->xdf1);
> + xdl_free_classifier(&cf);
> + return -1;
> + }
... the corresponding line in the postimage was shown as newly
added. That was because it was indented incorrectly.
> + xdl_free_classifier(&cf);
> +
> + return 0;
> +}
If I do not spot any other issues in the series, I may just "rebase
-i" to correct this single line to reduce the risk of mistakes,
instead of asking you to send an update. We'll see.
The change is sensible, and the proposed log message does a good
job, too.
Thanks.
* Re: [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-19 15:16 ` [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-20 17:36 ` Junio C Hamano
0 siblings, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-20 17:36 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> These local variables are essentially a hand-rolled additional
> implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
> the code to use the existing xdl_free_ctx() function so there aren't
> two ways to free such variables.
Sensible.
> +static void xdl_free_ctx(xdfile_t *xdf)
> +{
> + xdl_free(xdf->rhash);
> + xdl_free(xdf->rindex);
> + xdl_free(xdf->rchg - 1);
> + xdl_free(xdf->ha);
> + xdl_free(xdf->recs);
> + xdl_cha_free(&xdf->rcha);
> +}
And I like the attention to the detail of where the opening brace is
in the "moved" existing function ;-).
> abort:
> - xdl_free(ha);
> - xdl_free(rindex);
> - xdl_free(rchg);
> - xdl_free(rhash);
> - xdl_free(recs);
> - xdl_cha_free(&xdf->rcha);
Upon an error, the original and the updated would behave a bit
differently here, as the original would not have touched xdf, other
than its rcha member, so the caller _could_ make use of the original
contents in the structure after seeing an error return. With the
new code, that is no longer possible.
Its only caller is xdl_prepare_env(), and its caller is
xdl_do_diff(), both of which pass the xdfenv_t *xe given by their
callers. There are four callers of xdl_do_diff():
xdl_fall_back_diff() in xdiff/xutils.c
xdl_merge() and xdl_refine_conflicts() in xdiff/xmerge.c
xdl_diff() in xdiff/xdiffi.c
and all of them seem to pass an uninitialized piece of memory as
xdfenv_t *xe down the callchain, so this behaviour change does not
make any difference.
> + xdl_free_ctx(xdf);
And the code certainly is safer as we know we have one place to look
at when we added a member that holds resources to xdfile_t.
> -static void xdl_free_ctx(xdfile_t *xdf) {
We know clearing/freeing side is fine, but what about initializing
side?
> static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
> xdlclassifier_t *cf, xdfile_t *xdf) {
> + long bsize;
> unsigned long hav;
> char const *blk, *cur, *top, *prev;
> xrecord_t *crec;
>
> + xdf->ha = NULL;
> + xdf->rindex = NULL;
> + xdf->rchg = NULL;
> + xdf->rhash = NULL;
> + xdf->recs = NULL;
It turns out that this is the only place that initializes xdfile_t
in xdiff/ API, so we are covered on both ends. xdiff/xprepare.c is
the only place we need to look at if we ever want to futz with
xdfile_t members, and with this change, we know there aren't two
ways to free things in it (there weren't two ways to initialize,
either, even before this patch).
Nice.
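
The pattern discussed above (clear every owning pointer up front, then
route both the error path and normal teardown through one free
function) can be sketched roughly as follows. This is only an
illustration under stated assumptions: struct demo_file,
demo_prepare_ctx(), demo_free_ctx() and the simulate_failure parameter
are hypothetical stand-ins, not the xdiff code, and unlike the real
xdl_free_ctx() this sketch also resets the pointers after freeing.

```c
#include <stdlib.h>

/* Hypothetical stand-in for xdfile_t; the fields are illustrative only. */
struct demo_file {
	long *rindex;
	char *rchg;
};

/* The single place that releases everything the struct owns. */
static void demo_free_ctx(struct demo_file *df)
{
	free(df->rindex); /* free(NULL) is a no-op, so partial setup is fine */
	free(df->rchg);
	df->rindex = NULL; /* assumption of this sketch: reset after freeing */
	df->rchg = NULL;
}

/* Returns 0 on success, -1 on failure; on failure nothing stays allocated. */
static int demo_prepare_ctx(struct demo_file *df, size_t nrec,
			    int simulate_failure)
{
	/* Clear every owning pointer first so the error path is always safe. */
	df->rindex = NULL;
	df->rchg = NULL;

	df->rindex = calloc(nrec + 1, sizeof(*df->rindex));
	if (!df->rindex)
		goto fail;
	if (simulate_failure) /* hypothetical hook to exercise the error path */
		goto fail;
	df->rchg = calloc(nrec + 2, sizeof(*df->rchg));
	if (!df->rchg)
		goto fail;
	return 0;

fail:
	demo_free_ctx(df);
	return -1;
}
```

Because the struct is written to before any allocation can fail, a
caller can no longer rely on its previous contents after an error,
which is the behaviour change described above.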
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 17:16 ` Junio C Hamano
@ 2025-09-20 17:41 ` Ezekiel Newren
2025-09-20 18:31 ` Elijah Newren
2025-09-20 17:46 ` Ben Knoble
1 sibling, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-20 17:41 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble
On Sat, Sep 20, 2025 at 11:16 AM Junio C Hamano <gitster@pobox.com> wrote:
> > Best-viewed-with: --color-moved
>
> Two comments.
>
> - This is a bit unusual to see in the trailer.
I'm still not sure what the etiquette is for including those kinds of
flags in a commit message. Could you show me my full commit message as
a response with your preferred way of adding those flags in a commit
message, please?
> - It turned out that it was a very effective way to spot a typo for
> me. You should try it yourself before you send out your patches
> ;-).
Huh, I have viewed this patch with --color-moved dozens of times. I
think CLion (My IDE of choice for C) "fixed" that for me, and I didn't
notice until you pointed it out. Elijah missed it too. Maybe my
terminal needs a more extreme contrast, so it's easier for me to spot
things like that.
> If I do not spot any other issues in the series, I may just "rebase
> -i" to correct this single line to reduce the risk of mistakes,
> instead of asking you to send an update. We'll see.
If that is the only problem then I would prefer that you fix that
single mistake, so I don't flood the mailing list.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 17:16 ` Junio C Hamano
2025-09-20 17:41 ` Ezekiel Newren
@ 2025-09-20 17:46 ` Ben Knoble
2025-09-20 18:46 ` Jeff King
1 sibling, 1 reply; 158+ messages in thread
From: Ben Knoble @ 2025-09-20 17:46 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ezekiel Newren
> On 20 Sept. 2025, at 13:16, Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>
>> Move xdl_prepare_env() later in the file to avoid the need
>> for static forward declarations.
>>
>> Best-viewed-with: --color-moved
>
> Two comments.
>
> - This is a bit unusual to see in the trailer.
This was (loosely!) my suggestion, and I think Peff has once or twice done something similar.
No harm in working it into the prose instead, and I have no stake either way beyond a mild personal preference for the trailer.
* Re: [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit
2025-09-19 15:16 ` [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-20 17:48 ` Junio C Hamano
2025-09-21 13:06 ` Phillip Wood
1 sibling, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-20 17:48 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> This function aliases the fields of xrecord_t, which makes it harder
> to track the usages of those fields. Delete it.
Can we phrase it in a way that is a bit more friendly to casual
readers? It is hard to tell if the function is serving any useful
purpose from the above. If it is doing something useful, but is
hard to read the surrounding code, that wouldn't be a good reason to
remove it.
It seems that this patch does what people often call "inlining a
function call". I am not sure your comment about aliasing or if
that is why the code is harder than necessary to read.
> -static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
> - long size, psize = strlen(pre);
> - char const *rec;
> -
> - size = xdl_get_rec(xdf, ri, &rec);
> - if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
So we used to get "rec" and "size" separately because xdl_get_rec()
made xrecord_t inaccessible to its callers. To call a helper
function that takes a <ptr, len> pair, xdl_get_rec() is the way to
grab that <ptr, len> pair out of the record index.
> +static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
> +{
> + xrecord_t *rec = xdf->recs[ri];
>
> + if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
> return -1;
> }
But by directly peeking into the xdf->recs[] array, we do not have
to. Each element of the array is the <ptr, len> pair we want.
The updated code is certainly easier to read. That applies to all
the other callers touched by this patch.
Thanks.
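
To make the before/after shape concrete, here is a small sketch of a
getter with an out-parameter next to direct access to the record. It is
an illustration only: struct demo_record, struct demo_file and the
demo_* functions are hypothetical names, isspace() stands in for
XDL_ISSPACE(), and the body of the "new" variant is a guess at the
general idea rather than the code in the patch.

```c
#include <ctype.h>

/* Hypothetical stand-ins for xrecord_t and xdfile_t. */
struct demo_record {
	const char *ptr;
	long size;
};

struct demo_file {
	struct demo_record **recs;
};

/* Old shape: the getter hands back the <ptr, len> pair separately. */
static long demo_get_rec(struct demo_file *df, long ri, const char **rec)
{
	*rec = df->recs[ri]->ptr;
	return df->recs[ri]->size;
}

/* Caller written against the getter. */
static int demo_is_empty_old(struct demo_file *df, long ri)
{
	const char *rec;
	long len = demo_get_rec(df, ri, &rec);

	while (len > 0 && isspace((unsigned char)*rec)) {
		rec++;
		len--;
	}
	return !len;
}

/* Caller that reads the record directly; the fields stay visible. */
static int demo_is_empty_new(struct demo_file *df, long ri)
{
	struct demo_record *rec = df->recs[ri];
	long i = 0;

	while (i < rec->size && isspace((unsigned char)rec->ptr[i]))
		i++;
	return i == rec->size;
}
```

Both variants answer the same question; the second keeps rec->ptr and
rec->size visible at the call site, so a search for those field names
finds every use.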
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 17:41 ` Ezekiel Newren
@ 2025-09-20 18:31 ` Elijah Newren
2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:43 ` Junio C Hamano
0 siblings, 2 replies; 158+ messages in thread
From: Elijah Newren @ 2025-09-20 18:31 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git,
Phillip Wood, Ben Knoble
On Sat, Sep 20, 2025 at 10:41 AM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
>
> On Sat, Sep 20, 2025 at 11:16 AM Junio C Hamano <gitster@pobox.com> wrote:
> > > Best-viewed-with: --color-moved
> >
> > Two comments.
> >
> > - This is a bit unusual to see in the trailer.
>
> I'm still not sure what the etiquette is for including those kinds of
> flags in a commit message. Could you show me my full commit message as
> a response with your preferred way of adding those flags in a commit
> message, please?
This command run in git.git will give lots of examples from various authors:
git log --grep="est viewed with"
I didn't call out the trailer because I saw Ben suggest it in this
thread, and he said he had seen Peff do this. But, searching right
now, I can't actually find any such trailers from Peff or anyone else
-- am I just searching wrong?
> > - It turned out that it was a very effective way to spot a typo for
> > me. You should try it yourself before you send out your patches
> > ;-).
>
> Huh, I have viewed this patch with --color-moved dozens of times. I
> think CLion (My IDE of choice for C) "fixed" that for me, and I didn't
> notice until you pointed it out. Elijah missed it too. Maybe my
> terminal needs a more extreme contrast, so it's easier for me to spot
> things like that.
Yep, I did. In fact, in v1 of the series Ezekiel didn't call out the
--color-moved thing, but I used it and mentioned it in my review, and
Ezekiel decided to mention it in v2. When I looked at the
--color-moved output, I saw the sea of purple and blue and skimmed
quickly to verify it was all purple and blue -- and apparently didn't
see the one red character amidst the purple and the one green
character amidst the blue. Since the range-diff showed no differences
to v1 and I had already thought this patch was fine from v1, I didn't
look at it any closer.
So, not only did I miss it, but I missed it despite being the one to
suggest that flag after using it myself. Oops.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 17:46 ` Ben Knoble
@ 2025-09-20 18:46 ` Jeff King
2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:52 ` Junio C Hamano
0 siblings, 2 replies; 158+ messages in thread
From: Jeff King @ 2025-09-20 18:46 UTC (permalink / raw)
To: Ben Knoble
Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git,
Elijah Newren, Phillip Wood, Ezekiel Newren
On Sat, Sep 20, 2025 at 01:46:19PM -0400, Ben Knoble wrote:
> >> Best-viewed-with: --color-moved
> >
> > Two comments.
> >
> > - This is a bit unusual to see in the trailer.
>
> This was (loosely!) my suggestion, and I think Peff has once or twice
> done something similar.
I don't think I've ever used a trailer like that, but I do sometimes
mention it in prose. I'll sometimes put it in comments below the "---"
line, though.
-Peff
PS I sometimes find:
git log --format='%(trailers:only,keyonly)' |
sort | uniq -c | sort -rn
amusing to look through for this sort of thing.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 18:31 ` Elijah Newren
@ 2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:43 ` Junio C Hamano
1 sibling, 0 replies; 158+ messages in thread
From: Ben Knoble @ 2025-09-20 22:25 UTC (permalink / raw)
To: Elijah Newren
Cc: Ezekiel Newren, Junio C Hamano, Ezekiel Newren via GitGitGadget,
git, Phillip Wood
> On 20 Sept. 2025, at 14:31, Elijah Newren <newren@gmail.com> wrote:
>
> On Sat, Sep 20, 2025 at 10:41 AM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
>>
>> On Sat, Sep 20, 2025 at 11:16 AM Junio C Hamano <gitster@pobox.com> wrote:
>>>> Best-viewed-with: --color-moved
>>>
>>> Two comments.
>>>
>>> - This is a bit unusual to see in the trailer.
>>
>> I'm still not sure what the etiquette is for including those kinds of
>> flags in a commit message. Could you show me my full commit message as
>> a response with your preferred way of adding those flags in a commit
>> message, please?
>
> This command run in git.git will give lots of examples from various authors:
> git log --grep="est viewed with"
>
> I didn't call out the trailer because I saw Ben suggest it in this
> thread, and he said he had seen Peff do this. But, searching right
> now, I can't actually find any such trailers from Peff or anyone else
> -- am I just searching wrong?
Looks like I misremembered. (Maybe I was remembering Peff’s notes, but I thought I had seen a differently-named trailer.) My apologies!
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 18:46 ` Jeff King
@ 2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:52 ` Junio C Hamano
1 sibling, 0 replies; 158+ messages in thread
From: Ben Knoble @ 2025-09-20 22:25 UTC (permalink / raw)
To: Jeff King
Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git,
Elijah Newren, Phillip Wood, Ezekiel Newren
> On 20 Sept. 2025, at 14:46, Jeff King <peff@peff.net> wrote:
>
> On Sat, Sep 20, 2025 at 01:46:19PM -0400, Ben Knoble wrote:
>
>>>> Best-viewed-with: --color-moved
>>>
>>> Two comments.
>>>
>>> - This is a bit unusual to see in the trailer.
>>
>> This was (loosely!) my suggestion, and I think Peff has once or twice
>> done something similar.
>
> I don't think I've ever used a trailer like that, but I do sometimes
> mention it in prose. I'll sometimes put it in comments below the "---"
> line, though.
>
> -Peff
Silly memory. Sorry :)
> PS I sometimes find:
>
> git log --format='%(trailers:only,keyonly)' |
> sort | uniq -c | sort -rn
>
> amusing to look through for this sort of thing.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 18:31 ` Elijah Newren
2025-09-20 22:25 ` Ben Knoble
@ 2025-09-20 22:43 ` Junio C Hamano
1 sibling, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-20 22:43 UTC (permalink / raw)
To: Elijah Newren
Cc: Ezekiel Newren, Ezekiel Newren via GitGitGadget, git,
Phillip Wood, Ben Knoble
Elijah Newren <newren@gmail.com> writes:
>> Huh, I have viewed this patch with --color-moved dozens of times. I
>> think CLion (My IDE of choice for C) "fixed" that for me, and I didn't
>> notice until you pointed it out. Elijah missed it too. Maybe my
>> terminal needs a more extreme contrast, so it's easier for me to spot
>> things like that.
>
> Yep, I did. In fact, in v1 of the series Ezekiel didn't call out the
> --color-moved thing, but I used it and mentioned it in my review, and
> Ezekiel decided to mention it in v2. When I looked at the
> --color-moved output, I saw the sea of purple and blue and skimmed
> quickly to verify it was all purple and blue -- and apparently didn't
> see the one red character amidst the purple and the one green
> character amidst the blue. Since the range-diff showed no differences
> to v1 and I had already thought this patch was fine from v1, I didn't
> look at it any closer.
>
> So, not only did I miss it, but I missed it despite being the one to
> suggest that flag after using it myself. Oops.
Heh, no harm done. With enough eyeballs, these little things will
be found ;-)
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 18:46 ` Jeff King
2025-09-20 22:25 ` Ben Knoble
@ 2025-09-20 22:52 ` Junio C Hamano
2025-09-20 23:15 ` Jeff King
1 sibling, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-20 22:52 UTC (permalink / raw)
To: Jeff King
Cc: Ben Knoble, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Phillip Wood, Ezekiel Newren
Jeff King <peff@peff.net> writes:
> I don't think I've ever used a trailer like that, but I do sometimes
> mention it in prose. I'll sometimes put it in comments below the "---"
> line, though.
Yeah, I am on the fence between in prose and below the three-dash
line, as unlike other comments that often help only those during
review (e.g., what is different relative to the previous round),
this hint is helpful to those who read "git log -p".
> PS I sometimes find:
>
> git log --format='%(trailers:only,keyonly)' |
> sort | uniq -c | sort -rn
>
> amusing to look through for this sort of thing.
The top entries are as expected (this is with --since=5.years)
24336 Signed-off-by
17501
740 Helped-by
703 Reviewed-by
495 Acked-by
485 Reported-by
420 Mentored-by
186 Co-authored-by
164 Suggested-by
but I have to wonder what the empty one is.
* Re: [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare
2025-09-20 22:52 ` Junio C Hamano
@ 2025-09-20 23:15 ` Jeff King
0 siblings, 0 replies; 158+ messages in thread
From: Jeff King @ 2025-09-20 23:15 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ben Knoble, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Phillip Wood, Ezekiel Newren
On Sat, Sep 20, 2025 at 03:52:41PM -0700, Junio C Hamano wrote:
> > PS I sometimes find:
> >
> > git log --format='%(trailers:only,keyonly)' |
> > sort | uniq -c | sort -rn
> >
> > amusing to look through for this sort of thing.
>
> The top entries are as expected (this is with --since=5.years)
>
> 24336 Signed-off-by
> 17501
> 740 Helped-by
> 703 Reviewed-by
> 495 Acked-by
> 485 Reported-by
> 420 Mentored-by
> 186 Co-authored-by
> 164 Suggested-by
>
> but I have to wonder what the empty one is.
I think %(trailers) always ends each trailer line with its own newline,
including the final one. So then --format adds its own final newline,
and you get a bunch of blank lines between commits. It's easier to see
with --format='%h %s%n%(trailers:only,keyonly)'.
I don't think you'd want to change %(trailers) to omit the final
newline, otherwise a format like:
git log --format='%h %s%n%(trailers)---'
would break.
Possibly the formatter could be more clever about adding the final
newline only when the formatted text does not itself end with one. But I
suspect some people's custom formats do rely on the current behavior, so
you probably need some kind of new option or format placeholder.
-Peff
* Re: [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-19 15:16 ` [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
@ 2025-09-21 0:00 ` Junio C Hamano
2025-09-21 0:38 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-21 0:00 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
> index 2cecde5afe..7092879829 100644
> --- a/xdiff/xdiff.h
> +++ b/xdiff/xdiff.h
> @@ -27,6 +27,10 @@
> extern "C" {
> #endif /* #ifdef __cplusplus */
>
> +#define NO 0
> +#define YES 1
> +#define MAYBE 2
<xdiff/xdiff.h> is included surprisingly widely.
I am not comfortable with the idea of exposing a set of overly
generically named macros like these, especially when they are *meant*
only to be used with xdfile_t.rchg, to those *.c files. So far,
when they include <xdiff-interface.h> (or <ll-merge.h>), they have
been rest assured that their namespaces won't be contaminated and
they would not risk stepping on others' toes as long as they stay
away from inventing their own xdsomething or s_xsomething (neither
of which is quite similar to how we name our symbols and types).
But these are names that they may legitimately want to use for their
own use.
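
One way to keep such names out of every file that includes the header
is to define them, with a prefix, inside the single .c file that needs
them. The sketch below is an assumption-laden illustration, not a
proposal for the actual patch: enum demo_dis, the DEMO_* constants and
demo_classify() are hypothetical names. The function mirrors the
ternary used for dis1[]/dis2[] in xdl_cleanup_records().

```c
/* Hypothetical file-local classification; nothing here is exported. */
enum demo_dis {
	DEMO_NO = 0,    /* no matching line on the other side */
	DEMO_YES = 1,   /* keep the line for the diff proper */
	DEMO_MAYBE = 2  /* many matches; decided later by a second pass */
};

/*
 * Same decision as
 *   (nm == 0) ? NO : (nm >= mlim && !need_min) ? MAYBE : YES
 * where nm is the number of matches on the other side and mlim is the
 * match limit.
 */
static enum demo_dis demo_classify(long nm, long mlim, int need_min)
{
	if (nm == 0)
		return DEMO_NO;
	if (nm >= mlim && !need_min)
		return DEMO_MAYBE;
	return DEMO_YES;
}
```

With the constants confined to one translation unit, other files remain
free to define their own YES/NO/MAYBE without a clash.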
* Re: [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-21 0:00 ` Junio C Hamano
@ 2025-09-21 0:38 ` Ezekiel Newren
2025-09-21 9:19 ` Phillip Wood
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 0:38 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble
On Sat, Sep 20, 2025 at 6:00 PM Junio C Hamano <gitster@pobox.com> wrote:
> I am not comfortable with the idea of exposing a set of overly
> generically named macros like these, especially when they are *meant*
> only to be used with xdfile_t.rchg, to those *.c files. So far,
> when they include <xdiff-interface.h> (or <ll-merge.h>), they have
> been rest assured that their namespaces won't be contaminated and
> they would not risk stepping on others' toes as long as they stay
> away from inventing their own xdsomething or s_xsomething (neither
> of which is quite similar to how we name our symbols and types).
What if I move NO, YES, MAYBE into xprepare.c and refactor `char rchg`
to `bool changed`? The problem with bool is that C needs to include
stdbool.h to match how Rust defines bool. git-compat-util.h didn't
include it, then it did, then it didn't because compat/posix.h
included it instead.
How do you feel about xdiff.h including compat/posix.h too? If we
don't use bool on the C side then Rust is going to be littered with
some_condition != 0 or other_condition == 0 and won't be as clear that
it's a boolean instead of a numeric type.
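
As a rough sketch of what the `bool changed` idea could look like on
the C side: the array holds true/false instead of a char compared
against 0 or 1. The function names below are hypothetical and this is
not code from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: mark records [start, start + count) as changed. */
static void demo_mark_changed(bool *changed, size_t start, size_t count)
{
	while (count--)
		changed[start++] = true;
}

/* Hypothetical helper: count how many records are marked as changed. */
static size_t demo_count_changed(const bool *changed, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		if (changed[i]) /* reads as a condition, no "!= 0" needed */
			total++;
	return total;
}
```

A three-state value such as MAYBE does not fit in a bool, so the
temporary dis arrays would still need their own type.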
* Re: [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-21 0:38 ` Ezekiel Newren
@ 2025-09-21 9:19 ` Phillip Wood
2025-09-21 16:11 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-21 9:19 UTC (permalink / raw)
To: Ezekiel Newren, Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble
On 21/09/2025 01:38, Ezekiel Newren wrote:
> On Sat, Sep 20, 2025 at 6:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>> I am not comfortable with the idea of exposing a set of overly
>> generically named macros like these, especially when they are *meant*
>> only to be used with xdfile_t.rchg, to those *.c files. So far,
>> when they include <xdiff-interface.h> (or <ll-merge.h>), they have
>> been rest assured that their namespaces won't be contaminated and
>> they would not risk stepping on others' toes as long as they stay
>> away from inventing their own xdsomething or s_xsomething (neither
>> of which is quite similar to how we name our symbols and types).
>
> What if I move NO, YES, MAYBE into xprepare.c and refactor `char rchg`
> to `bool changed`?
That would be good as it avoids the possibility of using MAYBE outside
of xprepare.c
> The problem with bool is that C needs to include
> stdbool.h to match how Rust defines bool. git-compat-util.h didn't
> include it, then it did, then it didn't because compat/posix.h
> included it instead.
75a044f748f (git-compat-util.h: split out POSIX-emulating bits,
2025-02-18) moved '#include <stdbool.h>' from "git-compat-util.h" into
"compat/posix.h" but also added '#include "compat/posix.h"' to
"git-compat-util.h" so there should be no problem.
Thanks
Phillip
>
> How do you feel about xdiff.h including compat/posix.h too? If we
> don't use bool on the C side then Rust is going to be littered with
> some_condition != 0 or other_condition == 0 and won't be as clear that
> it's a boolean instead of a numeric type.
* Re: [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit
2025-09-19 15:16 ` [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-20 17:48 ` Junio C Hamano
@ 2025-09-21 13:06 ` Phillip Wood
2025-09-21 15:07 ` Ezekiel Newren
1 sibling, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-21 13:06 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Ben Knoble, Ezekiel Newren
Hi Ezekiel
On 19/09/2025 16:16, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> This function aliases the fields of xrecord_t, which makes it harder
> to track the usages of those fields. Delete it.
Patch 6 goes the other way and introduces a getter function that hides
the field accesses so I'm not sure why this one is so bad that it needs
to be removed.
Thanks
Phillip
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xemit.c | 38 +++++++++++++-------------------------
> 1 file changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 1d40c9cb40..b3793e81e2 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -22,21 +22,11 @@
>
> #include "xinclude.h"
>
> -static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
> -
> - *rec = xdf->recs[ri]->ptr;
> -
> - return xdf->recs[ri]->size;
> -}
> -
> -
> -static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
> - long size, psize = strlen(pre);
> - char const *rec;
> -
> - size = xdl_get_rec(xdf, ri, &rec);
> - if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
> +static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
> +{
> + xrecord_t *rec = xdf->recs[ri];
>
> + if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
> return -1;
> }
>
> @@ -120,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
> static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
> char *buf, long sz)
> {
> - const char *rec;
> - long len = xdl_get_rec(xdf, ri, &rec);
> + xrecord_t *rec = xdf->recs[ri];
> +
> if (!xecfg->find_func)
> - return def_ff(rec, len, buf, sz);
> - return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
> + return def_ff(rec->ptr, rec->size, buf, sz);
> + return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
> }
>
> static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
> @@ -160,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
>
> static int is_empty_rec(xdfile_t *xdf, long ri)
> {
> - const char *rec;
> - long len = xdl_get_rec(xdf, ri, &rec);
> + xrecord_t *rec = xdf->recs[ri];
> + long i = 0;
>
> - while (len > 0 && XDL_ISSPACE(*rec)) {
> - rec++;
> - len--;
> - }
> - return !len;
> + for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
> +
> + return i == rec->size;
> }
>
> int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
* Re: [PATCH v3 05/10] xdiff: delete struct diffdata_t
2025-09-19 15:16 ` [PATCH v3 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-21 13:06 ` Phillip Wood
2025-09-21 16:03 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-21 13:06 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Ben Knoble, Ezekiel Newren
Hi Ezekiel
On 19/09/2025 16:16, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Every field in this struct is an alias for a certain field in xdfile_t.
>
> diffdata_t.nrec -> xdfile_t.nreff
> diffdata_t.ha -> xdfile_t.ha
> diffdata_t.rindex -> xdfile_t.rindex
> diffdata_t.rchg -> xdfile_t.rchg
That explains some of the changes here (so long as one assumes the
aliasing is a bad thing) but it does not explain why it is a good idea
to remove the local variables rchg[12] and rindex[12] and instead
dereference xdf[12] inside the loops
Thanks
Phillip
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 32 ++++++++------------------------
> xdiff/xdiffi.h | 11 ++---------
> 2 files changed, 10 insertions(+), 33 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 5a96e36dfb..bbf0161f84 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
> * sub-boxes by calling the box splitting function. Note that the real job
> * (marking changed lines) is done in the two boundary reaching checks.
> */
> -int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> - diffdata_t *dd2, long off2, long lim2,
> +int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> + xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
> - unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
> + unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
>
> /*
> * Shrink the box by walking through each diagonal snake (SW and NE).
> @@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> * be obviously changed.
> */
> if (off1 == lim1) {
> - char *rchg2 = dd2->rchg;
> - long *rindex2 = dd2->rindex;
> -
> for (; off2 < lim2; off2++)
> - rchg2[rindex2[off2]] = 1;
> + xdf2->rchg[xdf2->rindex[off2]] = 1;
> } else if (off2 == lim2) {
> - char *rchg1 = dd1->rchg;
> - long *rindex1 = dd1->rindex;
> -
> for (; off1 < lim1; off1++)
> - rchg1[rindex1[off1]] = 1;
> + xdf1->rchg[xdf1->rindex[off1]] = 1;
> } else {
> xdpsplit_t spl;
> spl.i1 = spl.i2 = 0;
> @@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> /*
> * ... et Impera.
> */
> - if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
> + if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
> kvdf, kvdb, spl.min_lo, xenv) < 0 ||
> - xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
> + xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
> kvdf, kvdb, spl.min_hi, xenv) < 0) {
>
> return -1;
> @@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> long ndiags;
> long *kvd, *kvdf, *kvdb;
> xdalgoenv_t xenv;
> - diffdata_t dd1, dd2;
> int res;
>
> if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
> @@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> xenv.snake_cnt = XDL_SNAKE_CNT;
> xenv.heur_min = XDL_HEUR_MIN_COST;
>
> - dd1.nrec = xe->xdf1.nreff;
> - dd1.ha = xe->xdf1.ha;
> - dd1.rchg = xe->xdf1.rchg;
> - dd1.rindex = xe->xdf1.rindex;
> - dd2.nrec = xe->xdf2.nreff;
> - dd2.ha = xe->xdf2.ha;
> - dd2.rchg = xe->xdf2.rchg;
> - dd2.rindex = xe->xdf2.rindex;
> -
> - res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
> + res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
> kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
> &xenv);
> xdl_free(kvd);
> diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
> index 126c9d8ff4..49e52c67f9 100644
> --- a/xdiff/xdiffi.h
> +++ b/xdiff/xdiffi.h
> @@ -24,13 +24,6 @@
> #define XDIFFI_H
>
>
> -typedef struct s_diffdata {
> - long nrec;
> - unsigned long const *ha;
> - long *rindex;
> - char *rchg;
> -} diffdata_t;
> -
> typedef struct s_xdalgoenv {
> long mxcost;
> long snake_cnt;
> @@ -46,8 +39,8 @@ typedef struct s_xdchange {
>
>
>
> -int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
> - diffdata_t *dd2, long off2, long lim2,
> +int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> + xdfile_t *xdf2, long off2, long lim2,
> long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
> int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
> xdfenv_t *xe);
* Re: [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-19 15:16 ` [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-21 13:06 ` Phillip Wood
2025-09-21 16:07 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-21 13:06 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Ben Knoble, Ezekiel Newren
Hi Ezekiel
On 19/09/2025 16:16, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> The fields from xdlclass_t are aliases of xrecord_t:
> xdlclass_t.line -> xrecord_t.ptr
> xdlclass_t.size -> xrecord_t.size
> xdlclass_t.ha -> xrecord_t.ha
>
> Remove aliasing from xdlclass_t, to reduce future refactoring mistakes.
This is a rather nebulous reason. I assume this is about changing the
types used in xrecord_t in which case it would be helpful to say
something like
xdlclass_t carries a copy of the data in xrecord_t, but instead of
embedding xrecord_t it duplicates the individual fields. A future commit
will change the types used in xrecord_t so embed it in xdlclass_t first
so we don't have to remember to change the types here as well.
As we're embedding the struct, instead of doing
> - rcrec->line = line;
> - rcrec->size = rec->size;
> - rcrec->ha = rec->ha;
> + rcrec->rec.ptr = rec->ptr;
> + rcrec->rec.size = rec->size;
> + rcrec->rec.ha = rec->ha;
it would be simpler to do
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec = *rec;
which would make it clear we're copying all the struct members.
Thanks
Phillip
* Re: [PATCH v3 09/10] xdiff: delete rchg aliasing
2025-09-19 15:16 ` [PATCH v3 09/10] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
@ 2025-09-21 13:07 ` Phillip Wood
2025-09-21 16:37 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-21 13:07 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Ben Knoble, Ezekiel Newren
Hi Ezekiel
What is the purpose of this change? On the face of it it makes the code
more verbose and introduces an extra pointer dereference into the loop
condition. The compiler may lift the dereference out of the loop but it
would be helpful to know why this change is useful.
Thanks
Phillip
On 19/09/2025 16:16, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Best-viewed-with: --color-words
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index a66125d44a..83c4cff6f7 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
>
> int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
> xdchange_t *cscr = NULL, *xch;
> - char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
> long i1, i2, l1, l2;
>
> /*
> * Trivial. Collects "groups" of changes and creates an edit script.
> */
> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
> - if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
> - for (l1 = i1; rchg1[i1 - 1]; i1--);
> - for (l2 = i2; rchg2[i2 - 1]; i2--);
> + if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
> + for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
> + for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
>
> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
> xdl_free_script(cscr);
* Re: [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit
2025-09-21 13:06 ` Phillip Wood
@ 2025-09-21 15:07 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 15:07 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble
On Sun, Sep 21, 2025 at 7:06 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> Patch 6 goes the other way and introduces a getter function that hides
> the field accesses so I'm not sure why this one is so bad that it needs
> to be removed.
I've added a copy below of the two functions for easy reference.
To quote myself[1]:
```
The fields rindex and ha of xdfile_t are specific to the classic diff
(myers and minimal). I plan on creating a struct for classic diff, but
there's a lot of cleanup that needs to be done before that can happen,
and leaving ha in would make those cleanups harder to follow.
```
get_hash() is a scaffolding function that will reduce refactor churn.
It changes a few times in this patch series alone, and will change a
few more times before the code is cleaned up enough to delete it. By
contrast, xdl_get_rec() merely performs an array index, which is so
trivial that it doesn't justify having its own function.
get_hash() reduces confusion because xdfile_t.ha is an array that is a
sparse copy of xrecord_t.ha values from xdfile_t.recs. The field
xrecord_t.ha is confusing on its own, as it is first used to store the
hash of the line and later repurposed as a minimal perfect hash[2].
static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
*rec = xdf->recs[ri]->ptr;
return xdf->recs[ri]->size;
}
static unsigned long get_hash(xdfile_t *xdf, long index)
{
return xdf->recs[xdf->rindex[index]]->ha;
}
[1] https://lore.kernel.org/git/0bacb1191dad2748d2afa79665f1293b0381bde1.1758294992.git.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/af96763036e13480ed4e6dfedcade5b2c90e414c.1757274320.git.gitgitgadget@gmail.com/
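For readers following along, the relationship between the deleted
xdfile_t.ha array and get_hash() can be sketched as below. This is only
an illustration under stated assumptions, not the xdiff code: the
sketch_* names and the reduced structs are hypothetical, and recs is
assumed to still be an array of pointers, as it is at this point in the
series.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for xrecord_t. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} sketch_record_t;

/* Hypothetical, reduced stand-in for xdfile_t. */
typedef struct {
	sketch_record_t **recs; /* one pointer per line */
	long nrec;
	long *rindex;           /* indices of the lines the classic diff keeps */
	long nreff;             /* number of valid entries in rindex */
} sketch_file_t;

/* Same shape as get_hash(): look the hash up through rindex on demand. */
static unsigned long sketch_get_hash(sketch_file_t *xdf, long index)
{
	assert(index >= 0 && index < xdf->nreff);
	return xdf->recs[xdf->rindex[index]]->ha;
}

/* What a separate ha array would cache: a sparse copy of the record hashes. */
static void sketch_fill_ha(sketch_file_t *xdf, unsigned long *ha)
{
	long i;

	for (i = 0; i < xdf->nreff; i++)
		ha[i] = xdf->recs[xdf->rindex[i]]->ha;
}
```

Both helpers read the same values; the getter simply trades the cached
copy for an extra indirection on each lookup.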
* Re: [PATCH v3 05/10] xdiff: delete struct diffdata_t
2025-09-21 13:06 ` Phillip Wood
@ 2025-09-21 16:03 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 16:03 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble
On Sun, Sep 21, 2025 at 7:06 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> > Every field in this struct is an alias for a certain field in xdfile_t.
> >
> > diffdata_t.nrec -> xdfile_t.nreff
> > diffdata_t.ha -> xdfile_t.ha
> > diffdata_t.rindex -> xdfile_t.rindex
> > diffdata_t.rchg -> xdfile_t.rchg
>
> That explains some of the changes here (so long as one assumes the
> aliasing is a bad thing) but it does not explain why it is a good idea
> to remove the local variables rchg[12] and rindex[12] and instead
> dereference xdf[12] inside the loops
I removed the struct and local variable aliases to make it easier for
usage-tracking tools to follow where fields are actually used. Whether
someone relies on grep, ctags/etags, or more modern tooling, aliases
hide the true field accesses and make it harder to track or refactor
those fields consistently. By using the fields directly, every
reference is literal and unambiguous. Also the local variable names
don't add any meaning and only barely shorten the lines where they are
used.
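A small side-by-side sketch of the two styles being discussed, for
illustration only. The sketch_* names and the two-field struct are
hypothetical stand-ins, not the real xdfile_t; the point is just that
both loops do the same work while only the second spells out the field
on every access.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for xdfile_t. */
typedef struct {
	char *rchg;
	long *rindex;
} sketch_file_t;

/* Old style: local aliases hide which struct fields are being touched. */
static void sketch_mark_aliased(sketch_file_t *xdf, long off, long lim)
{
	char *rchg = xdf->rchg;
	long *rindex = xdf->rindex;

	assert(off <= lim);
	for (; off < lim; off++)
		rchg[rindex[off]] = 1;
}

/* New style: every access names the field, so a search for "->rchg" finds it. */
static void sketch_mark_direct(sketch_file_t *xdf, long off, long lim)
{
	assert(off <= lim);
	for (; off < lim; off++)
		xdf->rchg[xdf->rindex[off]] = 1;
}
```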
* Re: [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-21 13:06 ` Phillip Wood
@ 2025-09-21 16:07 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 16:07 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble
On Sun, Sep 21, 2025 at 7:06 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> > The fields from xdlclass_t are aliases of xrecord_t:
> > xdlclass_t.line -> xrecord_t.ptr
> > xdlclass_t.size -> xrecord_t.size
> > xdlclass_t.ha -> xrecord_t.ha
> >
> > Remove aliasing from xdlclass_t, to reduce future refactoring mistakes.
>
> This is a rather nebulous reason. I assume this is about changing the
> types used in xrecord_t.
Yes, this is a stepping stone to many more refactorings that I have planned.
> in which case it would be helpful to say something like:
> xdlclass_t carries a copy of the data in xrecord_t, but instead of
> embedding xrecord_t it duplicates the individual fields. A future commit
> will change the types used in xrecord_t so embed it in xdlclass_t first
> so we don't have to remember to change the types here as well.
I've incorporated your wording into my commit message.
> As we're embedding the struct, instead of doing
>
> > - rcrec->line = line;
> > - rcrec->size = rec->size;
> > - rcrec->ha = rec->ha;
> > + rcrec->rec.ptr = rec->ptr;
> > + rcrec->rec.size = rec->size;
> > + rcrec->rec.ha = rec->ha;
>
> it would be simpler to do
>
> - rcrec->line = line;
> - rcrec->size = rec->size;
> - rcrec->ha = rec->ha;
> + rcrec->rec = *rec;
>
> which would make it clear we're copying all the struct members.
Makes sense. I'll incorporate that.
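As a minimal sketch of the embedding being agreed on here: once the
record is stored by value inside the class struct, one struct assignment
copies every member. The sketch_* names and reduced structs are
hypothetical; the only assumption carried over from the patch is that
the source record is reached through a pointer, so it has to be
dereferenced in the assignment.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for xrecord_t. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} sketch_record_t;

/* Hypothetical, reduced stand-in for xdlclass_t. */
typedef struct sketch_class {
	struct sketch_class *next;
	sketch_record_t rec; /* embedded by value, not a pointer */
	long len1, len2;
} sketch_class_t;

static void sketch_class_set(sketch_class_t *rcrec, sketch_record_t const *rec)
{
	assert(rcrec && rec);
	rcrec->rec = *rec; /* copies ptr, size and ha in one statement */
	rcrec->len1 = rcrec->len2 = 0;
}
```

If a field of the record later changes type, the assignment keeps
working without any edit at this call site.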
* Re: [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum
2025-09-21 9:19 ` Phillip Wood
@ 2025-09-21 16:11 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 16:11 UTC (permalink / raw)
To: phillip.wood
Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git,
Elijah Newren, Ben Knoble
On Sun, Sep 21, 2025 at 3:19 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> > What if I move NO, YES, MAYBE into xprepare.c and refactor `char rchg`
> > to `bool changed`?
>
> That would be good as it avoids the possibility of using MAYBE outside
> of xprepare.c
Ok, I'll do that.
> > The problem with bool is that C needs to include
> > stdbool.h to match how Rust defines bool. git-compat-util.h didn't
> > include it, then it did, then it didn't because compat/posix.h
> > included it instead.
>
> 75a044f748f (git-compat-util.h: split out POSIX-emulating bits,
> 2025-02-18) moved '#include <stdbool.h>' from "git-compat-util.h" into
> "compat/posix.h" but also added '#include "compat/posix.h"' to
> "git-compat-util.h" so there should be no problem.
Oh, I missed that when I was reading through git-compat-util.h. I just
searched for `stdbool.h` and saw that it was in compat/posix.h instead
of git-compat-util.h.
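To make the bool question concrete, here is a hedged sketch of why a
three-state value has to stay out of a bool field. The NO/YES/MAYBE
values match the macros discussed above; everything else (the sketch_*
helpers, and in particular treating YES as "mark changed" and resolving
MAYBE with a caller-supplied flag) is an assumption made purely for
illustration and is not how xprepare.c decides.

```c
#include <assert.h>
#include <stdbool.h>

#define NO 0
#define YES 1
#define MAYBE 2

/* A C bool only has two states: any non-zero value converts to true (1). */
static bool sketch_as_bool(int v)
{
	return v;
}

/*
 * Hypothetical final pass: fold a tri-state scratch array into the
 * two-state result array. MAYBE never reaches the bool array.
 */
static void sketch_resolve(char const *dis, bool *changed, long n,
			   bool maybe_is_changed)
{
	long i;

	for (i = 0; i < n; i++) {
		assert(dis[i] == NO || dis[i] == YES || dis[i] == MAYBE);
		if (dis[i] == YES || (dis[i] == MAYBE && maybe_is_changed))
			changed[i] = true;
	}
}
```

Keeping the scratch array as char and the result as bool means the
two-state field can be declared with a type that Rust's bool can mirror.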
* Re: [PATCH v3 09/10] xdiff: delete rchg aliasing
2025-09-21 13:07 ` Phillip Wood
@ 2025-09-21 16:37 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-21 16:37 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble
On Sun, Sep 21, 2025 at 7:06 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> What is the purpose of this change? On the face of it it makes the code
> more verbose and introduces an extra pointer dereference into the loop
> condition. The compiler may lift the dereference out of the loop but it
> would be helpful to know why this change is useful.
Most of Git accesses rchg directly, so doing the same here makes the
code more consistent. Also, when a field is reached through a local
alias, usage-tracking tools like ctags or a modern IDE won't show all
of its uses, which makes the field harder to refactor or audit.
Dropping the aliases means every access is visible and discoverable
through the actual field name.
* [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
` (10 preceding siblings ...)
2025-09-19 23:30 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff Elijah Newren
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (13 more replies)
11 siblings, 14 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
Changes since v3.
* Address review feedback.
* Split the deletion of xdl_get_rec() into 2 commits.
* Move NO, YES, MAYBE into xprepare.c, and use bool literals.
* Refactor 'char *rchg' to 'bool *changed'.
Changes since v2.
* No patch changes, just resending to get patch 9 to show up on the mailing
list.
* A few tweaks to the cover letter.
Changes since v1, to address review feedback.
* Only include the cleanup patches; the remaining patches will be split
into a separate series.
* Commit message clarifications.
* Minor style cleanups.
* Performance impacts included in commit message of patch 8.
Relevant part of the original cover letter follows:
===================================================
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
char *rchg;
long *rindex;
long nreff;
} xdfile_t;
===
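One visible effect of the change from 'xrecord_t **recs' to
'xrecord_t *recs' is how a single record is reached. The following is a
rough sketch with hypothetical sketch_* names and reduced structs, not
the xdiff definitions themselves.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for xrecord_t after cleanup. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} sketch_record_t;

/* "Before" layout: an array of pointers to separately stored records. */
typedef struct {
	sketch_record_t **recs;
	long nrec;
} sketch_file_before_t;

/* "After cleanup" layout: one contiguous array of records. */
typedef struct {
	sketch_record_t *recs;
	long nrec;
} sketch_file_after_t;

static sketch_record_t *sketch_rec_before(sketch_file_before_t *xdf, long ri)
{
	assert(ri >= 0 && ri < xdf->nrec);
	return xdf->recs[ri];  /* the element already is a pointer */
}

static sketch_record_t *sketch_rec_after(sketch_file_after_t *xdf, long ri)
{
	assert(ri >= 0 && ri < xdf->nrec);
	return &xdf->recs[ri]; /* take the address of the element */
}
```

A pointer plus a length over contiguous records is also the shape that
maps most directly onto a slice on the Rust side.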
Ezekiel Newren (12):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete superfluous function xdl_get_rec() in xemit
xdiff: delete superfluous local variables that alias fields in
xrecord_t
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t
xdiff: delete rchg aliasing
xdiff: use bool literals for xdfile_t.rchg
xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t
xdiff/xdiffi.c | 101 ++++++++---------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 38 +++----
xdiff/xhistogram.c | 10 +-
xdiff/xmerge.c | 56 +++++-----
xdiff/xpatience.c | 18 ++--
xdiff/xprepare.c | 263 +++++++++++++++++----------------------------
xdiff/xtypes.h | 9 +-
xdiff/xutils.c | 16 +--
9 files changed, 212 insertions(+), 310 deletions(-)
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v4
Pull-Request: https://github.com/git/git/pull/2048
Range-diff vs v3:
1: 784cffcef5 ! 1: 79d1099656 xdiff: delete static forward declarations in xprepare
@@ xdiff/xprepare.c: static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xd
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
-+ }
++ }
+
+ xdl_free_classifier(&cf);
+
2: b79157e64f = 2: 9142f28fcd xdiff: delete local variables and initialize/free xdfile_t directly
3: 2e8de5be03 = 3: 13f00f5683 xdiff: delete unnecessary fields from xrecord_t and xdfile_t
-: ---------- > 4: 311279c123 xdiff: delete superfluous function xdl_get_rec() in xemit
4: ddfee67e06 ! 5: d84658ac83 xdiff: delete xdl_get_rec() in xemit
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: delete xdl_get_rec() in xemit
+ xdiff: delete superfluous local variables that alias fields in xrecord_t
- This function aliases the fields of xrecord_t, which makes it harder
- to track the usages of those fields. Delete it.
+ Use the type xrecord_t as the local variable for the functions in the
+ file xdiff/xemit.c.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xemit.c ##
@@
-
#include "xinclude.h"
--static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
--
-- *rec = xdf->recs[ri]->ptr;
--
-- return xdf->recs[ri]->size;
--}
--
--
+
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
-- char const *rec;
--
-- size = xdl_get_rec(xdf, ri, &rec);
-- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+- char const *rec = xdf->recs[ri]->ptr;
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
-+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
+- size = xdf->recs[ri]->size;
+- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
++ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
- }
+ return 0;
@@ xdiff/xemit.c: static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
-- const char *rec;
-- long len = xdl_get_rec(xdf, ri, &rec);
+- const char *rec = xdf->recs[ri]->ptr;
+- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
@@ xdiff/xemit.c: static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg
static int is_empty_rec(xdfile_t *xdf, long ri)
{
-- const char *rec;
-- long len = xdl_get_rec(xdf, ri, &rec);
+- const char *rec = xdf->recs[ri]->ptr;
+- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
5: 807ce3e5aa = 6: bf16453846 xdiff: delete struct diffdata_t
6: 0bacb1191d ! 7: 4ef7f243e9 xdiff: delete redundant array xdfile_t.ha
@@ Commit message
This makes the code about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
- struct for classic diff, but there'a alot of cleanup that needs to be
+ struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
7: e1e94107c9 ! 8: 3b6c2127c4 xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
@@ Commit message
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
- Remove aliasing from xdlclass_t, to reduce future refactoring mistakes.
+ xdlclass_t carries a copy of the data in xrecord_t, but instead of
+ embedding xrecord_t it duplicates the individual fields. A future
+ commit will change the types used in xrecord_t so embed it in
+ xdlclass_t first, so we don't have to remember to change the types
+ here as well.
Best-viewed-with: --color-words
+ Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xprepare.c ##
@@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifie
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
-+ rcrec->rec.ptr = rec->ptr;
-+ rcrec->rec.size = rec->size;
-+ rcrec->rec.ha = rec->ha;
++ rcrec->rec = *rec;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
8: fae26d2a04 ! 9: f7b5021e48 xdiff: delete chastore from xdfile_t
@@ xdiff/xemit.c
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
- if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0) {
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
@@ xdiff/xemit.c: static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
9: fd54135560 = 10: 97135495e2 xdiff: delete rchg aliasing
10: 1e404c3290 ! 11: b544c15a67 xdiff: treat xdfile_t.rchg like an enum
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: treat xdfile_t.rchg like an enum
+ xdiff: use bool literals for xdfile_t.rchg
- Define macros NO(0), YES(1), MAYBE(2) as the enum values for rchg to
- make the code easier to follow. Perhaps 'rchg' should be renamed to
- 'changed'?
-
- A few of the code changes might appear to change behavior, such as:
- - while (xdf->rchg[g->start - 1])
- + while (xdf->rchg[g->start - 1] == YES)
- because it appears the value of MAYBE is being ignored. However, MAYBE
- is only ever assigned as a value to a temporary array (dis1 & dis2) and
- then as a last step use that temporary array to decide if it wants to
- change xdfile_t.rchg[i] to YES or leave it as NO. As such, rchg will
- never have a value of MAYBE and thus there is no behavioral change.
+ Define macros NO(0), YES(1), MAYBE(2) as the enum values for dis1 and
+ dis2 to make the code easier to follow.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
- ## xdiff/xdiff.h ##
-@@
- extern "C" {
- #endif /* #ifdef __cplusplus */
-
-+#define NO 0
-+#define YES 1
-+#define MAYBE 2
-+
- /* xpparm_t.flags */
- #define XDF_NEED_MINIMAL (1 << 0)
-
-
## xdiff/xdiffi.c ##
@@ xdiff/xdiffi.c: int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
-+ xdf2->rchg[xdf2->rindex[off2]] = YES;
++ xdf2->rchg[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
-+ xdf1->rchg[xdf1->rindex[off1]] = YES;
++ xdf1->rchg[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
-@@ xdiff/xdiffi.c: struct xdlgroup {
- static void group_init(xdfile_t *xdf, struct xdlgroup *g)
- {
- g->start = g->end = 0;
-- while (xdf->rchg[g->end])
-+ while (xdf->rchg[g->end] == YES)
- g->end++;
- }
-
-@@ xdiff/xdiffi.c: static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
- return -1;
-
- g->start = g->end + 1;
-- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
-+ for (g->end = g->start; xdf->rchg[g->end] == YES; g->end++)
- ;
-
- return 0;
-@@ xdiff/xdiffi.c: static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
- return -1;
-
- g->end = g->start - 1;
-- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
-+ for (g->start = g->end; xdf->rchg[g->start - 1] == YES; g->start--)
- ;
-
- return 0;
@@ xdiff/xdiffi.c: static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
-+ xdf->rchg[g->start++] = NO;
-+ xdf->rchg[g->end++] = YES;
++ xdf->rchg[g->start++] = false;
++ xdf->rchg[g->end++] = true;
-- while (xdf->rchg[g->end])
-+ while (xdf->rchg[g->end] == YES)
+ while (xdf->rchg[g->end])
g->end++;
-
- return 0;
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
-+ xdf->rchg[--g->start] = YES;
-+ xdf->rchg[--g->end] = NO;
++ xdf->rchg[--g->start] = true;
++ xdf->rchg[--g->end] = false;
-- while (xdf->rchg[g->start - 1])
-+ while (xdf->rchg[g->start - 1] == YES)
+ while (xdf->rchg[g->start - 1])
g->start--;
-
- return 0;
## xdiff/xhistogram.c ##
@@ xdiff/xhistogram.c: redo:
@@ xdiff/xhistogram.c: redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = YES;
++ env->xdf2.rchg[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = YES;
++ env->xdf1.rchg[line1++ - 1] = true;
return 0;
}
@@ xdiff/xhistogram.c: redo:
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = YES;
++ env->xdf1.rchg[line1++ - 1] = true;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = YES;
++ env->xdf2.rchg[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = YES;
++ env->xdf2.rchg[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = YES;
++ env->xdf1.rchg[line1++ - 1] = true;
return 0;
}
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = YES;
++ env->xdf1.rchg[line1++ - 1] = true;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = YES;
++ env->xdf2.rchg[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
## xdiff/xprepare.c ##
+@@
+ #define XDL_GUESS_NLINES1 256
+ #define XDL_GUESS_NLINES2 20
+
++#define NO 0
++#define YES 1
++#define MAYBE 2
+
+ typedef struct s_xdlclass {
+ struct s_xdlclass *next;
@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* current line (i) is already a multimatch line.
*/
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
-+ xdf1->rchg[i] = YES;
++ xdf1->rchg[i] = true;
}
xdf1->nreff = nreff;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
-+ xdf2->rchg[i] = YES;
++ xdf2->rchg[i] = true;
}
xdf2->nreff = nreff;
-: ---------- > 12: 034a4a7b2a xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t
--
gitgitgadget
* [PATCH v4 01/12] xdiff: delete static forward declarations in xprepare
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 02/12] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (12 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..249bfa678f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH v4 02/12] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
` (11 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.
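The single-cleanup-path idea can be sketched outside of xdiff as follows.
This is only an illustration under stated assumptions: struct ctx,
ctx_free() and ctx_prepare() are hypothetical stand-ins rather than xdiff
names, plain malloc/calloc/free replace the XDL_* macros, and the
fail_second parameter exists solely to simulate an allocation failure.

```c
#include <stdlib.h>

/* Hypothetical stand-in for xdfile_t; only the cleanup pattern matters. */
struct ctx {
	long *rindex;
	char *rchg;
};

/* The one and only place that knows how to release a ctx. */
static void ctx_free(struct ctx *c)
{
	/* free(NULL) is a no-op, so a partially initialized ctx is fine. */
	free(c->rindex);
	free(c->rchg);
	c->rindex = NULL;
	c->rchg = NULL;
}

/*
 * Returns 0 on success, -1 on failure. fail_second is an assumption of
 * this sketch: it simulates the second allocation failing.
 */
static int ctx_prepare(struct ctx *c, size_t n, int fail_second)
{
	/* Initialize the fields directly so the abort path can free them. */
	c->rindex = NULL;
	c->rchg = NULL;

	if (!(c->rindex = malloc(n * sizeof(*c->rindex))))
		goto abort;
	if (fail_second || !(c->rchg = calloc(n, sizeof(*c->rchg))))
		goto abort;
	return 0;

abort:
	ctx_free(c);
	return -1;
}
```

Because the struct fields are set to NULL before the first allocation, the
abort path can call the regular free function no matter how far the
initialization got.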
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 78 +++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 249bfa678f..96134c9fbf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,81 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH v4 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 02/12] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
` (10 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 96134c9fbf..3576415c85 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,17 +125,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -155,7 +149,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -163,10 +156,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -180,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH v4 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (2 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 05/12] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
` (9 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When xrecord_t was a linked list, and recs didn't exist, I assume this
function walked the list until it found the right record. Accessing
a contiguous array is so trivial that this function is now superfluous.
Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xemit.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..40fc8154f3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,23 +22,14 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+ char const *rec = xdf->recs[ri]->ptr;
+ size = xdf->recs[ri]->size;
+ if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
return -1;
- }
return 0;
}
@@ -120,8 +111,8 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
if (!xecfg->find_func)
return def_ff(rec, len, buf, sz);
return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
@@ -160,8 +151,8 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
while (len > 0 && XDL_ISSPACE(*rec)) {
rec++;
--
gitgitgadget
* [PATCH v4 05/12] xdiff: delete superfluous local variables that alias fields in xrecord_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (3 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 06/12] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
` (8 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Use a single xrecord_t pointer as the local variable in the functions
in xdiff/xemit.c, instead of separate locals that alias its ptr and
size fields.
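As a rough illustration of the pattern, a blank-line check in the style of
is_empty_rec() can read both fields through one record pointer. The names
struct record and record_is_blank() are hypothetical, and isspace() from
<ctype.h> stands in for XDL_ISSPACE; this is a sketch, not the code in
xemit.c.

```c
#include <ctype.h>

/* Hypothetical, reduced version of xrecord_t. */
struct record {
	char const *ptr;
	long size;
};

/* Returns 1 when the record is empty or holds only whitespace. */
static int record_is_blank(const struct record *rec)
{
	long i = 0;

	/* Index into rec->ptr instead of advancing copies of ptr and size. */
	while (i < rec->size && isspace((unsigned char)rec->ptr[i]))
		i++;

	return i == rec->size;
}
```

The record itself is never modified, and no local copy of ptr or size can
drift out of sync with it.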
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xemit.c | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 40fc8154f3..2161ac3cd0 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -23,12 +23,11 @@
#include "xinclude.h"
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec = xdf->recs[ri]->ptr;
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
- size = xdf->recs[ri]->size;
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
return 0;
@@ -111,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -151,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH v4 06/12] xdiff: delete struct diffdata_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (4 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 05/12] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 07/12] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
` (7 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH v4 07/12] xdiff: delete redundant array xdfile_t.ha
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (5 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 06/12] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
` (6 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
Deleting ha and reading the hash through rindex instead makes the code
about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
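The redundancy described above can be checked with a small sketch. The
structs below are hypothetical, reduced versions of xrecord_t and xdfile_t
(recs is still an array of pointers at this point in the series), and
ha_is_redundant() is an illustrative helper that does not exist in xdiff.

```c
/* Hypothetical, reduced versions of the xdiff structs. */
struct rec {
	unsigned long ha;
};

struct file {
	struct rec **recs; /* all records */
	long *rindex;      /* indices of the records kept after cleanup */
	long nreff;        /* number of valid entries in rindex */
};

/* Same lookup as get_hash() in this patch: one extra indirection. */
static unsigned long get_hash(const struct file *f, long i)
{
	return f->recs[f->rindex[i]]->ha;
}

/* Returns 1 when a separately stored ha[] adds no information. */
static int ha_is_redundant(const struct file *f, const unsigned long *ha)
{
	for (long i = 0; i < f->nreff; i++)
		if (ha[i] != get_hash(f, i))
			return 0;
	return 1;
}
```

The extra indirection through rindex and recs is the likely source of the
slowdown mentioned above; the sketch only shows that the two
representations carry the same hashes.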
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 3576415c85..22c44f0683 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -133,7 +133,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -146,7 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -181,8 +179,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -300,9 +296,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -312,9 +306,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH v4 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (6 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 07/12] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 09/12] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
` (5 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The fields from xdlclass_t are aliases of xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
xdlclass_t carries a copy of the data in xrecord_t, but instead of
embedding xrecord_t it duplicates the individual fields. A future
commit will change the types used in xrecord_t, so embed it in
xdlclass_t first; that way we don't have to remember to change the
types here as well.
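A minimal sketch of why embedding helps, using hypothetical reduced
structs (struct record, struct klass) and an illustrative helper
klass_set_record() that is not part of xdiff:

```c
/* Hypothetical, reduced versions of xrecord_t and xdlclass_t. */
struct record {
	char const *ptr;
	long size;
	unsigned long ha;
};

struct klass {
	struct klass *next;
	struct record rec; /* embedded copy instead of three duplicated fields */
	long idx;
};

/*
 * Struct assignment copies every member of struct record, so this
 * function needs no change when a field's type changes or a field is
 * added.
 */
static void klass_set_record(struct klass *k, const struct record *r)
{
	k->rec = *r;
}
```

With the duplicated-fields layout, every type change in the record would
have to be mirrored by hand in the class struct and in each field-by-field
copy.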
Best-viewed-with: --color-words
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 22c44f0683..e6e2c0e1c0 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec = *rec;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH v4 09/12] xdiff: delete chastore from xdfile_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (7 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 10/12] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
` (4 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdfile_t currently uses chastore_t, which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added, I think removing the linked-list
pointer xrecord_t.next was overlooked. This dual data structure setup
makes the code somewhat confusing.
Additionally, the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
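To make the "realloc to grow an array" alternative concrete, here is a minimal sketch that stores records by value in one contiguous array. The names struct rec_array, rec_array_push() and rec_array_free() are hypothetical and exist only for this example; the patch itself uses the existing XDL_ALLOC_GROW macro, and the doubling policy below is an assumption, not xdiff's policy.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct s_xrecord {
	char const *ptr;
	long size;
	unsigned long ha;
} xrecord_t;

/* Hypothetical container: records stored by value in one growable array. */
struct rec_array {
	xrecord_t *recs;
	size_t nrec;	/* records in use */
	size_t alloc;	/* records allocated */
};

/*
 * Reserve the next slot and return a pointer to it, or NULL if the
 * allocation fails. The pointer is only valid until the next push,
 * because realloc() may move the whole array. Overflow of the size
 * computation is not checked in this sketch.
 */
static xrecord_t *rec_array_push(struct rec_array *a)
{
	assert(a != NULL);
	if (a->nrec == a->alloc) {
		size_t n = a->alloc ? a->alloc * 2 : 16;
		xrecord_t *p = realloc(a->recs, n * sizeof(*p));
		if (!p)
			return NULL;
		a->recs = p;
		a->alloc = n;
	}
	return &a->recs[a->nrec++];
}

static void rec_array_free(struct rec_array *a)
{
	free(a->recs);
	a->recs = NULL;
	a->nrec = a->alloc = 0;
}
```

Because a slot pointer can be invalidated by a later push, callers should copy a record rather than keep a pointer into the array while it is still growing.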
Performance impact of deleting fields from xdfile_t (benchmarks below):
Deleting ha alone is about 5% slower.
Deleting the chastore (rcha) alone is about 5% faster.
Deleting both is about 1% slower.
Delete ha, but keep cha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete cha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 2161ac3cd0..b2f1f30cd3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -25,7 +25,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e6e2c0e1c0..27c5a4d636 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -128,7 +128,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -143,8 +142,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -155,12 +152,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -260,7 +255,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -273,7 +268,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -281,7 +276,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -317,13 +312,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -331,7 +326,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH v4 10/12] xdiff: delete rchg aliasing
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 09/12] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 11/12] xdiff: use bool literals for xdfile_t.rchg Ezekiel Newren via GitGitGadget
` (3 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index a66125d44a..83c4cff6f7 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
--
gitgitgadget
* [PATCH v4 11/12] xdiff: use bool literals for xdfile_t.rchg
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 10/12] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 12/12] xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Use the bool literals true and false when assigning to xdfile_t.rchg.
Also define the macros NO (0), YES (1), and MAYBE (2) for the values
stored in dis1 and dis2 to make the code easier to follow.
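For readers following along, the three values can be read as the result of a small classification step. The sketch below restates the ternary used in xdl_cleanup_records() as a standalone function; classify_line() is a hypothetical name introduced only for this example and does not exist in xdiff.

```c
#include <assert.h>
#include <stdbool.h>

#define NO 0
#define YES 1
#define MAYBE 2

/*
 * Hypothetical helper mirroring the ternary in xdl_cleanup_records():
 *   nm       - how many times this line occurs in the other file
 *   mlim     - occurrence limit at which a line counts as a multimatch
 *   need_min - true when a minimal diff was requested
 */
static int classify_line(long nm, long mlim, bool need_min)
{
	assert(mlim > 0);
	if (nm == 0)
		return NO;	/* no counterpart in the other file */
	if (nm >= mlim && !need_min)
		return MAYBE;	/* multimatch, decided later by its neighbours */
	return YES;		/* ordinary match, keep the line */
}
```

In the patch, lines classified YES go into rindex, lines classified NO are marked in rchg, and MAYBE lines are passed to xdl_clean_mmatch() to decide between the two.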
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 12 ++++++------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 27 +++++++++++++++------------
4 files changed, 29 insertions(+), 26 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 83c4cff6f7..6213ce7a03 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -753,8 +753,8 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->rchg[g->start++] = false;
+ xdf->rchg[g->end++] = true;
while (xdf->rchg[g->end])
g->end++;
@@ -774,8 +774,8 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->rchg[--g->start] = true;
+ xdf->rchg[--g->end] = false;
while (xdf->rchg[g->start - 1])
g->start--;
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..ad88406656 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = true;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = true;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..042e889348 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = true;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.rchg[line1++ - 1] = true;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.rchg[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 27c5a4d636..f152e3acd8 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -29,6 +29,9 @@
#define XDL_GUESS_NLINES1 256
#define XDL_GUESS_NLINES2 20
+#define NO 0
+#define YES 1
+#define MAYBE 2
typedef struct s_xdlclass {
struct s_xdlclass *next;
@@ -212,9 +215,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
* current line (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (dis[i - r] == NO)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (dis[i - r] == MAYBE)
rpdis0++;
else
break;
@@ -228,9 +231,9 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (dis[i + r] == NO)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (dis[i + r] == MAYBE)
rpdis1++;
else
break;
@@ -270,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -278,26 +281,26 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
}
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (dis1[i] == YES ||
+ (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->rchg[i] = true;
}
xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (dis2[i] == YES ||
+ (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->rchg[i] = true;
}
xdf2->nreff = nreff;
--
gitgitgadget
* [PATCH v4 12/12] xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (10 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 11/12] xdiff: use bool literals for xdfile_t.rchg Ezekiel Newren via GitGitGadget
@ 2025-09-22 19:51 ` Ezekiel Newren via GitGitGadget
2025-09-22 22:39 ` [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff Junio C Hamano
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-22 19:51 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 28 ++++++++++++++--------------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 12 ++++++------
xdiff/xtypes.h | 2 +-
xdiff/xutils.c | 4 ++--
6 files changed, 31 insertions(+), 31 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 6213ce7a03..b902be9d0e 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = true;
+ xdf2->changed[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = true;
+ xdf1->changed[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->changed[g->end]; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->changed[g->start - 1]; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = false;
- xdf->rchg[g->end++] = true;
+ xdf->changed[g->start++] = false;
+ xdf->changed[g->end++] = true;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = true;
- xdf->rchg[--g->end] = false;
+ xdf->changed[--g->start] = true;
+ xdf->changed[--g->end] = false;
- while (xdf->rchg[g->start - 1])
+ while (xdf->changed[g->start - 1])
g->start--;
return 0;
@@ -938,9 +938,9 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
- for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
- for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
+ if (xe->xdf1.changed[i1 - 1] || xe->xdf2.changed[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.changed[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.changed[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index ad88406656..6dc450b1fe 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = true;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = true;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = true;
+ env->xdf1.changed[line1++ - 1] = true;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = true;
+ env->xdf2.changed[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 042e889348..669b653580 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = true;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = true;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = true;
+ env->xdf1.changed[line1++ - 1] = true;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = true;
+ env->xdf2.changed[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index f152e3acd8..009556f7c2 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -129,7 +129,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->changed - 1);
xdl_free(xdf->recs);
}
@@ -142,7 +142,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xrecord_t *crec;
xdf->rindex = NULL;
- xdf->rchg = NULL;
+ xdf->changed = NULL;
xdf->recs = NULL;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
@@ -164,7 +164,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
}
}
- if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->changed, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
@@ -173,7 +173,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
goto abort;
}
- xdf->rchg += 1;
+ xdf->changed += 1;
xdf->nreff = 0;
xdf->dstart = 0;
xdf->dend = xdf->nrec - 1;
@@ -290,7 +290,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = true;
+ xdf1->changed[i] = true;
}
xdf1->nreff = nreff;
@@ -300,7 +300,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = true;
+ xdf2->changed[i] = true;
}
xdf2->nreff = nreff;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 3d26cbf1ec..f145abba3e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,7 @@ typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
- char *rchg;
+ bool *changed;
long *rindex;
long nreff;
} xdfile_t;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 332982b509..ed65c222e6 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -425,8 +425,8 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
- memcpy(diff_env->xdf1.rchg + line1 - 1, env.xdf1.rchg, count1);
- memcpy(diff_env->xdf2.rchg + line2 - 1, env.xdf2.rchg, count2);
+ memcpy(diff_env->xdf1.changed + line1 - 1, env.xdf1.changed, count1);
+ memcpy(diff_env->xdf2.changed + line2 - 1, env.xdf2.changed, count2);
xdl_free_env(&env);
--
gitgitgadget
^ permalink raw reply related [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (11 preceding siblings ...)
2025-09-22 19:51 ` [PATCH v4 12/12] xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-22 22:39 ` Junio C Hamano
2025-09-23 0:13 ` Ezekiel Newren
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
13 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-22 22:39 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Changes since v3.
>
> * Address review feedback.
> * Split the deletion of xdl_get_rec() into 2 commits.
> * Move NO, YES, MAYBE into xprepare.c, and use bool literals.
The elements of this array are of type "char"; it makes me feel a bit
awkward that the code
* assigns bool "true" or "false" to "char"
* expects that reading it in another compilation unit yields 0 or 1
Instead of abusing boolean true/false, I'd rather see the code that
assigns 0 or 1 to use 0 or 1 as literals. As the array got a much
better name .changed[], anybody would understand that
env->xdf.changed[line] = 0;
env->xdf.changed[line] = 1;
mean what they mean.
> * refactor 'char rchg' to 'bool changed'
Hmph, I am not sure if it is a good idea to pretend that this
changed[] array that is more than bool to be a mere bool. An object
declared as type _Bool is guaranteed to be only large enough to
store the values 0 and 1. Granted that you cannot allocate less
than a single byte or make an array of bits in modern architectures,
an array of _Bool would likely be byte addressed and if you assign 2
and read it back, you may get 2 back in practice, but I'd rather not
see such strange code live in this codebase.
How about
- rename rchg[] to changed[], which is a very good move;
- optionally make it unsigned char, not char;
- the user of changed[] that uses only 0 or 1 and is not even aware
of that MAYBE thing use 0 or 1;
- the user of changed[] that has to be aware of that MAYBE state
use its own NO/YES/MAYBE for readability.
Hmm?
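A side note on the round-trip concern above, as a minimal sketch that
assumes a C99 compiler with <stdbool.h>. The helper names are
hypothetical and are not part of xdiff. Under the C99 conversion rule,
a plain assignment to a _Bool stores 1 for any nonzero value, so a 2
could only survive if it were written through some other path (for
example memcpy() or a char pointer), whereas an unsigned char keeps it:

```c
#include <assert.h>
#include <stdbool.h>

/* Assign an int to a bool and read it back: any nonzero value becomes 1. */
int roundtrip_through_bool(int value)
{
	bool flag = value;	/* C99 conversion to _Bool: 0 stays 0, nonzero -> 1 */
	return flag;
}

/* The same round trip through unsigned char preserves small values. */
int roundtrip_through_uchar(int value)
{
	unsigned char cell = (unsigned char)value;
	return cell;
}

/* Hypothetical self-check illustrating the difference. */
void check_roundtrips(void)
{
	assert(roundtrip_through_bool(2) == 1);
	assert(roundtrip_through_bool(0) == 0);
	assert(roundtrip_through_uchar(2) == 2);
}
```

This is why a three-state value such as MAYBE(2) needs a non-bool
element type, while a field that only ever holds 0 or 1 can be a bool.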
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-22 22:39 ` [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff Junio C Hamano
@ 2025-09-23 0:13 ` Ezekiel Newren
2025-09-23 1:06 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-23 0:13 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble, Jeff King
On Mon, Sep 22, 2025 at 4:39 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Changes since v3.
> >
> > * Address review feedback.
> > * Split the deletion of xdl_get_rec() into 2 commits.
>
>
> > * Move NO, YES, MAYBE into xprepare.c, and use bool literals.
>
> The elements of this array is of type "char"; it makes me feel a bit
> awkward that the code
>
> * assigns bool "true" or "false" to "char"
>
> * expects that reading it in another compilation unit yields 0 or 1
>
> Instead of abusing boolean true/false, I'd rather see the code that
> assigns 0 or 1 to use 0 or 1 as literals. As the array got a much
> better name .changed[], anybody would understand that
>
> env->xdf.changed[line] = 0;
> env->xdf.changed[line] = 1;
>
> mean what they mean.
>
> > * refactor 'char rchg' to 'bool changed'
>
> Hmph, I am not sure if it is a good idea to pretend that this
> changed[] array that is more than bool to be a mere bool. An object
> declared as type _Bool is guaranteed to be only large enough to
> store the values 0 and 1. Granted that you cannot allocate less
> than a single bite or make an array of bits in modern architectures,
> an array of _Bool would likely be byte addressed and if you assign 2
> and read it back, you may get 2 back in practice, but I'd rather not
> to see such a strange code to live in this codebase.
>
> How about
>
> - rename rchg[] to changed[], which is a very good move;
>
> - optionally make it unsigned char, not char;
>
> - the user of changed[] that uses only 0 or 1 and is not even aware
> of that MAYBE thing use 0 or 1;
>
> - the user of changed[] that has to be aware of that MAYBE state
> use its own NO/YES/MAYBE for readability.
>
> Hmm?
'changed' is NEVER EVER!!! assigned anything other than 0 or 1 which
strictly makes it a bool. It's easy to mistake that because the
functions in xprepare.c that deal with NO, YES, and MAYBE are within a
few lines of 'changed'. Please re-read xdl_cleanup_records() and
xdl_clean_mmatch() very carefully. I will update my commit message to
make this more clear.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-23 0:13 ` Ezekiel Newren
@ 2025-09-23 1:06 ` Junio C Hamano
2025-09-23 1:30 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-23 1:06 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
>> How about
>>
>> - rename rchg[] to changed[], which is a very good move;
>>
>> - optionally make it unsigned char, not char;
>>
>> - the user of changed[] that uses only 0 or 1 and is not even aware
>> of that MAYBE thing use 0 or 1;
>>
>> - the user of changed[] that has to be aware of that MAYBE state
>> use its own NO/YES/MAYBE for readability.
>>
>> Hmm?
>
> 'changed' is NEVER EVER!!! assigned anything other than 0 or 1 which
> strictly makes it a bool. It's easy to mistake that because the
> functions in xprepare.c that deal with NO, YES, and MAYBE are within a
> few lines of 'changed'. Please re-read xdl_cleanup_records() and
> xdl_clean_mmatch() very carefully. I will update my commit message to
> make this more clear.
OK, then there is a variable with some type that is _not_ bool that
is used in xprepare.c and the code that deals with MAYBE does
something like
u8 current_state = MAYBE;
if (the .changed[line] is NOT valid)
current_state = MAYBE;
else if (env->xdf.changed[line])
current_state = YES;
else /* false */
current_state = NO;
and then use current_state as a three-way variable, perhaps like
switch (current_state) {
case YES:
do the yes thing;
break;
case NO:
do the no thing;
break;
case MAYBE:
do the maybe thing;
break;
}
?
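The outline above could be made compilable roughly as follows. This is
only a sketch under stated assumptions: the enum, the function names,
and the is_valid parameter (standing in for "the .changed[line] is NOT
valid") are hypothetical and do not exist in xdiff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-way state, kept separate from the bool array. */
enum line_state { NO = 0, YES = 1, MAYBE = 2 };

/*
 * Derive the three-way state from a bool array plus a validity flag.
 * is_valid is an assumption standing in for whatever check decides
 * that changed[line] can be trusted.
 */
enum line_state classify_line(const bool *changed, size_t line, bool is_valid)
{
	if (!is_valid)
		return MAYBE;
	return changed[line] ? YES : NO;
}

/* Use the state as a three-way variable. */
const char *describe_state(enum line_state state)
{
	switch (state) {
	case YES:
		return "changed";
	case NO:
		return "unchanged";
	case MAYBE:
		return "undecided";
	}
	return "invalid";
}

/* Hypothetical self-check: the bool array itself only ever holds 0 or 1. */
void check_classify_line(void)
{
	bool changed[2] = { false, true };

	assert(classify_line(changed, 0, true) == NO);
	assert(classify_line(changed, 1, true) == YES);
	assert(classify_line(changed, 1, false) == MAYBE);
}
```

The point of the split is that the value 2 lives only in the local
state variable and is never stored into the bool array.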
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-23 1:06 ` Junio C Hamano
@ 2025-09-23 1:30 ` Ezekiel Newren
2025-09-23 14:12 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-23 1:30 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble, Jeff King
On Mon, Sep 22, 2025 at 7:06 PM Junio C Hamano <gitster@pobox.com> wrote:
> > 'changed' is NEVER EVER!!! assigned anything other than 0 or 1 which
> > strictly makes it a bool. It's easy to mistake that because the
> > functions in xprepare.c that deal with NO, YES, and MAYBE are within a
> > few lines of 'changed'. Please re-read xdl_cleanup_records() and
> > xdl_clean_mmatch() very carefully. I will update my commit message to
> > make this more clear.
>
> OK, then there is a variable with some type that is _not_ bool that
> is used in xprepare.c and the code that deal with MAYBE does
> something like
>
> u8 current_state = MAYBE;
>
> if (the .changed[line] is NOT valid)
> current_state = MAYBE;
> else if (env->xdf.changed[line])
> current_state = YES;
> else /* false */
> current_state = NO;
>
> and then use current_state as a three-way variable, perhaps like
>
> switch (current_state) {
> case YES:
> do the yes thing;
> break;
> case NO:
> do the no thing;
> break;
> case MAYBE:
> do the maybe thing;
> break;
> }
I apologize for my previous phrasing. I was not very tactful. Yes, I
think your suggestion is a good idea. I'll incorporate that into my
patches.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-23 1:30 ` Ezekiel Newren
@ 2025-09-23 14:12 ` Junio C Hamano
2025-09-23 16:50 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-23 14:12 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> I apologize for my previous phrasing. I was not very tactful. Yes, I
> think your suggestion is a good idea. I'll incorporate that into my
> patches.
I didn't get the impression that you were untactful at all. If
the arrangement is like what I outlined in the message you are
responding to, I am perfectly fine if the type of changed[] is an
array of bool. The only thing I found disturbing was the idea
of assigning 2 to a _Bool. Comparing a _Bool, which can be either 0
or 1, and finding that it is different from literal 2 (or MAYBE that
is defined to be 2) is perfectly fine.
Thanks.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-23 14:12 ` Junio C Hamano
@ 2025-09-23 16:50 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-23 16:50 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Phillip Wood,
Ben Knoble, Jeff King
On Tue, Sep 23, 2025 at 8:12 AM Junio C Hamano <gitster@pobox.com> wrote:
> Ezekiel Newren <ezekielnewren@gmail.com> writes:
>
> > I apologize for my previous phrasing. I was not very tactful. Yes, I
> > think your suggestion is a good idea. I'll incorporate that into my
> > patches.
>
> I didn't get an impression that you were _not_ tactful at all. If
> the arrangement is like what I outlined in the message you are
> responding to, I am perfectly fine if the type of changed[] is an
> array of bool. The only thing I found was disturbing was the idea
> to assign 2 into a _Bool. Comparing a _Bool, which can be either 0
> oor 1, and find it is different from litral 2 (or MAYBE that is
> defined to be 2) is perfectly fine.
I'm going to reroll this to make it much easier to see that the enum
macros are separate from rchg/changed. After carefully re-reading
those 2 functions, I think the macros NONE(0), SOME(1), TOO_MANY(2)
make more sense. I think what dis1 and dis2 (which I'll rename to
matches1, matches2) were doing is setting a state value based on the
question "How many times does this line in file 1 show up in file 2,
and vice versa". My guess is that if the line in file 1 doesn't show
up in file 2 then it's obviously different. But if the number of
matches reaches some threshold and a minimal diff isn't requested,
then it's TOO_MANY; otherwise it's set to SOME. So I think dis1, dis2
are meant as "Here is how we deal with the number of matches found in
the other file."
Is this explanation congruent with the classic diff (myers/minimal)?
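The classification described above can be sketched in isolation. The
ternary mirrors the expression the series rewrites in
xdl_cleanup_records(); the function name classify_matches and its
parameter list are hypothetical, chosen only for illustration (nm is
the match count in the other file, mlim the threshold, need_min
whether a minimal diff was requested).

```c
#include <assert.h>
#include <stdbool.h>

#define NONE 0
#define SOME 1
#define TOO_MANY 2

/*
 * Hypothetical standalone version of the per-line classification:
 * no match in the other file -> NONE, at or above the threshold and
 * no minimal diff requested -> TOO_MANY, anything else -> SOME.
 */
unsigned char classify_matches(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return NONE;
	if (nm >= mlim && !need_min)
		return TOO_MANY;
	return SOME;
}

/* Hypothetical self-check covering each branch. */
void check_classify_matches(void)
{
	assert(classify_matches(0, 10, false) == NONE);
	assert(classify_matches(3, 10, false) == SOME);
	assert(classify_matches(10, 10, false) == TOO_MANY);
	assert(classify_matches(10, 10, true) == SOME);
}
```

Note that the result is a three-valued count category for a temporary
array, not a value that is ever stored in xdfile_t.changed.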
^ permalink raw reply [flat|nested] 158+ messages in thread
* [PATCH v5 00/13] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-22 19:51 ` [PATCH v4 00/12] " Ezekiel Newren via GitGitGadget
` (12 preceding siblings ...)
2025-09-22 22:39 ` [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff Junio C Hamano
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 01/13] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (13 more replies)
13 siblings, 14 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
Changes since v4.
* Make it clear that the field xdfile_t.rchg (now 'xdfile_t.changed') is
distinct from the local variables dis1, dis2 (now 'matches1',
'matches2').
* Use NONE, SOME, TOO_MANY instead of NO, YES, MAYBE.
* Use bool literals for xdfile_t.changed.
Changes since v3.
* Address review feedback.
* Split the deletion of xdl_get_rec() into 2 commits.
* Move NO, YES, MAYBE into xprepare.c, and use bool literals.
* refactor 'char rchg' to 'bool changed'
Changes since v2.
* No patch changes, just resending to get patch 9 to show up on the mailing
list.
* A few tweaks to the cover letter.
Changes since v1, to address review feedback.
* Only include the clean up patches; The remaining patches will be split
into a separate series.
* Commit message clarifications.
* Minor style cleanups.
* Performance impacts included in commit message of patch 8.
Relevant part of the original cover letter follows:
===================================================
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
bool *changed;
long *rindex;
long nreff;
} xdfile_t;
===
Ezekiel Newren (13):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete superfluous function xdl_get_rec() in xemit
xdiff: delete superfluous local variables that alias fields in
xrecord_t
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t
xdiff: delete rchg aliasing
xdiff: rename rchg -> changed in xdfile_t
xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
xdiff: change type of xdfile_t.changed from char to bool
xdiff/xdiffi.c | 101 ++++++--------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 38 ++----
xdiff/xhistogram.c | 10 +-
xdiff/xmerge.c | 56 ++++----
xdiff/xpatience.c | 18 +--
xdiff/xprepare.c | 330 ++++++++++++++++++++-------------------------
xdiff/xtypes.h | 9 +-
xdiff/xutils.c | 16 +--
9 files changed, 259 insertions(+), 330 deletions(-)
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v5
Pull-Request: https://github.com/git/git/pull/2048
Range-diff vs v4:
1: 79d1099656 = 1: 890e508000 xdiff: delete static forward declarations in xprepare
2: 9142f28fcd = 2: 0cfd75b1ff xdiff: delete local variables and initialize/free xdfile_t directly
3: 13f00f5683 = 3: 92c81d2ff6 xdiff: delete unnecessary fields from xrecord_t and xdfile_t
4: 311279c123 = 4: 7d3a7e617c xdiff: delete superfluous function xdl_get_rec() in xemit
5: d84658ac83 = 5: 1d550cf308 xdiff: delete superfluous local variables that alias fields in xrecord_t
6: bf16453846 = 6: 2a3a1b657e xdiff: delete struct diffdata_t
7: 4ef7f243e9 = 7: 4c6543cbe3 xdiff: delete redundant array xdfile_t.ha
8: 3b6c2127c4 = 8: 21bf4b5a20 xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
9: f7b5021e48 = 9: ef6ae7d29c xdiff: delete chastore from xdfile_t
10: 97135495e2 = 10: 7b0856108a xdiff: delete rchg aliasing
12: 034a4a7b2a ! 11: 570ab9f898 xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t
+ xdiff: rename rchg -> changed in xdfile_t
+ Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xdiffi.c ##
@@ xdiff/xdiffi.c: int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
-- xdf2->rchg[xdf2->rindex[off2]] = true;
-+ xdf2->changed[xdf2->rindex[off2]] = true;
+- xdf2->rchg[xdf2->rindex[off2]] = 1;
++ xdf2->changed[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
-- xdf1->rchg[xdf1->rindex[off1]] = true;
-+ xdf1->changed[xdf1->rindex[off1]] = true;
+- xdf1->rchg[xdf1->rindex[off1]] = 1;
++ xdf1->changed[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ xdiff/xdiffi.c: static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
-- xdf->rchg[g->start++] = false;
-- xdf->rchg[g->end++] = true;
-+ xdf->changed[g->start++] = false;
-+ xdf->changed[g->end++] = true;
+- xdf->rchg[g->start++] = 0;
+- xdf->rchg[g->end++] = 1;
++ xdf->changed[g->start++] = 0;
++ xdf->changed[g->end++] = 1;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
-- xdf->rchg[--g->start] = true;
-- xdf->rchg[--g->end] = false;
-+ xdf->changed[--g->start] = true;
-+ xdf->changed[--g->end] = false;
+- xdf->rchg[--g->start] = 1;
+- xdf->rchg[--g->end] = 0;
++ xdf->changed[--g->start] = 1;
++ xdf->changed[--g->end] = 0;
- while (xdf->rchg[g->start - 1])
+ while (xdf->changed[g->start - 1])
@@ xdiff/xhistogram.c: redo:
if (!count1) {
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = true;
-+ env->xdf2.changed[line2++ - 1] = true;
+- env->xdf2.rchg[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = true;
-+ env->xdf1.changed[line1++ - 1] = true;
+- env->xdf1.rchg[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ xdiff/xhistogram.c: redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
-- env->xdf1.rchg[line1++ - 1] = true;
-+ env->xdf1.changed[line1++ - 1] = true;
+- env->xdf1.rchg[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = 1;
while (count2--)
-- env->xdf2.rchg[line2++ - 1] = true;
-+ env->xdf2.changed[line2++ - 1] = true;
+- env->xdf2.rchg[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = 1;
result = 0;
} else {
result = histogram_diff(xpp, env,
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = true;
-+ env->xdf2.changed[line2++ - 1] = true;
+- env->xdf2.rchg[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = true;
-+ env->xdf1.changed[line1++ - 1] = true;
+- env->xdf1.rchg[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = true;
-+ env->xdf1.changed[line1++ - 1] = true;
+- env->xdf1.rchg[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = 1;
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = true;
-+ env->xdf2.changed[line2++ - 1] = true;
+- env->xdf2.rchg[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = 1;
xdl_free(map.entries);
return 0;
}
@@ xdiff/xprepare.c: static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, lo
xdf->dstart = 0;
xdf->dend = xdf->nrec - 1;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
- (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
-- xdf1->rchg[i] = true;
-+ xdf1->changed[i] = true;
+- xdf1->rchg[i] = 1;
++ xdf1->changed[i] = 1;
}
xdf1->nreff = nreff;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
- (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
-- xdf2->rchg[i] = true;
-+ xdf2->changed[i] = true;
+- xdf2->rchg[i] = 1;
++ xdf2->changed[i] = 1;
}
xdf2->nreff = nreff;
@@ xdiff/xtypes.h: typedef struct s_xdfile {
long nrec;
long dstart, dend;
- char *rchg;
-+ bool *changed;
++ char *changed;
long *rindex;
long nreff;
} xdfile_t;
-: ---------- > 12: 08a0fceb72 xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
11: b544c15a67 ! 13: 975e845bfa xdiff: use bool literals for xdfile_t.rchg
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: use bool literals for xdfile_t.rchg
+ xdiff: change type of xdfile_t.changed from char to bool
- Define macros NO(0), YES(1), MAYBE(2) as the enum values for dis1 and
- dis2 to make the code easier to follow.
+ The only values possible for 'changed' are 1 and 0, which exactly map
+ to a bool type. It might not look like this is the case because
+ matches1 and matches2 (which used to be dis1 and dis2) were also char
+ and were assigned numerical values within a few lines of 'changed'
+ (what used to be rchg).
+
+ Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
+ for changed[k] makes it clear to future readers that these are
+ logically separate concepts.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
@@ xdiff/xdiffi.c: int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
-- xdf2->rchg[xdf2->rindex[off2]] = 1;
-+ xdf2->rchg[xdf2->rindex[off2]] = true;
+- xdf2->changed[xdf2->rindex[off2]] = 1;
++ xdf2->changed[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
-- xdf1->rchg[xdf1->rindex[off1]] = 1;
-+ xdf1->rchg[xdf1->rindex[off1]] = true;
+- xdf1->changed[xdf1->rindex[off1]] = 1;
++ xdf1->changed[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ xdiff/xdiffi.c: static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
-- xdf->rchg[g->start++] = 0;
-- xdf->rchg[g->end++] = 1;
-+ xdf->rchg[g->start++] = false;
-+ xdf->rchg[g->end++] = true;
+- xdf->changed[g->start++] = 0;
+- xdf->changed[g->end++] = 1;
++ xdf->changed[g->start++] = false;
++ xdf->changed[g->end++] = true;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
-- xdf->rchg[--g->start] = 1;
-- xdf->rchg[--g->end] = 0;
-+ xdf->rchg[--g->start] = true;
-+ xdf->rchg[--g->end] = false;
+- xdf->changed[--g->start] = 1;
+- xdf->changed[--g->end] = 0;
++ xdf->changed[--g->start] = true;
++ xdf->changed[--g->end] = false;
- while (xdf->rchg[g->start - 1])
+ while (xdf->changed[g->start - 1])
g->start--;
## xdiff/xhistogram.c ##
@@ xdiff/xhistogram.c: redo:
if (!count1) {
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = true;
+- env->xdf2.changed[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = true;
+- env->xdf1.changed[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ xdiff/xhistogram.c: redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
-- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = true;
+- env->xdf1.changed[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = true;
while (count2--)
-- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = true;
+- env->xdf2.changed[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = true;
+- env->xdf2.changed[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = true;
+- env->xdf1.changed[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ xdiff/xpatience.c: static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
-- env->xdf1.rchg[line1++ - 1] = 1;
-+ env->xdf1.rchg[line1++ - 1] = true;
+- env->xdf1.changed[line1++ - 1] = 1;
++ env->xdf1.changed[line1++ - 1] = true;
while(count2--)
-- env->xdf2.rchg[line2++ - 1] = 1;
-+ env->xdf2.rchg[line2++ - 1] = true;
+- env->xdf2.changed[line2++ - 1] = 1;
++ env->xdf2.changed[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
## xdiff/xprepare.c ##
-@@
- #define XDL_GUESS_NLINES1 256
- #define XDL_GUESS_NLINES2 20
-
-+#define NO 0
-+#define YES 1
-+#define MAYBE 2
-
- typedef struct s_xdlclass {
- struct s_xdlclass *next;
-@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
- * current line (i) is already a multimatch line.
- */
- for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
-- if (!dis[i - r])
-+ if (dis[i - r] == NO)
- rdis0++;
-- else if (dis[i - r] == 2)
-+ else if (dis[i - r] == MAYBE)
- rpdis0++;
- else
- break;
-@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
- if (rdis0 == 0)
- return 0;
- for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
-- if (!dis[i + r])
-+ if (dis[i + r] == NO)
- rdis1++;
-- else if (dis[i + r] == 2)
-+ else if (dis[i + r] == MAYBE)
- rpdis1++;
- else
- break;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
- for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[recs->ha];
- nm = rcrec ? rcrec->len2 : 0;
-- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
-+ dis1[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
- }
- if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
+ /*
+ * Create temporary arrays that will help us decide if
+- * changed[i] should remain 0 or become 1.
++ * changed[i] should remain false, or become true.
+ */
+ if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
+ status = -1;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
- for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[recs->ha];
- nm = rcrec ? rcrec->len1 : 0;
-- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
-+ dis2[i] = (nm == 0) ? NO: (nm >= mlim && !need_min) ? MAYBE: YES;
- }
+ /*
+ * Use temporary arrays to decide if changed[i] should remain
+- * 0 or become 1.
++ * false, or become true.
+ */
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
-- if (dis1[i] == 1 ||
-- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
-+ if (dis1[i] == YES ||
-+ (dis1[i] == MAYBE && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (matches1[i] == SOME ||
+ (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
+- /* changed[i] remains 0 */
++ /* changed[i] remains false */
} else
-- xdf1->rchg[i] = 1;
-+ xdf1->rchg[i] = true;
+- xdf1->changed[i] = 1;
++ xdf1->changed[i] = true;
}
xdf1->nreff = nreff;
- for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
- i <= xdf2->dend; i++, recs++) {
-- if (dis2[i] == 1 ||
-- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
-+ if (dis2[i] == YES ||
-+ (dis2[i] == MAYBE && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
+ if (matches2[i] == SOME ||
+ (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
+- /* changed[i] remains 0 */
++ /* changed[i] remains false */
} else
-- xdf2->rchg[i] = 1;
-+ xdf2->rchg[i] = true;
+- xdf2->changed[i] = 1;
++ xdf2->changed[i] = true;
}
xdf2->nreff = nreff;
+
+ ## xdiff/xtypes.h ##
+@@ xdiff/xtypes.h: typedef struct s_xdfile {
+ xrecord_t *recs;
+ long nrec;
+ long dstart, dend;
+- char *changed;
++ bool *changed;
+ long *rindex;
+ long nreff;
+ } xdfile_t;
--
gitgitgadget
* [PATCH v5 01/13] xdiff: delete static forward declarations in xprepare
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 02/13] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (12 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..249bfa678f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH v5 02/13] xdiff: delete local variables and initialize/free xdfile_t directly
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 01/13] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 03/13] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
` (11 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
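Illustrative note, not part of the patch: the pattern described above
(initialize the struct's own fields, and let every error path go through
the one existing free function) can be sketched in isolation roughly as
follows. The names struct ctx, ctx_prepare(), ctx_free() and the fail_at
parameter are hypothetical stand-ins invented for this sketch; they are
not xdiff identifiers, and the real xdl_prepare_ctx() does considerably
more work.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for xdfile_t; the fields are illustrative only. */
struct ctx {
	long *index;
	char *flags;
	long n;
};

/* The single place that knows how to release a struct ctx. */
static void ctx_free(struct ctx *c)
{
	free(c->index);		/* free(NULL) is a no-op */
	free(c->flags);
	c->index = NULL;
	c->flags = NULL;
}

/*
 * Fill in the struct directly instead of going through local copies.
 * fail_at is a hypothetical knob that simulates an allocation failure
 * at step 1 or 2 (0 means "never fail").
 */
static int ctx_prepare(struct ctx *c, long n, int fail_at)
{
	assert(n > 0);

	/* NULL every pointer first so ctx_free() is safe at any point. */
	c->index = NULL;
	c->flags = NULL;
	c->n = 0;

	if (fail_at == 1 || !(c->index = malloc(sizeof(*c->index) * (size_t)n)))
		goto abort;
	if (fail_at == 2 || !(c->flags = calloc((size_t)n, sizeof(*c->flags))))
		goto abort;

	c->n = n;
	return 0;

abort:
	ctx_free(c);	/* same code path as normal teardown */
	return -1;
}
```

Because the pointers are set to NULL before the first allocation, the
abort path can call the ordinary free function regardless of how far
initialization got.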
xdiff/xprepare.c | 78 +++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 249bfa678f..96134c9fbf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,81 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH v5 03/13] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 01/13] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 02/13] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
` (10 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, and xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 96134c9fbf..3576415c85 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,17 +125,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -155,7 +149,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -163,10 +156,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -180,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (2 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 03/13] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-30 13:31 ` Kristoffer Haugsbakk
2025-09-23 21:24 ` [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
` (9 subsequent siblings)
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When xrecord_t was a linked list, and recs didn't exist, I assume this
function walked the list until it found the right record. Accessing
a contiguous array is so trivial that this function is now superfluous.
Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xemit.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..40fc8154f3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,23 +22,14 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+ char const *rec = xdf->recs[ri]->ptr;
+ size = xdf->recs[ri]->size;
+ if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
return -1;
- }
return 0;
}
@@ -120,8 +111,8 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
if (!xecfg->find_func)
return def_ff(rec, len, buf, sz);
return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
@@ -160,8 +151,8 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
while (len > 0 && XDL_ISSPACE(*rec)) {
rec++;
--
gitgitgadget
* [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (3 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-23 21:24 ` [PATCH v5 06/13] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
` (8 subsequent siblings)
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
In the functions in xdiff/xemit.c, use a single local xrecord_t pointer
instead of separate local variables that alias its ptr and size fields.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
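Illustrative note, not part of the patch: the index-based scan that
is_empty_rec() switches to can be tried out on its own with a sketch
like the one below. struct record and record_is_blank() are hypothetical
names made up for this example, and plain isspace() is assumed here as a
stand-in for XDL_ISSPACE().

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for xrecord_t: a pointer plus a length. */
struct record {
	const char *ptr;
	long size;
};

/*
 * Return 1 if the record consists only of whitespace (or is empty).
 * The record is never modified; only the index i advances.
 */
static int record_is_blank(const struct record *rec)
{
	long i = 0;

	assert(rec->size >= 0);
	while (i < rec->size && isspace((unsigned char)rec->ptr[i]))
		i++;

	/* The loop only reaches the end if every byte was whitespace. */
	return i == rec->size;
}
```

Keeping ptr and size inside the record and walking an index avoids
carrying two extra locals that must be advanced in lockstep.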
xdiff/xemit.c | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 40fc8154f3..2161ac3cd0 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -23,12 +23,11 @@
#include "xinclude.h"
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec = xdf->recs[ri]->ptr;
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
- size = xdf->recs[ri]->size;
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
return 0;
@@ -111,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -151,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH v5 06/13] xdiff: delete struct diffdata_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (4 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 07/13] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
` (7 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH v5 07/13] xdiff: delete redundant array xdfile_t.ha
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (5 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 06/13] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 08/13] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
` (6 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
Deleting ha makes the code about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
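Illustrative note, not part of the patch: the redundancy between ha and
rindex can be checked with a small stand-alone sketch. struct rec,
struct file, hash_at() and fill_cache() are hypothetical names invented
for this example; only the field names recs, rindex, nreff and ha mirror
the ones used in xdfile_t and xrecord_t.

```c
#include <assert.h>

/* Hypothetical, cut-down versions of xrecord_t and xdfile_t. */
struct rec {
	unsigned long ha;
};

struct file {
	struct rec **recs;	/* every record in the file */
	long *rindex;		/* indices of the records kept for the diff */
	long nreff;		/* number of valid entries in rindex */
};

/* Look the hash up through rindex, as get_hash() does in the patch. */
static unsigned long hash_at(const struct file *f, long i)
{
	assert(0 <= i && i < f->nreff);
	return f->recs[f->rindex[i]]->ha;
}

/* Build the cached array that the removed ha field used to hold. */
static void fill_cache(const struct file *f, unsigned long *ha)
{
	long i;

	for (i = 0; i < f->nreff; i++)
		ha[i] = f->recs[f->rindex[i]]->ha;
}
```

The cached array holds nothing that the lookup cannot recompute; the
trade-off is one extra level of indirection per comparison.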
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 3576415c85..22c44f0683 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -133,7 +133,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -146,7 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -181,8 +179,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -300,9 +296,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -312,9 +306,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH v5 08/13] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (6 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 07/13] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 09/13] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
` (5 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The fields from xdlclass_t are aliases of xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
xdlclass_t carries a copy of the data in xrecord_t, but instead of
embedding xrecord_t it duplicates the individual fields. A future
commit will change the types used in xrecord_t, so embed it in
xdlclass_t now; that way we don't have to remember to change the
types here as well.
Best-viewed-with: --color-words
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
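Note for readers: the idea in the message above is that embedding the
record lets one struct assignment replace three field-by-field copies.
Below is a minimal sketch of that pattern. The names record_t,
class_entry_t and class_entry_set are hypothetical stand-ins chosen for
illustration and are not part of xdiff.

```c
#include <stddef.h>

/* Stand-in for xrecord_t as it looks at this point in the series. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} record_t;

/*
 * Hypothetical analogue of xdlclass_t after this patch: the record is
 * embedded instead of having its fields duplicated one by one.
 */
typedef struct class_entry {
	struct class_entry *next;
	record_t rec;
	long idx;
	long len1, len2;
} class_entry_t;

/* One struct assignment copies ptr, size and ha in a single statement. */
static void class_entry_set(class_entry_t *e, record_t const *rec)
{
	e->rec = *rec;
	e->len1 = e->len2 = 0;
}
```

If the field types inside record_t change later, class_entry_t picks
up the new types without further edits, which appears to be the
motivation stated in the commit message.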
xdiff/xprepare.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 22c44f0683..e6e2c0e1c0 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec = *rec;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH v5 09/13] xdiff: delete chastore from xdfile_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (7 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 08/13] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 10/13] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
` (4 subsequent siblings)
13 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdfile_t currently uses chastore_t, which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added, I think xrecord_t.next should have
been removed, but that was overlooked. This dual data structure setup
makes the code somewhat confusing.
Additionally, the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
Performance impact of deleting fields from xdfile_t:
Deleting ha is about 5% slower.
Deleting cha is about 5% faster.
Deleting both ha and cha is about 1% slower.
Delete ha, but keep cha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete cha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
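Note for readers: the core of this change is replacing an array of
pointers to arena-allocated records with one contiguous array of
records that grows on demand. The sketch below shows that pattern
under simplifying assumptions. record_t, record_vec_t, record_vec_push
and record_vec_free are hypothetical names, and the doubling growth
policy is an assumption made for illustration, not necessarily what
XDL_ALLOC_GROW does.

```c
#include <stdlib.h>

/* Stand-in for xrecord_t at this point in the series. */
typedef struct {
	char const *ptr;
	long size;
	unsigned long ha;
} record_t;

typedef struct {
	record_t *recs; /* contiguous array of records, not of pointers */
	long nrec;      /* number of records in use */
	long alloc;     /* number of records allocated */
} record_vec_t;

/*
 * Append one record, growing the array with realloc when it is full.
 * Returns a pointer to the new slot, or NULL on allocation failure.
 * The returned pointer is only valid until the next push, because
 * realloc may move the whole array.
 */
static record_t *record_vec_push(record_vec_t *v, char const *ptr,
				 long size, unsigned long ha)
{
	record_t *slot;

	if (v->nrec == v->alloc) {
		long n = v->alloc ? v->alloc * 2 : 16;
		record_t *p = realloc(v->recs, (size_t)n * sizeof(*p));
		if (!p)
			return NULL;
		v->recs = p;
		v->alloc = n;
	}
	slot = &v->recs[v->nrec++];
	slot->ptr = ptr;
	slot->size = size;
	slot->ha = ha;
	return slot;
}

/* Release the array and reset the bookkeeping fields. */
static void record_vec_free(record_vec_t *v)
{
	free(v->recs);
	v->recs = NULL;
	v->nrec = v->alloc = 0;
}
```

With this layout a record is reached as recs[i].ha rather than
recs[i]->ha, which is the mechanical change seen throughout the diff
below. Code that needs to keep record data across later pushes should
copy the struct by value rather than hold on to the slot pointer.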
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 2161ac3cd0..b2f1f30cd3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -25,7 +25,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e6e2c0e1c0..27c5a4d636 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -128,7 +128,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -143,8 +142,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -155,12 +152,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -260,7 +255,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -273,7 +268,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -281,7 +276,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -317,13 +312,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -331,7 +326,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 09/13] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-23 21:24 ` [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
` (3 subsequent siblings)
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index a66125d44a..83c4cff6f7 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
--
gitgitgadget
* [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 10/13] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-23 21:24 ` [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 28 ++++++++++++++--------------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 12 ++++++------
xdiff/xtypes.h | 2 +-
xdiff/xutils.c | 4 ++--
6 files changed, 31 insertions(+), 31 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 83c4cff6f7..5535452061 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->changed[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->changed[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->changed[g->end]; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->changed[g->start - 1]; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->changed[g->start++] = 0;
+ xdf->changed[g->end++] = 1;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->changed[--g->start] = 1;
+ xdf->changed[--g->end] = 0;
- while (xdf->rchg[g->start - 1])
+ while (xdf->changed[g->start - 1])
g->start--;
return 0;
@@ -938,9 +938,9 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
- for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
- for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
+ if (xe->xdf1.changed[i1 - 1] || xe->xdf2.changed[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.changed[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.changed[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..15ca15f6b0 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..14092ffb86 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 27c5a4d636..b9b19c36de 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -126,7 +126,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->changed - 1);
xdl_free(xdf->recs);
}
@@ -139,7 +139,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xrecord_t *crec;
xdf->rindex = NULL;
- xdf->rchg = NULL;
+ xdf->changed = NULL;
xdf->recs = NULL;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
@@ -161,7 +161,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
}
}
- if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->changed, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
@@ -170,7 +170,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
goto abort;
}
- xdf->rchg += 1;
+ xdf->changed += 1;
xdf->nreff = 0;
xdf->dstart = 0;
xdf->dend = xdf->nrec - 1;
@@ -287,7 +287,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->changed[i] = 1;
}
xdf1->nreff = nreff;
@@ -297,7 +297,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->changed[i] = 1;
}
xdf2->nreff = nreff;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 3d26cbf1ec..c4b5d2d8fa 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,7 @@ typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
- char *rchg;
+ char *changed;
long *rindex;
long nreff;
} xdfile_t;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 332982b509..ed65c222e6 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -425,8 +425,8 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
- memcpy(diff_env->xdf1.rchg + line1 - 1, env.xdf1.rchg, count1);
- memcpy(diff_env->xdf2.rchg + line2 - 1, env.xdf2.rchg, count2);
+ memcpy(diff_env->xdf1.changed + line1 - 1, env.xdf1.changed, count1);
+ memcpy(diff_env->xdf2.changed + line2 - 1, env.xdf2.changed, count2);
xdl_free_env(&env);
--
gitgitgadget
* [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (10 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-24 10:21 ` Phillip Wood
2025-09-23 21:24 ` [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Rename dis1, dis2 to matches1, matches2.
Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
matches1 and matches2. These states will influence whether changed[i]
is set to 1 or kept as 0.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
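Note for readers: the three states are derived from how many matching
lines a record has in the other file, and they decide whether a line
is kept for the diff core or marked as changed up front. The sketch
below restates the expressions used in xdl_cleanup_records() as two
small functions. classify_matches and keep_line are hypothetical
helper names introduced only for illustration. The parameters nm,
mlim and need_min mirror the local variables in that function, and
clean_mmatch stands for the result of xdl_clean_mmatch().

```c
#include <stdbool.h>
#include <stdint.h>

#define NONE 0
#define SOME 1
#define TOO_MANY 2

/*
 * nm:       number of matching lines in the other file
 * mlim:     limit above which a line counts as matching too often
 * need_min: true when a minimal diff was requested
 */
static uint8_t classify_matches(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return NONE;
	if (nm >= mlim && !need_min)
		return TOO_MANY;
	return SOME;
}

/*
 * Returns true when the line stays in the reduced set handed to the
 * diff core. A false result corresponds to changed[i] being set to 1.
 */
static bool keep_line(uint8_t state, bool clean_mmatch)
{
	return state == SOME || (state == TOO_MANY && !clean_mmatch);
}
```

Read this way, a NONE line is always marked changed, a SOME line is
always kept, and a TOO_MANY line is kept only when the multimatch
check says it should not be discarded.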
xdiff/xprepare.c | 90 ++++++++++++++++++++++++++++++++----------------
1 file changed, 60 insertions(+), 30 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index b9b19c36de..e1d575f779 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -29,6 +29,9 @@
#define XDL_GUESS_NLINES1 256
#define XDL_GUESS_NLINES2 20
+#define NONE 0
+#define SOME 1
+#define TOO_MANY 2
typedef struct s_xdlclass {
struct s_xdlclass *next;
@@ -190,12 +193,12 @@ void xdl_free_env(xdfenv_t *xe) {
}
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
+static bool xdl_clean_mmatch(uint8_t const *matches, long i, long s, long e) {
long r, rdis0, rpdis0, rdis1, rpdis1;
/*
- * Limits the window the is examined during the similar-lines
- * scan. The loops below stops when dis[i - r] == 1 (line that
+ * Limits the window that is examined during the similar-lines
+ * scan. The loops below stops when matches[i - r] == SOME (line that
* has no match), but there are corner cases where the loop
* proceed all the way to the extremities by causing huge
* performance penalties in case of big files.
@@ -207,40 +210,44 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
/*
* Scans the lines before 'i' to find a run of lines that either
- * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
- * Note that we always call this function with dis[i] > 1, so the
+ * have no match (matches[j] == NONE) or have multiple matches (matches[j] == TOO_MANY).
+ * Note that we always call this function with matches[i] == TOO_MANY, so the
* current line (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (matches[i - r] == NONE)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (matches[i - r] == TOO_MANY)
rpdis0++;
- else
+ else if (matches[i - r] == SOME)
break;
+ else
+ BUG("Illegal value for matches[i - r]");
}
/*
* If the run before the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
+ * return false and hence we don't make the current line (i) discarded.
* We want to discard multimatch lines only when they appear in the
- * middle of runs with nomatch lines (dis[j] == 0).
+ * middle of runs with nomatch lines (matches[j] == NONE).
*/
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (matches[i + r] == NONE)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (matches[i + r] == TOO_MANY)
rpdis1++;
- else
+ else if (matches[i + r] == SOME)
break;
+ else
+ BUG("Illegal value for matches[i + r]");
}
/*
* If the run after the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
+ * return false and hence we don't make the current line (i) discarded.
*/
if (rdis1 == 0)
- return 0;
+ return false;
rdis1 += rdis0;
rpdis1 += rpdis0;
@@ -251,26 +258,41 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
/*
* Try to reduce the problem complexity, discard records that have no
* matches on the other file. Also, lines that have multiple matches
- * might be potentially discarded if they happear in a run of discardable.
+ * might be potentially discarded if they appear in a run of discardable.
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
xrecord_t *recs;
xdlclass_t *rcrec;
- char *dis, *dis1, *dis2;
- int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
+ uint8_t *matches1, *matches2;
+ int status = 0;
+ bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
- if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
- return -1;
- dis1 = dis;
- dis2 = dis1 + xdf1->nrec + 1;
+ matches1 = NULL;
+ matches2 = NULL;
+
+ /*
+ * Create temporary arrays that will help us decide if
+ * changed[i] should remain 0 or become 1.
+ */
+ if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
+ status = -1;
+ goto cleanup;
+ }
+ if (!XDL_CALLOC_ARRAY(matches2, xdf2->nrec + 1)) {
+ status = -1;
+ goto cleanup;
+ }
+ /*
+ * Initialize temporary arrays with NONE, SOME, or TOO_MANY.
+ */
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -278,14 +300,19 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ matches2[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
}
+ /*
+ * Use temporary arrays to decide if changed[i] should remain
+ * 0 or become 1.
+ */
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (matches1[i] == SOME ||
+ (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
+ /* changed[i] remains 0 */
} else
xdf1->changed[i] = 1;
}
@@ -293,17 +320,20 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (matches2[i] == SOME ||
+ (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
+ /* changed[i] remains 0 */
} else
xdf2->changed[i] = 1;
}
xdf2->nreff = nreff;
- xdl_free(dis);
+cleanup:
+ xdl_free(matches1);
+ xdl_free(matches2);
- return 0;
+ return status;
}
--
gitgitgadget
* [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (11 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c Ezekiel Newren via GitGitGadget
@ 2025-09-23 21:24 ` Ezekiel Newren via GitGitGadget
2025-09-24 10:21 ` Phillip Wood
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
13 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-23 21:24 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The only possible values for 'changed' are 1 and 0, which map exactly
to a bool type. It might not look like this is the case because
matches1 and matches2 (which used to be dis1 and dis2) were also char
and were assigned numerical values within a few lines of 'changed'
(what used to be rchg).
Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
for changed[k] makes it clear to future readers that these are
logically separate concepts.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 12 ++++++------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 12 ++++++------
xdiff/xtypes.h | 2 +-
5 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5535452061..b902be9d0e 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->changed[xdf2->rindex[off2]] = 1;
+ xdf2->changed[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->changed[xdf1->rindex[off1]] = 1;
+ xdf1->changed[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -753,8 +753,8 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->changed[g->start++] = 0;
- xdf->changed[g->end++] = 1;
+ xdf->changed[g->start++] = false;
+ xdf->changed[g->end++] = true;
while (xdf->changed[g->end])
g->end++;
@@ -774,8 +774,8 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->changed[--g->start] = 1;
- xdf->changed[--g->end] = 0;
+ xdf->changed[--g->start] = true;
+ xdf->changed[--g->end] = false;
while (xdf->changed[g->start - 1])
g->start--;
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 15ca15f6b0..6dc450b1fe 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
while (count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 14092ffb86..669b653580 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d575f779..070d220f3b 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -273,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
* Create temporary arrays that will help us decide if
- * changed[i] should remain 0 or become 1.
+ * changed[i] should remain false, or become true.
*/
if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
status = -1;
@@ -305,16 +305,16 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
* Use temporary arrays to decide if changed[i] should remain
- * 0 or become 1.
+ * false, or become true.
*/
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
if (matches1[i] == SOME ||
(matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
- /* changed[i] remains 0 */
+ /* changed[i] remains false */
} else
- xdf1->changed[i] = 1;
+ xdf1->changed[i] = true;
}
xdf1->nreff = nreff;
@@ -323,9 +323,9 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if (matches2[i] == SOME ||
(matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
- /* changed[i] remains 0 */
+ /* changed[i] remains false */
} else
- xdf2->changed[i] = 1;
+ xdf2->changed[i] = true;
}
xdf2->nreff = nreff;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index c4b5d2d8fa..f145abba3e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,7 @@ typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
- char *changed;
+ bool *changed;
long *rindex;
long nreff;
} xdfile_t;
--
gitgitgadget
* Re: [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool
2025-09-23 21:24 ` [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
@ 2025-09-24 10:21 ` Phillip Wood
2025-09-24 15:14 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 10:21 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> The only possible values for 'changed' are 1 and 0, which map exactly
> to a bool type. It might not look like this is the case because
> matches1 and matches2 (which used to be dis1 and dis2) were also char
> and were assigned numerical values within a few lines of 'changed'
> (what used to be rchg).
>
> Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
> for changed[k] makes it clear to future readers that these are
> logically separate concepts.
Nicely explained - I think this change is a very good idea and
separating it out like this makes it much clearer what's going on
compared to V4.
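For anyone skimming the thread, here is a rough sketch of the two
concepts side by side. The helper names classify_matches() and
line_is_changed() are made up for illustration and do not exist in
xdiff; discard_multi stands in for the return value of
xdl_clean_mmatch(). The logic is only meant to mirror the ternary and
the if/else in xdl_cleanup_records() as posted in patch 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the macros patch 12 adds to xprepare.c. */
#define NONE 0
#define SOME 1
#define TOO_MANY 2

/*
 * Hypothetical helper: classify a line by the number of matches (nm)
 * it has in the other file. mlim is the "too many" threshold.
 */
static uint8_t classify_matches(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return NONE;
	if (nm >= mlim && !need_min)
		return TOO_MANY;
	return SOME;
}

/*
 * Hypothetical helper: the value that ends up in changed[i].
 * The three-state input and the bool output are separate concepts.
 */
static bool line_is_changed(uint8_t state, bool discard_multi)
{
	assert(state <= TOO_MANY); /* plays the role of the BUG() check */

	if (state == SOME)
		return false;
	if (state == TOO_MANY && !discard_multi)
		return false;
	return true; /* NONE, or a multimatch line that gets discarded */
}
```

A line with no matches is always marked changed, a line with some
matches never is at this stage, and only the TOO_MANY case depends on
the neighbouring lines.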
Thanks
Phillip
> Best-viewed-with: --color-words
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 12 ++++++------
> xdiff/xhistogram.c | 8 ++++----
> xdiff/xpatience.c | 8 ++++----
> xdiff/xprepare.c | 12 ++++++------
> xdiff/xtypes.h | 2 +-
> 5 files changed, 21 insertions(+), 21 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 5535452061..b902be9d0e 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> */
> if (off1 == lim1) {
> for (; off2 < lim2; off2++)
> - xdf2->changed[xdf2->rindex[off2]] = 1;
> + xdf2->changed[xdf2->rindex[off2]] = true;
> } else if (off2 == lim2) {
> for (; off1 < lim1; off1++)
> - xdf1->changed[xdf1->rindex[off1]] = 1;
> + xdf1->changed[xdf1->rindex[off1]] = true;
> } else {
> xdpsplit_t spl;
> spl.i1 = spl.i2 = 0;
> @@ -753,8 +753,8 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->end < xdf->nrec &&
> recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
> - xdf->changed[g->start++] = 0;
> - xdf->changed[g->end++] = 1;
> + xdf->changed[g->start++] = false;
> + xdf->changed[g->end++] = true;
>
> while (xdf->changed[g->end])
> g->end++;
> @@ -774,8 +774,8 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->start > 0 &&
> recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
> - xdf->changed[--g->start] = 1;
> - xdf->changed[--g->end] = 0;
> + xdf->changed[--g->start] = true;
> + xdf->changed[--g->end] = false;
>
> while (xdf->changed[g->start - 1])
> g->start--;
> diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
> index 15ca15f6b0..6dc450b1fe 100644
> --- a/xdiff/xhistogram.c
> +++ b/xdiff/xhistogram.c
> @@ -318,11 +318,11 @@ redo:
>
> if (!count1) {
> while(count2--)
> - env->xdf2.changed[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = true;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.changed[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = true;
> return 0;
> }
>
> @@ -335,9 +335,9 @@ redo:
> else {
> if (lcs.begin1 == 0 && lcs.begin2 == 0) {
> while (count1--)
> - env->xdf1.changed[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = true;
> while (count2--)
> - env->xdf2.changed[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = true;
> result = 0;
> } else {
> result = histogram_diff(xpp, env,
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index 14092ffb86..669b653580 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* trivial case: one side is empty */
> if (!count1) {
> while(count2--)
> - env->xdf2.changed[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = true;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.changed[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = true;
> return 0;
> }
>
> @@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* are there any matching lines at all? */
> if (!map.has_matches) {
> while(count1--)
> - env->xdf1.changed[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = true;
> while(count2--)
> - env->xdf2.changed[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = true;
> xdl_free(map.entries);
> return 0;
> }
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index e1d575f779..070d220f3b 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -273,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>
> /*
> * Create temporary arrays that will help us decide if
> - * changed[i] should remain 0 or become 1.
> + * changed[i] should remain false, or become true.
> */
> if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
> status = -1;
> @@ -305,16 +305,16 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>
> /*
> * Use temporary arrays to decide if changed[i] should remain
> - * 0 or become 1.
> + * false, or become true.
> */
> for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
> i <= xdf1->dend; i++, recs++) {
> if (matches1[i] == SOME ||
> (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> - /* changed[i] remains 0 */
> + /* changed[i] remains false */
> } else
> - xdf1->changed[i] = 1;
> + xdf1->changed[i] = true;
> }
> xdf1->nreff = nreff;
>
> @@ -323,9 +323,9 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> if (matches2[i] == SOME ||
> (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> - /* changed[i] remains 0 */
> + /* changed[i] remains false */
> } else
> - xdf2->changed[i] = 1;
> + xdf2->changed[i] = true;
> }
> xdf2->nreff = nreff;
>
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index c4b5d2d8fa..f145abba3e 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -48,7 +48,7 @@ typedef struct s_xdfile {
> xrecord_t *recs;
> long nrec;
> long dstart, dend;
> - char *changed;
> + bool *changed;
> long *rindex;
> long nreff;
> } xdfile_t;
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-23 21:24 ` [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c Ezekiel Newren via GitGitGadget
@ 2025-09-24 10:21 ` Phillip Wood
2025-09-24 14:46 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 10:21 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Rename dis1, dis2 to matches1, matches2.
>
> Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
> matches1 and matches2. These states will influence whether changed[i]
> is set to 1 or kept as 0.
This message also says what is being changed rather than why it is being
changed. I think the rename here is a good idea but I'm not sure what
"rdis[01]" and "rpdis[01]" are used for and whether they should be
renamed if we're renaming "dis[12]".
> /*
> - * Limits the window the is examined during the similar-lines
> - * scan. The loops below stops when dis[i - r] == 1 (line that
> + * Limits the window that is examined during the similar-lines
> + * scan. The loops below stops when matches[i - r] == SOME (line that
Thanks for updating the comments. Not reflowing the lines makes the diff
easier to read but leaves the comments in a rather strange state with
random long lines.
> * has no match), but there are corner cases where the loop
> * proceed all the way to the extremities by causing huge
> * performance penalties in case of big files.
> @@ -207,40 +210,44 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
>
> /*
> * Scans the lines before 'i' to find a run of lines that either
> - * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
> - * Note that we always call this function with dis[i] > 1, so the
> + * have no match (matches[j] == NONE) or have multiple matches (matches[j] == TOO_MANY).
> + * Note that we always call this function with matches[i] == TOO_MANY, so the
especially here
> - if (!dis[i + r])
> + if (matches[i + r] == NONE)
> rdis1++;
> - else if (dis[i + r] == 2)
> + else if (matches[i + r] == TOO_MANY)
> rpdis1++;
> - else
> + else if (matches[i + r] == SOME)
> break;
> + else
> + BUG("Illegal value for matches[i + r]");
Nice addition
> static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
> long i, nm, nreff, mlim;
> xrecord_t *recs;
> xdlclass_t *rcrec;
> - char *dis, *dis1, *dis2;
> - int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
> + uint8_t *matches1, *matches2;
Let's initialize these where they're declared rather than later on
> + int status = 0;
I think we typically call this "ret" or "res" in the rest of the code
base.
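Putting this and the previous comment together, something like the
sketch below is what I have in mind. It is not the real function:
cleanup_records_sketch() is a made-up name, and plain calloc()/free()
stand in for XDL_CALLOC_ARRAY() and xdl_free().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for xdl_cleanup_records(); returns 0 or -1. */
static int cleanup_records_sketch(size_t nrec1, size_t nrec2)
{
	/* Initialized where declared, so "goto cleanup" is always safe. */
	uint8_t *matches1 = NULL, *matches2 = NULL;
	int ret = -1;

	matches1 = calloc(nrec1 + 1, sizeof(*matches1));
	if (!matches1)
		goto cleanup;
	matches2 = calloc(nrec2 + 1, sizeof(*matches2));
	if (!matches2)
		goto cleanup;

	/* calloc zero-fills, so every entry starts out as 0 (NONE). */
	assert(matches1[0] == 0 && matches2[0] == 0);

	/* ... fill in and use the arrays here ... */
	ret = 0;

cleanup:
	free(matches1); /* free(NULL) is a no-op */
	free(matches2);
	return ret;
}
```

Starting with ret = -1 and setting it to 0 only on success also removes
the need to assign the error code in each failure branch.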
> + bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
Nice use of bool. Strictly speaking I don't think we need the !! now
that the type is changing from int. I think Junio recently suggested that
we might start using (bool) instead of !! for cases like this.
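A small sketch of why the !! becomes redundant, assuming bool is the
C99 type from <stdbool.h> (if it were a typedef for char or int the
masked value would not be normalized). SKETCH_NEED_MINIMAL is a
made-up bit for illustration, not the real XDF_NEED_MINIMAL value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; deliberately not bit 0. */
#define SKETCH_NEED_MINIMAL (1u << 3)

static bool need_minimal_bool(unsigned long flags)
{
	/* Conversion to bool turns any nonzero value into 1. */
	bool need_min = flags & SKETCH_NEED_MINIMAL;
	return need_min;
}

static int need_minimal_int(unsigned long flags)
{
	/* With int, !! is what normalizes the masked value to 0 or 1. */
	return !!(flags & SKETCH_NEED_MINIMAL);
}

/* Both spellings agree for any flags value. */
static void check_same_result(unsigned long flags)
{
	assert(need_minimal_bool(flags) == need_minimal_int(flags));
}
```

Writing (bool)(flags & SKETCH_NEED_MINIMAL) behaves the same as the
implicit conversion above and makes the intent explicit.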
Everything below this looks good, though some of the lines are getting
quite long with the renamed variables and symbolic values so we might
want to break them.
Thanks
Phillip
> - if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
> - return -1;
> - dis1 = dis;
> - dis2 = dis1 + xdf1->nrec + 1;
> + matches1 = NULL;
> + matches2 = NULL;
> +
> + /*
> + * Create temporary arrays that will help us decide if
> + * changed[i] should remain 0 or become 1.
> + */
> + if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
> + status = -1;
> + goto cleanup;
> + }
> + if (!XDL_CALLOC_ARRAY(matches2, xdf2->nrec + 1)) {
> + status = -1;
> + goto cleanup;
> + }
>
> + /*
> + * Initialize temporary arrays with NONE, SOME, or TOO_MANY.
> + */
> if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
> mlim = XDL_MAX_EQLIMIT;
> for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len2 : 0;
> - dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> + matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
> }
>
> if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
> @@ -278,14 +300,19 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len1 : 0;
> - dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> + matches2[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
> }
>
> + /*
> + * Use temporary arrays to decide if changed[i] should remain
> + * 0 or become 1.
> + */
> for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
> i <= xdf1->dend; i++, recs++) {
> - if (dis1[i] == 1 ||
> - (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> + if (matches1[i] == SOME ||
> + (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> + /* changed[i] remains 0 */
> } else
> xdf1->changed[i] = 1;
> }
> @@ -293,17 +320,20 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>
> for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
> i <= xdf2->dend; i++, recs++) {
> - if (dis2[i] == 1 ||
> - (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> + if (matches2[i] == SOME ||
> + (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> + /* changed[i] remains 0 */
> } else
> xdf2->changed[i] = 1;
> }
> xdf2->nreff = nreff;
>
> - xdl_free(dis);
> +cleanup:
> + xdl_free(matches1);
> + xdl_free(matches2);
>
> - return 0;
> + return status;
> }
>
>
* Re: [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t
2025-09-23 21:24 ` [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-24 10:22 ` Phillip Wood
2025-09-24 15:10 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 10:22 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
I agree "changed" is a better name but the commit message should explain
what "rchg" is used for so that someone who is not familiar with the
code can understand why the change in name is desirable.
Thanks
Phillip
> Best-viewed-with: --color-words
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 28 ++++++++++++++--------------
> xdiff/xhistogram.c | 8 ++++----
> xdiff/xpatience.c | 8 ++++----
> xdiff/xprepare.c | 12 ++++++------
> xdiff/xtypes.h | 2 +-
> xdiff/xutils.c | 4 ++--
> 6 files changed, 31 insertions(+), 31 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 83c4cff6f7..5535452061 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
> */
> if (off1 == lim1) {
> for (; off2 < lim2; off2++)
> - xdf2->rchg[xdf2->rindex[off2]] = 1;
> + xdf2->changed[xdf2->rindex[off2]] = 1;
> } else if (off2 == lim2) {
> for (; off1 < lim1; off1++)
> - xdf1->rchg[xdf1->rindex[off1]] = 1;
> + xdf1->changed[xdf1->rindex[off1]] = 1;
> } else {
> xdpsplit_t spl;
> spl.i1 = spl.i2 = 0;
> @@ -708,7 +708,7 @@ struct xdlgroup {
> static void group_init(xdfile_t *xdf, struct xdlgroup *g)
> {
> g->start = g->end = 0;
> - while (xdf->rchg[g->end])
> + while (xdf->changed[g->end])
> g->end++;
> }
>
> @@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
> return -1;
>
> g->start = g->end + 1;
> - for (g->end = g->start; xdf->rchg[g->end]; g->end++)
> + for (g->end = g->start; xdf->changed[g->end]; g->end++)
> ;
>
> return 0;
> @@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
> return -1;
>
> g->end = g->start - 1;
> - for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
> + for (g->start = g->end; xdf->changed[g->start - 1]; g->start--)
> ;
>
> return 0;
> @@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->end < xdf->nrec &&
> recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
> - xdf->rchg[g->start++] = 0;
> - xdf->rchg[g->end++] = 1;
> + xdf->changed[g->start++] = 0;
> + xdf->changed[g->end++] = 1;
>
> - while (xdf->rchg[g->end])
> + while (xdf->changed[g->end])
> g->end++;
>
> return 0;
> @@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
> {
> if (g->start > 0 &&
> recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
> - xdf->rchg[--g->start] = 1;
> - xdf->rchg[--g->end] = 0;
> + xdf->changed[--g->start] = 1;
> + xdf->changed[--g->end] = 0;
>
> - while (xdf->rchg[g->start - 1])
> + while (xdf->changed[g->start - 1])
> g->start--;
>
> return 0;
> @@ -938,9 +938,9 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
> * Trivial. Collects "groups" of changes and creates an edit script.
> */
> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
> - if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
> - for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
> - for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
> + if (xe->xdf1.changed[i1 - 1] || xe->xdf2.changed[i2 - 1]) {
> + for (l1 = i1; xe->xdf1.changed[i1 - 1]; i1--);
> + for (l2 = i2; xe->xdf2.changed[i2 - 1]; i2--);
>
> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
> xdl_free_script(cscr);
> diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
> index 4d857e8ae2..15ca15f6b0 100644
> --- a/xdiff/xhistogram.c
> +++ b/xdiff/xhistogram.c
> @@ -318,11 +318,11 @@ redo:
>
> if (!count1) {
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = 1;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = 1;
> return 0;
> }
>
> @@ -335,9 +335,9 @@ redo:
> else {
> if (lcs.begin1 == 0 && lcs.begin2 == 0) {
> while (count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = 1;
> while (count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = 1;
> result = 0;
> } else {
> result = histogram_diff(xpp, env,
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index bf69a58527..14092ffb86 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* trivial case: one side is empty */
> if (!count1) {
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = 1;
> return 0;
> } else if (!count2) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = 1;
> return 0;
> }
>
> @@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
> /* are there any matching lines at all? */
> if (!map.has_matches) {
> while(count1--)
> - env->xdf1.rchg[line1++ - 1] = 1;
> + env->xdf1.changed[line1++ - 1] = 1;
> while(count2--)
> - env->xdf2.rchg[line2++ - 1] = 1;
> + env->xdf2.changed[line2++ - 1] = 1;
> xdl_free(map.entries);
> return 0;
> }
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 27c5a4d636..b9b19c36de 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -126,7 +126,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
> static void xdl_free_ctx(xdfile_t *xdf)
> {
> xdl_free(xdf->rindex);
> - xdl_free(xdf->rchg - 1);
> + xdl_free(xdf->changed - 1);
> xdl_free(xdf->recs);
> }
>
> @@ -139,7 +139,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> xrecord_t *crec;
>
> xdf->rindex = NULL;
> - xdf->rchg = NULL;
> + xdf->changed = NULL;
> xdf->recs = NULL;
>
> if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
> @@ -161,7 +161,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> }
> }
>
> - if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
> + if (!XDL_CALLOC_ARRAY(xdf->changed, xdf->nrec + 2))
> goto abort;
>
> if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
> @@ -170,7 +170,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> goto abort;
> }
>
> - xdf->rchg += 1;
> + xdf->changed += 1;
> xdf->nreff = 0;
> xdf->dstart = 0;
> xdf->dend = xdf->nrec - 1;
> @@ -287,7 +287,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> } else
> - xdf1->rchg[i] = 1;
> + xdf1->changed[i] = 1;
> }
> xdf1->nreff = nreff;
>
> @@ -297,7 +297,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> } else
> - xdf2->rchg[i] = 1;
> + xdf2->changed[i] = 1;
> }
> xdf2->nreff = nreff;
>
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 3d26cbf1ec..c4b5d2d8fa 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -48,7 +48,7 @@ typedef struct s_xdfile {
> xrecord_t *recs;
> long nrec;
> long dstart, dend;
> - char *rchg;
> + char *changed;
> long *rindex;
> long nreff;
> } xdfile_t;
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 332982b509..ed65c222e6 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -425,8 +425,8 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
> if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
> return -1;
>
> - memcpy(diff_env->xdf1.rchg + line1 - 1, env.xdf1.rchg, count1);
> - memcpy(diff_env->xdf2.rchg + line2 - 1, env.xdf2.rchg, count2);
> + memcpy(diff_env->xdf1.changed + line1 - 1, env.xdf1.changed, count1);
> + memcpy(diff_env->xdf2.changed + line2 - 1, env.xdf2.changed, count2);
>
> xdl_free_env(&env);
>
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-23 21:24 ` [PATCH v5 10/13] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
@ 2025-09-24 10:22 ` Phillip Wood
2025-09-24 15:01 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 10:22 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
The commit message should explain why this change is being made
Thanks
Phillip
> Best-viewed-with: --color-words
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xdiffi.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index a66125d44a..83c4cff6f7 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -932,16 +932,15 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
>
> int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
> xdchange_t *cscr = NULL, *xch;
> - char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
> long i1, i2, l1, l2;
>
> /*
> * Trivial. Collects "groups" of changes and creates an edit script.
> */
> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
> - if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
> - for (l1 = i1; rchg1[i1 - 1]; i1--);
> - for (l2 = i2; rchg2[i2 - 1]; i2--);
> + if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
> + for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
> + for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
>
> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
> xdl_free_script(cscr);
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t
2025-09-23 21:24 ` [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
@ 2025-09-24 10:22 ` Phillip Wood
2025-09-24 14:52 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 10:22 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Use the type xrecord_t as the local variable for the functions in the
> file xdiff/xemit.c.
This explains what the change is but not why it is being made. Commit
messages in this project are expected to explain the reason for the
change so that future readers can understand why a change was made.
Thanks
Phillip
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> xdiff/xemit.c | 29 +++++++++++++----------------
> 1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 40fc8154f3..2161ac3cd0 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -23,12 +23,11 @@
> #include "xinclude.h"
>
>
> -static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
> - long size, psize = strlen(pre);
> - char const *rec = xdf->recs[ri]->ptr;
> +static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
> +{
> + xrecord_t *rec = xdf->recs[ri];
>
> - size = xdf->recs[ri]->size;
> - if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
> + if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
> return -1;
>
> return 0;
> @@ -111,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
> static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
> char *buf, long sz)
> {
> - const char *rec = xdf->recs[ri]->ptr;
> - long len = xdf->recs[ri]->size;
> + xrecord_t *rec = xdf->recs[ri];
> +
> if (!xecfg->find_func)
> - return def_ff(rec, len, buf, sz);
> - return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
> + return def_ff(rec->ptr, rec->size, buf, sz);
> + return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
> }
>
> static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
> @@ -151,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
>
> static int is_empty_rec(xdfile_t *xdf, long ri)
> {
> - const char *rec = xdf->recs[ri]->ptr;
> - long len = xdf->recs[ri]->size;
> + xrecord_t *rec = xdf->recs[ri];
> + long i = 0;
>
> - while (len > 0 && XDL_ISSPACE(*rec)) {
> - rec++;
> - len--;
> - }
> - return !len;
> + for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
> +
> + return i == rec->size;
> }
>
> int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-24 10:21 ` Phillip Wood
@ 2025-09-24 14:46 ` Ezekiel Newren
2025-09-24 15:18 ` Phillip Wood
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 14:46 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > Rename dis1, dis2 to matches1, matches2.
> >
> > Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
> > matches1 and matches2. These states will influence whether changed[i]
> > is set to 1 or kept as 0.
>
> This message also says what is being changed rather than why it is being
> changed. I think the rename here is a good idea but I'm not sure what
> "rdis[01]" and "rpdis[01]" are used for and whether they should be
> renamed if we're renaming "dis[01]"
"Rename dis1, dis2 to matches1, matches2 to give the variable names a
more obvious meaning."
Would something like that work, or do I need to refine it further? I
would love to rename rdis, rpdis, etc... except that I don't
understand what is happening or why. Could someone explain the purpose
of these variables?
> > /*
> > - * Limits the window the is examined during the similar-lines
> > - * scan. The loops below stops when dis[i - r] == 1 (line that
> > + * Limits the window that is examined during the similar-lines
> > + * scan. The loops below stops when matches[i - r] == SOME (line that
>
> Thanks for updating the comments. Not reflowing the lines makes the diff
> easier to read but leaves the comments in a rather strange state with
> random long lines.
What is the reflow limit for comments? 72? 80?
> > * has no match), but there are corner cases where the loop
> > * proceed all the way to the extremities by causing huge
> > * performance penalties in case of big files.
> > @@ -207,40 +210,44 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
> >
> > /*
> > * Scans the lines before 'i' to find a run of lines that either
> > - * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
> > - * Note that we always call this function with dis[i] > 1, so the
> > + * have no match (matches[j] == NONE) or have multiple matches (matches[j] == TOO_MANY).
> > + * Note that we always call this function with matches[i] == TOO_MANY, so the
>
> especially here
>
> > - if (!dis[i + r])
> > + if (matches[i + r] == NONE)
> > rdis1++;
> > - else if (dis[i + r] == 2)
> > + else if (matches[i + r] == TOO_MANY)
> > rpdis1++;
> > - else
> > + else if (matches[i + r] == SOME)
> > break;
> > + else
> > + BUG("Illegal value for matches[i + r]");
>
> Nice addition
Thanks.
> > static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
> > long i, nm, nreff, mlim;
> > xrecord_t *recs;
> > xdlclass_t *rcrec;
> > - char *dis, *dis1, *dis2;
> > - int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
> > + uint8_t *matches1, *matches2;
>
> Let's initialize these where they're declared rather than later on
I can do that.
> > + int status = 0;
> I think we typically we call this "ret" or "res" in the rest of the code
> base.
>
> > + bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
>
> Nice use of bool, strictly speaking I don't think we need the !! if
> we're changing the type from int. I think Junio recently suggested that
> we might start using (bool) instead of !! for cases like this.
>
> Everything below this looks good, though some of the lines are getting
> quite long with the renamed variables and symbolic values so we might
> want to break them.
I didn't add !! and thought it looked funny myself. I didn't remove it
because I wasn't sure if I should.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t
2025-09-24 10:22 ` Phillip Wood
@ 2025-09-24 14:52 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 14:52 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > Use the type xrecord_t as the local variable for the functions in the
> > file xdiff/xemit.c.
>
> This explains what the change is but not why it is being made. Commit
> messages in this project are expected to explain the reason for the
> change so that future readers can understand why a change was made.
"Use the type xrecord_t as the local variable for the functions in the
file xdiff/xemit.c. This helps tools like ctags or modern IDE's more
accurately follow the usage of xrecord_t."
Is this commit message good enough?
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 10:22 ` Phillip Wood
@ 2025-09-24 15:01 ` Ezekiel Newren
2025-09-24 15:34 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 15:01 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> The commit message should explain why this change is being made
Reasons to delete local variable aliasing:
* Usage tracking: Tools are better able to follow the usage.
* Refactor churn: Later commits will refactor rchg.
* No additional meaning: The local variables express the same meaning
as the struct field itself.
Would that suffice?
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t
2025-09-24 10:22 ` Phillip Wood
@ 2025-09-24 15:10 ` Ezekiel Newren
2025-09-24 15:18 ` Phillip Wood
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 15:10 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> I agree "changed" is a better name but the commit message should explain
> what "rchg" is used for so that someone who is not familiar with the
> code can understand why the change in name is desirable.
The field rchg (now 'changed') declares if a line in a file is changed
or not. A later commit will change its type from 'char' to 'bool'
to make its purpose even more clear.
Something like that?
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool
2025-09-24 10:21 ` Phillip Wood
@ 2025-09-24 15:14 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 15:14 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > The only values possible for 'changed' is 1 and 0, which exactly maps
> > to a bool type. It might not look like this is the case because
> > matches1 and matches2 (which use to be dis1, and dis2) were also char
> > and were assigned numerical values within a few lines of 'changed'
> > (what used to be rchg).
> >
> > Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
> > for changed[k] makes it clear to future readers that these are
> > logically separate concepts.
>
> Nicely explained - I think this change is a very good idea and
> separating it out like this makes it much clearer what's going on
> compared to V4.
Thank you! It was obvious to me because I've been refactoring xdiff
for many months now, but I wasn't doing a good job of expressing why
these 2 concepts are closely related, but distinct.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-24 14:46 ` Ezekiel Newren
@ 2025-09-24 15:18 ` Phillip Wood
2025-09-24 17:29 ` Junio C Hamano
2025-09-25 18:40 ` Ezekiel Newren
0 siblings, 2 replies; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 15:18 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On 24/09/2025 15:46, Ezekiel Newren wrote:
> On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
>>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>>
>>> Rename dis1, dis2 to matches1, matches2.
>>>
>>> Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
>>> matches1 and matches2. These states will influence whether changed[i]
>>> is set to 1 or kept as 0.
>>
>> This message also says what is being changed rather than why it is being
>> changed. I think the rename here is a good idea but I'm not sure what
>> "rdis[01]" and "rpdis[01]" are used for and whether they should be
>> renamed if we're renaming "dis[01]"
>
> "Rename dis1, dis2 to matches1, matches2 to give the variable names a
> more obvious meaning."
>
> Would something like that work, or do I need to refine it further?
I'd maybe add a sentence before that to explain that "dis1 and dis2 are
used to record if a line has zero, one or many matches on the other side
of the diff". I don't think any of these patches need huge commit
messages but a couple of sentences explaining the reasoning would be
helpful for anyone looking at them in the future.
> I
> would love to rename rdis, rpdis, etc... except that I don't
> understand what is happening or why. Could someone explain the purpose
> of these variables?
Good question, I'm not sure anyone has an intimate knowledge of this
code. My understanding is that the code aims to remove runs of common
lines when they occur between unique lines in order to reduce the number
of lines we need to look at when we're calculating the diff. I haven't
worked through the code in detail though.
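As a rough illustration of the three states and how they feed into
changed[i], here is a minimal sketch. The names classify_matches,
line_is_changed and discard_run are hypothetical, and the threshold rule
(nm >= mlim unless a minimal diff was requested) is an assumption about
how the classification is done, not something shown in this patch. The
keep/discard decision mirrors the condition visible in
xdl_cleanup_records, with discard_run standing in for the result of
xdl_clean_mmatch().

```c
#include <assert.h>
#include <stdbool.h>

enum { NONE = 0, SOME = 1, TOO_MANY = 2 };

/*
 * Classify a line by the number of matching lines (nm) found on the
 * other side of the diff. mlim and need_min are assumed inputs.
 */
static unsigned char classify_matches(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return NONE;
	if (nm >= mlim && !need_min)
		return TOO_MANY;
	return SOME;
}

/*
 * A line stays in the diff when it has SOME matches, or TOO_MANY
 * matches and the surrounding run is not discarded. Everything else
 * is marked as changed up front.
 */
static bool line_is_changed(unsigned char m, bool discard_run)
{
	assert(m == NONE || m == SOME || m == TOO_MANY);
	if (m == SOME)
		return false;
	if (m == TOO_MANY && !discard_run)
		return false;
	return true;
}
```

Keeping the match state (three values) and the changed flag (two values)
in separate types is what lets the later patch turn 'changed' into a
bool.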
>>> /*
>>> - * Limits the window the is examined during the similar-lines
>>> - * scan. The loops below stops when dis[i - r] == 1 (line that
>>> + * Limits the window that is examined during the similar-lines
>>> + * scan. The loops below stops when matches[i - r] == SOME (line that
>>
>> Thanks for updating the comments. Not reflowing the lines makes the diff
>> easier to read but leaves the comments in a rather strange state with
>> random long lines.
>
> What is the reflow limit for comments? 72? 80?
For code it's 80 columns with a little leeway if that makes things
clearer. For comments I'd match whatever it is using at the moment.
>>
>>> + bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
>>
>> Nice use of bool, strictly speaking I don't think we need the !! if
>> we're changing the type from int. I think Junio recently suggested that
>> we might start using (bool) instead of !! for cases like this.
>>
>> Everything below this looks good, though some of the lines are getting
>> quite long with the renamed variables and symbolic values so we might
>> want to break them.
>
> I didn't add !! and thought it looked funny myself. I didn't remove it
> because I wasn't sure if I should.
Our coding guidelines say not to use "!!x" (I assume we're supposed to
do "x != 0" instead) but in practice it's pretty common to see it in our
codebase. I'd maybe try a (bool) cast and see what people say.
Thanks for cleaning up the xdiff code, it is much appreciated
Phillip
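For reference, the three spellings discussed above behave the same when
the destination is a C99 bool. A minimal sketch, assuming <stdbool.h>
and 8-bit chars; DEMO_FLAG is a hypothetical flag placed above bit 7 only
to show what goes wrong when the result is stored in a narrow integer
without normalizing it first.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_FLAG (1u << 8) /* hypothetical flag, deliberately above bit 7 */

/* The idiom the guidelines discourage: normalizes to 0 or 1. */
static bool flag_double_bang(unsigned flags)
{
	return !!(flags & DEMO_FLAG);
}

/* The explicit comparison: same result. */
static bool flag_compare(unsigned flags)
{
	return (flags & DEMO_FLAG) != 0;
}

/* Conversion to bool yields 1 for any non-zero value. */
static bool flag_cast(unsigned flags)
{
	return (bool)(flags & DEMO_FLAG);
}

/* Without normalization, a narrow integer type drops the high bit. */
static unsigned char flag_truncated(unsigned flags)
{
	return (unsigned char)(flags & DEMO_FLAG);
}
```

With flags == DEMO_FLAG the first three return true while the last one
returns 0, which is why the !! only matters when the target type is not
bool.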
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t
2025-09-24 15:10 ` Ezekiel Newren
@ 2025-09-24 15:18 ` Phillip Wood
0 siblings, 0 replies; 158+ messages in thread
From: Phillip Wood @ 2025-09-24 15:18 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On 24/09/2025 16:10, Ezekiel Newren wrote:
> On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
>>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>
>> I agree "changed" is a better name but the commit message should explain
>> what "rchg" is used for so that someone who is not familiar with the
>> code can understand why the change in name is desirable.
>
> The field rchg (now 'changed') declares if a line in a file is changed
> or not. A later commit will change it's type from 'char' to 'bool'
> to make its purpose even more clear.
>
> Something like that?
That's great
Phillip
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 15:01 ` Ezekiel Newren
@ 2025-09-24 15:34 ` Junio C Hamano
2025-09-24 15:58 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-24 15:34 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
>> > From: Ezekiel Newren <ezekielnewren@gmail.com>
>>
>> The commit message should explain why this change is being made
>
> Reasons to delete local variable aliasing:
> * Usage tracking: Tools are better able to follow the usage.
> * Refactor churn: Later commits will refactor rchg.
> * No additional meaning: The local variables express the same meaning
> as the struct field itself.
>
> Would that suffice?
In general, I do not view the first one as a good excuse.
When using a separate local variable enhances readability of the
code (which often is true, with a pointer that points deep into a
nested structure member) to humans, we shouldn't blindly bend the
code to cater to less intelligent tools; it needs balancing.
The third one alone is not a good excuse for the same reason. It
(and the first one) depends on how much benefit we are gaining from
having short-and-sweet local variables that may make the expressions
and statements they are involved in easier to read.
For this particular change, I would think it is borderline, and
subjective. I would be OK with the third point if you rephrase it
to additionally say that the conditional and the inner loop are easy
enough to follow without using the local aliases to make the code
shorter (which of course is the commit author's opinion, but they
deserve to have and express their opinion as part of the rationale
for a change).
Thanks.
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
+ for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
+ for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
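To make the loop in the hunk above easier to follow in isolation, here
is a self-contained sketch of the same backward walk. It is not the
xdiff implementation: struct hunk, collect_hunks and demo_single_hunk
are hypothetical names, and the output goes into a caller-supplied array
instead of a linked list. It assumes, as xdl_prepare_ctx arranges, that
each 'changed' pointer sits one element past the start of a zeroed
buffer so that index -1 is a valid "unchanged" sentinel, and that both
sides contain the same number of unchanged lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hunk {
	long i1, i2;     /* first changed line on each side (0-based) */
	long chg1, chg2; /* number of changed lines on each side */
};

/*
 * Walk both files from the end, grouping consecutive changed lines
 * into hunks. Hunks are recorded last-to-first.
 */
static size_t collect_hunks(const bool *changed1, long nrec1,
			    const bool *changed2, long nrec2,
			    struct hunk *out, size_t cap)
{
	size_t n = 0;
	long i1, i2, l1, l2;

	for (i1 = nrec1, i2 = nrec2; i1 >= 0 || i2 >= 0; i1--, i2--) {
		if (!changed1[i1 - 1] && !changed2[i2 - 1])
			continue;
		/* Extend the group backwards; the sentinel at -1 stops it. */
		for (l1 = i1; changed1[i1 - 1]; i1--)
			;
		for (l2 = i2; changed2[i2 - 1]; i2--)
			;
		assert(n < cap);
		out[n].i1 = i1;
		out[n].i2 = i2;
		out[n].chg1 = l1 - i1;
		out[n].chg2 = l2 - i2;
		n++;
	}
	return n;
}

/* "a b c d" versus "a x d": b and c are removed, x is added. */
static struct hunk demo_single_hunk(void)
{
	static const bool buf1[] = { false, false, true, true, false, false };
	static const bool buf2[] = { false, false, true, false, false };
	struct hunk out[4] = { { 0, 0, 0, 0 } };
	size_t n = collect_hunks(buf1 + 1, 4, buf2 + 1, 3, out, 4);

	assert(n == 1);
	return out[0];
}
```

Whether the arrays are reached through local aliases or through the
struct fields does not change this walk; only the spelling differs.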
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 15:34 ` Junio C Hamano
@ 2025-09-24 15:58 ` Ezekiel Newren
2025-09-24 21:31 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 15:58 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
On Wed, Sep 24, 2025 at 9:34 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Ezekiel Newren <ezekielnewren@gmail.com> writes:
>
> > On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >>
> >> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> >> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >>
> >> The commit message should explain why this change is being made
> >
> > Reasons to delete local variable aliasing:
> > * Usage tracking: Tools are better able to follow the usage.
> > * Refactor churn: Later commits will refactor rchg.
> > * No additional meaning: The local variables express the same meaning
> > as the struct field itself.
> >
> > Would that suffice?
>
> In general, I do not view the first one is a good excuse.
>
> When using a separate local variable enhances readability of the
> code (which often is true, with a pointer that points deep into a
> nested structure member) to humans, we shouldn't blindly bend the
> code to cater to less intelligent tools; it needs balancing.
>
> The third one alone is not a good excuse for the same reason. It
> (and the first one) depends on how much benefit we are gaining from
> having a short-and-sweet local variables that may make the expressions
> and statements they are involved in easier to read.
>
> For this particular change, I would think it is borderline, and
> subjective. I would be OK with the third point if you rephrase it
> to additionally say that the conditional and the inner loop is easy
> enough to follow without using the local aliases to make the code
> shorter (which of course is the commit author's opinion, but they
> deserve to have and express their opinion as part of the rationale
> for a change).
Ok, I can see that.
I have a question for everyone: Does preparing C code to be translated
into Rust count as a valid reason for changing it? Provided that there
is no violation of the Git style (or very small in some cases).
If my intent was to keep this as C code forever I'd agree, but... My
other reason is that it more closely follows Rust paradigms. Creating
multiple pointers to the same memory in Rust subverts the borrow
checker's ability to keep track of who owns the memory. Since C
doesn't have a concept of borrowing, I decided to delete the aliasing
here, and in many other places. I'm removing as much aliasing from the
code as I can because it makes it easier to translate into idiomatic
Rust later. Using ctags and modern IDEs to follow variable usage is
convenient, but safe Rust refuses to compile in many cases where C
aliasing is common. We could use unsafe Rust to make literal
translations of C to Rust, but then we'd forfeit the reasons and
benefits of why we want to add Rust to Git in the first place.
Translating C to Rust has been difficult because many styles in Git's
C flat out won't compile in Rust. Many places need a little tweaking,
others need major overhauls. In all of my C cleanups I am keeping
idiomatic Rust in mind.
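One way to picture the aliasing concern in C terms: a local copy of a
field can silently fall out of sync when the field itself is adjusted
afterwards, as xdl_prepare_ctx does with "xdf->changed += 1". The sketch
below is only an illustration with hypothetical names (struct
file_state, alias_goes_stale); C accepts it without complaint, whereas
safe Rust would not allow the field to be modified while a reference
borrowed from it is still in use.

```c
#include <assert.h>
#include <stdlib.h>

struct file_state {
	char *changed;
	long nrec;
};

/* Returns 1 if the early alias no longer matches the field, -1 on OOM. */
static int alias_goes_stale(void)
{
	struct file_state f = { NULL, 3 };
	char *buf = calloc((size_t)f.nrec + 2, 1);
	char *alias;
	int stale;

	if (!buf)
		return -1;
	f.changed = buf;
	alias = f.changed;   /* alias taken before the adjustment */
	f.changed += 1;      /* field now points past the leading sentinel */
	stale = (alias != f.changed);
	free(f.changed - 1); /* undo the offset before freeing */
	return stale;
}
```

Reading through the field at each use avoids this class of mistake, at
the cost of longer expressions.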
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-24 15:18 ` Phillip Wood
@ 2025-09-24 17:29 ` Junio C Hamano
2025-09-25 18:40 ` Ezekiel Newren
1 sibling, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-24 17:29 UTC (permalink / raw)
To: Phillip Wood
Cc: Ezekiel Newren, Ezekiel Newren via GitGitGadget, git,
Elijah Newren, Ben Knoble, Jeff King
Phillip Wood <phillip.wood123@gmail.com> writes:
> Our coding guidelines say not to use "!!x" (I assume we're supposed to
> do "x != 0" instead) but in practice it's pretty common to see it in
> our codebase. I'd maybe try a (bool) cast and see what people say.
Offtopic. I am perfectly fine to remove the "avoid !!x, as it is
too clever and confusing to others" entry from the guidelines. As
we have many of them and it is a fairly well understood idiom in C,
I would imagine that it has become less confusing already since the
entry was written.
> Thanks for cleaning up the xdiff code, it is much appreciated
Hear, hear.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 15:58 ` Ezekiel Newren
@ 2025-09-24 21:31 ` Junio C Hamano
2025-09-24 22:46 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-24 21:31 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> I have a question for everyone: Does preparing C code to be translated
> into Rust count as a valid reason for changing it? Provided that there
> is no violation of the Git style (or very small in some cases).
>
> If my intent was to keep this as C code forever I'd agree, but...
You'd agree that "I am preparing this for eventual rewrite" would be
a valid reason? Or you are agreeing with something else?
> My
> other reason is that it more closely follows Rust paradigms. Creating
> multiple pointers to the same memory in Rust subverts the borrow
> checker's ability to keep track of who owns the memory.
Sure. But looking at the use of rchg[12] in xdl_build_script(), if
they were "const char *", combined with the fact that they are local
and their addresses are never taken (to be leaked to our callers),
you wouldn't have much trouble with the current code, or would you
still have issues?
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 21:31 ` Junio C Hamano
@ 2025-09-24 22:46 ` Ezekiel Newren
2025-09-25 7:09 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-24 22:46 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
On Wed, Sep 24, 2025 at 3:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Ezekiel Newren <ezekielnewren@gmail.com> writes:
>
> > I have a question for everyone: Does preparing C code to be translated
> > into Rust count as a valid reason for changing it? Provided that there
> > is no violation of the Git style (or very small in some cases).
> >
> > If my intent was to keep this as C code forever I'd agree, but...
>
> You'd agree that "I am preparing this for eventual rewrite" would be
> a valid reason? Or you are agreeing with something else?
I'd agree that my reasons for making this change are insufficient. I
think usage tracking tools _is_ a weak argument, but perhaps not quite
as weak as what you're thinking. For example, when I renamed the rchg
field to changed, it was as simple as right-clicking the field,
choosing Rename, typing 'changed', and letting the IDE update every
use. Patch 11/13, "xdiff: rename rchg -> changed in xdfile_t", was
generated directly from that one action. That patch was clean because
I had already gone through and removed all the aliases of that field.
> > My
> > other reason is that it more closely follows Rust paradigms. Creating
> > multiple pointers to the same memory in Rust subverts the borrow
> > checker's ability to keep track of who owns the memory.
>
> Sure. But looking at the use of rchg[12] in xdl_build_script(), if
> they were "const char *", combined with the fact that they are local
> and their addresses are never taken (to be leaked to our callers),
> you wouldn't have much trouble with the current code, or would you
> still have issues?
For xdl_build_script() specifically it would work just fine keeping
the local variable aliasing in. I think this is another case of
personal preference vs established style. Which path would you prefer
that I take?
1. Drop this commit and remember to refactor rchg1, rchg2 to changed1,
and changed2.
2. Keep this commit with reasons like this:
* Refactor churn: Later commits will refactor rchg.
* No additional meaning: The local variables express the same
meaning as the struct field itself. Also, the conditional and the
inner loop is easy enough to follow without using the local aliases to
make the code shorter.
My preference is number 2.
^ permalink raw reply [flat|nested] 158+ messages in thread
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-24 22:46 ` Ezekiel Newren
@ 2025-09-25 7:09 ` Junio C Hamano
2025-09-25 22:02 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Junio C Hamano @ 2025-09-25 7:09 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> I'd agree that my reasons for making this change are insufficient. I
> think usage tracking tools _is_ a weak argument, but perhaps not quite
> as weak as what you're thinking. For example, when I renamed the rchg
> field to changed, it was as simple as right-clicking the field,
> choosing Rename, typing 'changed', and letting the IDE update every
> use. Patch 11/13, "xdiff: rename rchg -> changed in xdfile_t", was
> generated directly from that one action. That patch was clean because
> I had already gone through and removed all the aliases of that field.
If I am reading you correctly, you are describing IDE's syntax-aware
editor's symbol renaming feature; I am not quite sure what it has to
do with "usage tracking", which would be more of static analysis
thing, no?
Surely, IDE makes these symbol renaming easy and that would be one
reason that makes "because we will change this part of the code in
later commit" less relevant, isn't it? Whether the struct member
rchg is accessed directly in the conditional and loop, or is used as
the source of an assignment to a local variable, it needs to be
renamed either way. And with tools, it is not as bad.
>> Sure. But looking at the use of rchg[12] in xdl_build_script(), if
>> they were "const char *", combined with the fact that they are local
>> and their addresses are never taken (to be leaked to our callers),
>> you wouldn't have much trouble with the current code, or would you
>> still have issues?
>
> For xdl_build_script() specifically it would work just fine keeping
> the local variable aliasing in.
And transliterating that directly to Rust would not cause the borrow
issue as you described? Then it would be great. One less thing to
worry about when we need to look at C and then write an equivalent
in Rust.
> 2. Keep this commit with reasons like this:
> * Refactor churn: Later commits will refactor rchg.
> * No additional meaning: The local variables express the same
> meaning as the struct field itself. Also, the conditional and the
> inner loop are easy enough to follow without using the local aliases to
> make the code shorter.
If you have to go route #2, I can live with it, but "No additional
meaning" is _not_ a valid reason to remove aliasing variables.
By definition, a local variable that aliases something like a deeply
nested structure member should *not* introduce any additional
meaning (in other words, if the code modifies that local variable
making it out of sync with the underlying structure, the variable is
not without "additional meaning" and is no longer an alias).
The whole reason I asked you to justify removal of the local
variable on the basis of lack of readability improvement ("the
original is simple enough to read without shorter variables") was
just that. "This variable is merely an alias to something else",
aka "There is no meaning added by the presence of this variable", is
*not* a valid reason to remove it by itself.
So, with the same code but with a better justification like "the
original uses a few local variables to shorten the code, but open
coding the access to underlying members of nested structure without
these local variables is not all that hard to read, so let's do so",
would probably be an acceptable explanation with no need for other
excuses, I would think.
Thanks.
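To make the trade-off discussed above concrete, here is a minimal
sketch of the two styles. The struct and function names
(file_sketch, env_sketch, count_changed_*) are hypothetical stand-ins
invented for illustration; they are not the real xdfile_t/xdfenv_t or
any function in xdiff. Both variants compute the same result, and the
aliases add no meaning of their own; they only shorten the
expressions.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for xdfile_t and xdfenv_t. */
struct file_sketch {
	char *rchg;	/* rchg[i] != 0 means line i is marked changed */
	long nrec;	/* number of entries in rchg */
};

struct env_sketch {
	struct file_sketch xdf1, xdf2;
};

/* Variant A: const local aliases shorten the member accesses. */
static long count_changed_aliased(const struct env_sketch *xe)
{
	const char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
	long i, n = 0;

	for (i = 0; i < xe->xdf1.nrec; i++)
		n += !!rchg1[i];
	for (i = 0; i < xe->xdf2.nrec; i++)
		n += !!rchg2[i];
	return n;
}

/* Variant B: the same loops with the member accesses open-coded. */
static long count_changed_direct(const struct env_sketch *xe)
{
	long i, n = 0;

	for (i = 0; i < xe->xdf1.nrec; i++)
		n += !!xe->xdf1.rchg[i];
	for (i = 0; i < xe->xdf2.nrec; i++)
		n += !!xe->xdf2.rchg[i];
	return n;
}
```

Because the aliases are const, local, and never have their address
taken, they cannot drift out of sync with the struct members, which is
the property that makes them pure aliases.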
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-24 15:18 ` Phillip Wood
2025-09-24 17:29 ` Junio C Hamano
@ 2025-09-25 18:40 ` Ezekiel Newren
2025-09-26 2:29 ` Ezekiel Newren
1 sibling, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-25 18:40 UTC (permalink / raw)
To: phillip.wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Wed, Sep 24, 2025 at 9:18 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 24/09/2025 15:46, Ezekiel Newren wrote:
> > On Wed, Sep 24, 2025 at 4:21 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >>
> >> On 23/09/2025 22:24, Ezekiel Newren via GitGitGadget wrote:
> >>> From: Ezekiel Newren <ezekielnewren@gmail.com>
> >>>
> >>> Rename dis1, dis2 to matches1, matches2.
> >>>
> >>> Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
> >>> matches1 and matches2. These states will influence whether changed[i]
> >>> is set to 1 or kept as 0.
> >>
> >> This message also says what is being changed rather than why it is being
> >> changed. I think the rename here is a good idea but I'm not sure what
> >> "rdis[01]" and "rpdis[01]" are used for and whether they should be
> >> renamed if we're renaming "dis[01]"
> >
> > "Rename dis1, dis2 to matches1, matches2 to give the variable names a
> > more obvious meaning."
> >
> > Would something like that work, or do I need to refine it further?
>
> I'd maybe add a sentence before that to explain that "dis1 and dis2 are
> used to record if a line has zero, one or many matches on the other side
> of the diff". I don't think any of these patches need huge commit
> messages but a couple of sentences explaining the reasoning would be
> helpful for anyone looking at them in the future.
>
> > I
> > would love to rename rdis, rpdis, etc... except that I don't
> > understand what is happening or why. Could someone explain the purpose
> > of these variables?
>
> Good question, I'm not sure anyone has an intimate knowledge of this
> code. My understanding is that the code aims to remove runs of common
> lines when they occur between unique lines in order to reduce the number
> of lines we need to look at when we're calculating the diff. I haven't
> worked through the code in detail though.
I'm really struggling with how to write this commit message. I would
very much appreciate suggestions. Here is what I have so far:
--- commit message start ---
xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
The local variables dis1 and dis2 describe how a line should be treated
based on how many lines, in the other file, match this line. NONE means
the other file does not have any matches to this line. SOME means that
there are more than 0 matches, but less than some heuristic threshold.
TOO_MANY is when there are more matches than that heuristic threshold.
Note: When need_min is true, matches[i] is always set to SOME when the
number of matches is greater than 0.
The names dis1 and dis2 don't convey what they mean, so let's rename
them to matches1 and matches2.
Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
matches1 and matches2. These states will influence whether changed[i]
is set to 1 or kept as 0.
The variables r, rdis0, rpdis0, rdis1, rpdis1 in xdl_clean_mmatch()
have not been renamed because I don't understand their purpose.
--- commit message end ---
I'll explain the parts of the code that are relevant to the commit
message with an example. The following snippet goes through every line
(matches1[i]) of file1 to determine what matches1[i] should be by
looking at the number of times that line shows up in file2.
	if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
		mlim = XDL_MAX_EQLIMIT;
	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
	     i <= xdf1->dend; i++, recs++) {
		rcrec = cf->rcrecs[recs->ha];
		nm = rcrec ? rcrec->len2 : 0;
		matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
	}
The lines:
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
get the number of matches from file2 (i.e. rcrec->len2), and then this line:
matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
is the logic to set matches1[i] (the entry for the current line of
file1) to NONE, SOME, or TOO_MANY.
mlim seems to be a heuristic threshold: xdl_bogosqrt() of the number of
records in the file, capped at the XDL_MAX_EQLIMIT constant, which is
set to 1024.
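The classification described above can be sketched in isolation as
follows. This is only an illustration under stated assumptions:
approx_sqrt_sketch() is a hypothetical stand-in for xdl_bogosqrt()
(its body is assumed, not quoted from this thread),
MAX_EQLIMIT_SKETCH mirrors the value 1024 mentioned for
XDL_MAX_EQLIMIT, and classify_sketch() is an invented helper that
spells out the nested ternary as separate branches.

```c
#include <stdbool.h>

#define NONE 0
#define SOME 1
#define TOO_MANY 2

/* Assumed to mirror XDL_MAX_EQLIMIT (1024 per the message above). */
#define MAX_EQLIMIT_SKETCH 1024

/*
 * Hypothetical stand-in for xdl_bogosqrt(): a rough power-of-two
 * estimate of sqrt(n), computed with shifts only.
 */
static long approx_sqrt_sketch(long n)
{
	long i;

	for (i = 1; n > 0; n >>= 2)
		i <<= 1;
	return i;
}

/* The per-file threshold: the rough square root, capped. */
static long match_limit_sketch(long nrec)
{
	long mlim = approx_sqrt_sketch(nrec);

	return mlim > MAX_EQLIMIT_SKETCH ? MAX_EQLIMIT_SKETCH : mlim;
}

/*
 * nm is the number of matching lines in the other file. When
 * need_min is true the threshold is ignored, so TOO_MANY never
 * occurs.
 */
static int classify_sketch(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return NONE;
	if (nm >= mlim && !need_min)
		return TOO_MANY;
	return SOME;
}
```

Splitting the ternary into three branches does not change the logic;
it only makes the role of need_min easier to see.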
* Re: [PATCH v5 10/13] xdiff: delete rchg aliasing
2025-09-25 7:09 ` Junio C Hamano
@ 2025-09-25 22:02 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-25 22:02 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
Ben Knoble, Jeff King
On Thu, Sep 25, 2025 at 1:09 AM Junio C Hamano <gitster@pobox.com> wrote:
> So, with the same code but with a better justification like "the
> original uses a few local variables to shorten the code, but open
> coding the access to underlying members of nested structure without
> these local variables is not all that hard to read, so let's do so",
> would probably be an acceptable explanation with no need for other
> excuses, I would think.
I've changed my opinion. I'm going to drop this commit because I found
myself using this design pattern in other functions, which would make
me a hypocrite if this commit stays.
* Re: [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
2025-09-25 18:40 ` Ezekiel Newren
@ 2025-09-26 2:29 ` Ezekiel Newren
0 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-26 2:29 UTC (permalink / raw)
To: phillip.wood
Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren, Ben Knoble,
Jeff King
On Thu, Sep 25, 2025 at 12:40 PM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
> I'm really struggling with how to write this commit message. I would
> very much appreciate suggestions. Here is what I have so far:
> --- commit message start ---
> xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
> ...
I think I understand what is happening now. The macros I should be
using are: DISCARD(0), KEEP(1), INVESTIGATE(2). The comments in and
around xdl_cleanup_records() and xdl_clean_mmatch() use the term
"discard", not "discarded". I think DISCARD means: discard this
line from consideration in the diff algorithm, and KEEP means the diff
algorithm is going to look at that line. INVESTIGATE means that the
current line has more matches in the other file than some threshold,
so we need to do some more work to decide if the line should be kept
or discarded.
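As a rough sketch of how I read these three states, the decision for
a single line could be written like this. The helper name
line_is_discarded_sketch() and its investigate_says_discard parameter
are hypothetical; the parameter stands in for whatever the extra
"investigation" step (xdl_clean_mmatch() in the real code) concludes
for that line.

```c
#include <stdbool.h>

#define DISCARD 0
#define KEEP 1
#define INVESTIGATE 2

/*
 * Returns true when the line should be dropped from consideration by
 * the diff algorithm, false when the algorithm should look at it.
 * investigate_says_discard is only consulted for INVESTIGATE.
 */
static bool line_is_discarded_sketch(int action, bool investigate_says_discard)
{
	switch (action) {
	case KEEP:
		return false;
	case INVESTIGATE:
		return investigate_says_discard;
	case DISCARD:
	default:
		/* anything that is not kept ends up discarded */
		return true;
	}
}
```

This is only meant to show that INVESTIGATE is a deferred decision
between the other two states, not a third outcome.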
xdl_cleanup_records() does not belong in xprepare.c because it is only
used by the classic diff (myers/minimal). It should be moved to
xdiffi.c, but it depends on the classifier which is only defined in
xprepare.c. So I can't move xdl_cleanup_records() until more of the
code has been cleaned up.
I'll work on the commit, and hopefully publish the next version soon.
* [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-23 21:24 ` [PATCH v5 00/13] " Ezekiel Newren via GitGitGadget
` (12 preceding siblings ...)
2025-09-23 21:24 ` [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
@ 2025-09-26 22:41 ` Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
` (12 more replies)
13 siblings, 13 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren
Changes since v5.
* Address review feedback on commit messages.
* Drop commit "xdiff: delete rchg aliasing"
* Use DISCARD/KEEP/INVESTIGATE instead of NONE/SOME/TOO_MANY
* Fix the word wrapping in the comments of xprepare.c
Cleanup of the functions xdl_cleanup_records() and xdl_clean_mmatch() is out
of scope for this patch series. The changes to them are incidental to
explaining why 'char rchg' is refactored to 'bool changed'.
Changes since v4.
* Make it clear that the field xdfile_t.rchg (now 'xdfile_t.changed') is
distinct from the local variables dis1, dis2 (now 'matches1',
'matches2').
* Use NONE, SOME, TOO_MANY instead of NO, YES, MAYBE.
* Use bool literals for xdfile_t.changed.
Changes since v3.
* Address review feedback.
* Split the deletion of xdl_get_rec() into 2 commits.
* Move NO, YES, MAYBE into xprepare.c, and use bool literals.
* refactor 'char rchg' to 'bool changed'
Changes since v2.
* No patch changes, just resending to get patch 9 to show up on the mailing
list.
* A few tweaks to the cover letter.
Changes since v1, to address review feedback.
* Only include the clean up patches; The remaining patches will be split
into a separate series.
* Commit message clarifications.
* Minor style cleanups.
* Performance impacts included in commit message of patch 8.
Relevant part of the original cover letter follows:
===================================================
Before:
typedef struct s_xrecord {
struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
unsigned int hbits;
xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
unsigned long *ha;
} xdfile_t;
After cleanup:
typedef struct s_xrecord {
char const *ptr;
long size;
unsigned long ha;
} xrecord_t;
typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
bool *changed;
long *rindex;
long nreff;
} xdfile_t;
===
Ezekiel Newren (12):
xdiff: delete static forward declarations in xprepare
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete superfluous function xdl_get_rec() in xemit
xdiff: delete local variables that alias fields in xrecord_t
xdiff: delete struct diffdata_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete fields ha, line, size in xdlclass_t in favor of an
xrecord_t
xdiff: delete chastore from xdfile_t
xdiff: rename rchg -> changed in xdfile_t
xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c
xdiff: change type of xdfile_t.changed from char to bool
xdiff/xdiffi.c | 102 ++++++-------
xdiff/xdiffi.h | 11 +-
xdiff/xemit.c | 38 ++---
xdiff/xhistogram.c | 10 +-
xdiff/xmerge.c | 56 ++++----
xdiff/xpatience.c | 18 +--
xdiff/xprepare.c | 346 ++++++++++++++++++++-------------------------
xdiff/xtypes.h | 9 +-
xdiff/xutils.c | 16 +--
9 files changed, 269 insertions(+), 337 deletions(-)
base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2048%2Fezekielnewren%2Fuse_rust_types_in_xdiff-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2048/ezekielnewren/use_rust_types_in_xdiff-v6
Pull-Request: https://github.com/git/git/pull/2048
Range-diff vs v5:
1: 890e508000 = 1: 890e508000 xdiff: delete static forward declarations in xprepare
2: 0cfd75b1ff = 2: 0cfd75b1ff xdiff: delete local variables and initialize/free xdfile_t directly
3: 92c81d2ff6 = 3: 92c81d2ff6 xdiff: delete unnecessary fields from xrecord_t and xdfile_t
4: 7d3a7e617c = 4: 7d3a7e617c xdiff: delete superfluous function xdl_get_rec() in xemit
5: 1d550cf308 ! 5: 7a9380328e xdiff: delete superfluous local variables that alias fields in xrecord_t
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: delete superfluous local variables that alias fields in xrecord_t
+ xdiff: delete local variables that alias fields in xrecord_t
Use the type xrecord_t as the local variable for the functions in the
- file xdiff/xemit.c.
+ file xdiff/xemit.c. Most places directly reference the fields inside of
+ this struct, doing that here makes it more consistent with the rest of
+ the code.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
6: 2a3a1b657e ! 6: 6dce41cd3d xdiff: delete struct diffdata_t
@@ Commit message
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
+ I think this struct existed before xdfile_t, and was kept for backward
+ compatibility reasons. I think xdiffi should have been refactored to
+ use the new (xdfile_t) struct, but was easier to alias it instead.
+
+ The local variables rchg* and rindex* don't shorten the lines by much,
+ nor do they really need to be there to make the code more readable.
+ Delete them.
+
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
## xdiff/xdiffi.c ##
7: 4c6543cbe3 = 7: 637d1032ab xdiff: delete redundant array xdfile_t.ha
8: 21bf4b5a20 = 8: 738daab090 xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
9: ef6ae7d29c = 9: 59b00b63b8 xdiff: delete chastore from xdfile_t
10: 7b0856108a < -: ---------- xdiff: delete rchg aliasing
11: 570ab9f898 ! 10: 5702ca6912 xdiff: rename rchg -> changed in xdfile_t
@@ Metadata
## Commit message ##
xdiff: rename rchg -> changed in xdfile_t
+ The field rchg (now 'changed') declares if a line in a file is changed
+ or not. A later commit will change it's type from 'char' to 'bool'
+ to make its purpose even more clear.
+
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
g->start--;
return 0;
-@@ xdiff/xdiffi.c: int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
+@@ xdiff/xdiffi.c: int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
+
+ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
+ xdchange_t *cscr = NULL, *xch;
+- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
++ char *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
+ long i1, i2, l1, l2;
+
+ /*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
-- if (xe->xdf1.rchg[i1 - 1] || xe->xdf2.rchg[i2 - 1]) {
-- for (l1 = i1; xe->xdf1.rchg[i1 - 1]; i1--);
-- for (l2 = i2; xe->xdf2.rchg[i2 - 1]; i2--);
-+ if (xe->xdf1.changed[i1 - 1] || xe->xdf2.changed[i2 - 1]) {
-+ for (l1 = i1; xe->xdf1.changed[i1 - 1]; i1--);
-+ for (l2 = i2; xe->xdf2.changed[i2 - 1]; i2--);
+- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
+- for (l1 = i1; rchg1[i1 - 1]; i1--);
+- for (l2 = i2; rchg2[i2 - 1]; i2--);
++ if (changed1[i1 - 1] || changed2[i2 - 1]) {
++ for (l1 = i1; changed1[i1 - 1]; i1--);
++ for (l2 = i2; changed2[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
12: 08a0fceb72 ! 11: f08782a977 xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
@@ Metadata
Author: Ezekiel Newren <ezekielnewren@gmail.com>
## Commit message ##
- xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
+ xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c
- Rename dis1, dis2 to matches1, matches2.
+ This commit is refactor-only; no behavior is changed. A future commit
+ will use bool literals for changed[i].
- Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
- matches1 and matches2. These states will influence whether changed[i]
- is set to 1 or kept as 0.
+ The functions xdl_clean_mmatch() and xdl_cleanup_records() will be
+ cleaned up more in a future patch series. The changes to
+ xdl_cleanup_records(), in this patch, is just to make it clear why
+ `char rchg` is refactored to `bool changed`.
+
+ Rename dis* to action* and replace literal numericals with macros.
+ The old names came from when dis* (which I think was short for discard)
+ was treated like a boolean, but over time it grew into a ternary state
+ machine. The result was confusing because dis* and rchg* both used 0/1
+ values with different meanings.
+
+ The new names and macros make the states explicit. nm is short for
+ number of matches, and mlim is a heuristic limit:
+
+ nm == 0 -> action[i] = DISCARD -> changed[i] = true
+ 0 < nm < mlim -> action[i] = KEEP -> changed[i] = false
+ nm >= mlim -> action[i] = INVESTIGATE -> changed[i] = xdl_clean_mmatch()
+
+ When need_min is true, only DISCARD and KEEP occur because the limit
+ is effectively infinite.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
@@ xdiff/xprepare.c
#define XDL_GUESS_NLINES1 256
#define XDL_GUESS_NLINES2 20
-+#define NONE 0
-+#define SOME 1
-+#define TOO_MANY 2
++#define DISCARD 0
++#define KEEP 1
++#define INVESTIGATE 2
typedef struct s_xdlclass {
struct s_xdlclass *next;
@@ xdiff/xprepare.c: void xdl_free_env(xdfenv_t *xe) {
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
-+static bool xdl_clean_mmatch(uint8_t const *matches, long i, long s, long e) {
++static bool xdl_clean_mmatch(uint8_t const *action, long i, long s, long e) {
long r, rdis0, rpdis0, rdis1, rpdis1;
/*
- * Limits the window the is examined during the similar-lines
- * scan. The loops below stops when dis[i - r] == 1 (line that
+- * has no match), but there are corner cases where the loop
+- * proceed all the way to the extremities by causing huge
+- * performance penalties in case of big files.
+ * Limits the window that is examined during the similar-lines
-+ * scan. The loops below stops when matches[i - r] == SOME (line that
- * has no match), but there are corner cases where the loop
- * proceed all the way to the extremities by causing huge
- * performance penalties in case of big files.
++ * scan. The loops below stops when action[i - r] == KEEP
++ * (line that has no match), but there are corner cases where
++ * the loop proceed all the way to the extremities by causing
++ * huge performance penalties in case of big files.
+ */
+ if (i - s > XDL_SIMSCAN_WINDOW)
+ s = i - XDL_SIMSCAN_WINDOW;
@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
/*
* Scans the lines before 'i' to find a run of lines that either
- * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
- * Note that we always call this function with dis[i] > 1, so the
-+ * have no match (matches[j] == NONE) or have multiple matches (matches[j] == TOO_MANY).
-+ * Note that we always call this function with matches[i] == TOO_MANY, so the
- * current line (i) is already a multimatch line.
+- * current line (i) is already a multimatch line.
++ * have no match (action[j] == DISCARD) or have multiple matches
++ * (action[j] == INVESTIGATE). Note that we always call this
++ * function with action[i] == INVESTIGATE, so the current line
++ * (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
-+ if (matches[i - r] == NONE)
++ if (action[i - r] == DISCARD)
rdis0++;
- else if (dis[i - r] == 2)
-+ else if (matches[i - r] == TOO_MANY)
++ else if (action[i - r] == INVESTIGATE)
rpdis0++;
- else
-+ else if (matches[i - r] == SOME)
++ else if (action[i - r] == KEEP)
break;
+ else
-+ BUG("Illegal value for matches[i - r]");
++ BUG("Illegal value for action[i - r]");
}
/*
- * If the run before the line 'i' found only multimatch lines, we
+- * If the run before the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
-+ * return false and hence we don't make the current line (i) discarded.
- * We want to discard multimatch lines only when they appear in the
+- * We want to discard multimatch lines only when they appear in the
- * middle of runs with nomatch lines (dis[j] == 0).
-+ * middle of runs with nomatch lines (matches[j] == NONE).
++ * If the run before the line 'i' found only multimatch lines,
++ * we return false and hence we don't make the current line (i)
++ * discarded. We want to discard multimatch lines only when
++ * they appear in the middle of runs with nomatch lines
++ * (action[j] == DISCARD).
*/
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
-+ if (matches[i + r] == NONE)
++ if (action[i + r] == DISCARD)
rdis1++;
- else if (dis[i + r] == 2)
-+ else if (matches[i + r] == TOO_MANY)
++ else if (action[i + r] == INVESTIGATE)
rpdis1++;
- else
-+ else if (matches[i + r] == SOME)
++ else if (action[i + r] == KEEP)
break;
+ else
-+ BUG("Illegal value for matches[i + r]");
++ BUG("Illegal value for action[i + r]");
}
/*
- * If the run after the line 'i' found only multimatch lines, we
+- * If the run after the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
-+ * return false and hence we don't make the current line (i) discarded.
++ * If the run after the line 'i' found only multimatch lines,
++ * we return false and hence we don't make the current line (i)
++ * discarded.
*/
if (rdis1 == 0)
- return 0;
@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, l
xdlclass_t *rcrec;
- char *dis, *dis1, *dis2;
- int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
-+ uint8_t *matches1, *matches2;
-+ int status = 0;
++ uint8_t *action1 = NULL, *action2 = NULL;
+ bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
++ int ret = 0;
- if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
- return -1;
- dis1 = dis;
- dis2 = dis1 + xdf1->nrec + 1;
-+ matches1 = NULL;
-+ matches2 = NULL;
-+
+ /*
+ * Create temporary arrays that will help us decide if
+ * changed[i] should remain 0 or become 1.
+ */
-+ if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
-+ status = -1;
++ if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
++ ret = -1;
+ goto cleanup;
+ }
-+ if (!XDL_CALLOC_ARRAY(matches2, xdf2->nrec + 1)) {
-+ status = -1;
++ if (!XDL_CALLOC_ARRAY(action2, xdf2->nrec + 1)) {
++ ret = -1;
+ goto cleanup;
+ }
+ /*
-+ * Initialize temporary arrays with NONE, SOME, or TOO_MANY.
++ * Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
+ */
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
@@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, l
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
-+ matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
++ action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
-+ matches2[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
++ action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
}
+ /*
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
-+ if (matches1[i] == SOME ||
-+ (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
++ if (action1[i] == KEEP ||
++ (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
-+ /* changed[i] remains 0 */
++ /* changed[i] remains 0, i.e. keep */
} else
xdf1->changed[i] = 1;
++ /* i.e. discard */
}
-@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
+ xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
-+ if (matches2[i] == SOME ||
-+ (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
++ if (action2[i] == KEEP ||
++ (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
-+ /* changed[i] remains 0 */
++ /* changed[i] remains 0, i.e. keep */
} else
xdf2->changed[i] = 1;
++ /* i.e. discard */
}
xdf2->nreff = nreff;
- xdl_free(dis);
+cleanup:
-+ xdl_free(matches1);
-+ xdl_free(matches2);
++ xdl_free(action1);
++ xdl_free(action2);
- return 0;
-+ return status;
++ return ret;
}
13: 975e845bfa ! 12: 83e1ace5bd xdiff: change type of xdfile_t.changed from char to bool
@@ Commit message
xdiff: change type of xdfile_t.changed from char to bool
The only values possible for 'changed' is 1 and 0, which exactly maps
- to a bool type. It might not look like this is the case because
- matches1 and matches2 (which use to be dis1, and dis2) were also char
- and were assigned numerical values within a few lines of 'changed'
- (what used to be rchg).
+ to a bool type. It might not look like this because action1 and action2
+ (which use to be dis1, and dis2) were also of type char and were
+ assigned numerical values within a few lines of 'changed' (what used to
+ be rchg).
- Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
+ Using DISCARD/KEEP/INVESTIGATE for action1[i]/action2[j], and true/false
for changed[k] makes it clear to future readers that these are
logically separate concepts.
@@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
while (xdf->changed[g->start - 1])
g->start--;
+@@ xdiff/xdiffi.c: int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
+
+ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
+ xdchange_t *cscr = NULL, *xch;
+- char *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
++ bool *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
+ long i1, i2, l1, l2;
+
+ /*
## xdiff/xhistogram.c ##
@@ xdiff/xhistogram.c: redo:
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
- * changed[i] should remain 0 or become 1.
+ * changed[i] should remain false, or become true.
*/
- if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
- status = -1;
+ if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
+ ret = -1;
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
*/
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (matches1[i] == SOME ||
- (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
+ if (action1[i] == KEEP ||
+ (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
-- /* changed[i] remains 0 */
-+ /* changed[i] remains false */
+- /* changed[i] remains 0, i.e. keep */
++ /* changed[i] remains false, i.e. keep */
} else
- xdf1->changed[i] = 1;
+ xdf1->changed[i] = true;
+ /* i.e. discard */
}
xdf1->nreff = nreff;
-
@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
- if (matches2[i] == SOME ||
- (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
+ if (action2[i] == KEEP ||
+ (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
-- /* changed[i] remains 0 */
-+ /* changed[i] remains false */
+- /* changed[i] remains 0, i.e. keep */
++ /* changed[i] remains false, i.e. keep */
} else
- xdf2->changed[i] = 1;
+ xdf2->changed[i] = true;
+ /* i.e. discard */
}
xdf2->nreff = nreff;
-
## xdiff/xtypes.h ##
@@ xdiff/xtypes.h: typedef struct s_xdfile {
--
gitgitgadget
* [PATCH v6 01/12] xdiff: delete static forward declarations in xprepare
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
@ 2025-09-26 22:41 ` Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 02/12] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
` (11 subsequent siblings)
12 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
1 file changed, 50 insertions(+), 66 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2d..249bfa678f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
- xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
cf->flags = flags;
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
}
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
- xdfenv_t *xe) {
- long enl1, enl2, sample;
- xdlclassifier_t cf;
-
- memset(&cf, 0, sizeof(cf));
-
- /*
- * For histogram diff, we can afford a smaller sample size and
- * thus a poorer estimate of the number of lines, as the hash
- * table (rhash) won't be filled up/grown. The number of lines
- * (nrecs) will be updated correctly anyway by
- * xdl_prepare_ctx().
- */
- sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
- ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
- enl1 = xdl_guess_lines(mf1, sample) + 1;
- enl2 = xdl_guess_lines(mf2, sample) + 1;
-
- if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
- return -1;
-
- if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
- xdl_free_classifier(&cf);
- return -1;
- }
- if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
- (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
- xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
- xdl_free_ctx(&xe->xdf2);
- xdl_free_ctx(&xe->xdf1);
- xdl_free_classifier(&cf);
- return -1;
- }
-
- xdl_free_classifier(&cf);
-
- return 0;
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
return 0;
}
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+ xdfenv_t *xe) {
+ long enl1, enl2, sample;
+ xdlclassifier_t cf;
+
+ memset(&cf, 0, sizeof(cf));
+
+ /*
+ * For histogram diff, we can afford a smaller sample size and
+ * thus a poorer estimate of the number of lines, as the hash
+ * table (rhash) won't be filled up/grown. The number of lines
+ * (nrecs) will be updated correctly anyway by
+ * xdl_prepare_ctx().
+ */
+ sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+ ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+ enl1 = xdl_guess_lines(mf1, sample) + 1;
+ enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+ if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+ return -1;
+
+ if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+ if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+ (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+ xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+ xdl_free_ctx(&xe->xdf2);
+ xdl_free_ctx(&xe->xdf1);
+ xdl_free_classifier(&cf);
+ return -1;
+ }
+
+ xdl_free_classifier(&cf);
+
+ return 0;
+}
--
gitgitgadget
* [PATCH v6 02/12] xdiff: delete local variables and initialize/free xdfile_t directly
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
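A minimal sketch of the single-cleanup-path idea, assuming a hypothetical struct ctx with two owned arrays (not the real xdfile_t): the owned pointers are set to NULL up front, so one free function can serve both the normal teardown and every abort point.

```c
#include <stdlib.h>

/* Hypothetical stand-in for xdfile_t; only two owned arrays. */
struct ctx {
	long *index;
	char *flags;
};

/* The one place that knows how to release a ctx; free(NULL) is a no-op. */
static void ctx_free(struct ctx *c)
{
	free(c->index);
	free(c->flags);
	c->index = NULL;
	c->flags = NULL;
}

/* Returns 0 on success, -1 on failure; nothing is leaked on failure. */
static int ctx_prepare(struct ctx *c, size_t n)
{
	/* NULL first, so ctx_free() is safe at any abort point. */
	c->index = NULL;
	c->flags = NULL;

	if (!(c->index = malloc(n * sizeof(*c->index))))
		goto abort;
	if (!(c->flags = calloc(n, sizeof(*c->flags))))
		goto abort;
	return 0;

abort:
	ctx_free(c);
	return -1;
}
```

The sketch keeps each stored pointer exactly as the allocator returned it, which is what lets the cleanup function free the fields unconditionally.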
xdiff/xprepare.c | 78 +++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 48 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 249bfa678f..96134c9fbf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,99 +134,81 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
}
+static void xdl_free_ctx(xdfile_t *xdf)
+{
+ xdl_free(xdf->rhash);
+ xdl_free(xdf->rindex);
+ xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->ha);
+ xdl_free(xdf->recs);
+ xdl_cha_free(&xdf->rcha);
+}
+
+
static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
xdlclassifier_t *cf, xdfile_t *xdf) {
- unsigned int hbits;
- long nrec, hsize, bsize;
+ long bsize;
unsigned long hav;
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xrecord_t **recs;
- xrecord_t **rhash;
- unsigned long *ha;
- char *rchg;
- long *rindex;
- ha = NULL;
- rindex = NULL;
- rchg = NULL;
- rhash = NULL;
- recs = NULL;
+ xdf->ha = NULL;
+ xdf->rindex = NULL;
+ xdf->rchg = NULL;
+ xdf->rhash = NULL;
+ xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
goto abort;
- if (!XDL_ALLOC_ARRAY(recs, narec))
+ if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- hbits = xdl_hashbits((unsigned int) narec);
- hsize = 1 << hbits;
- if (!XDL_CALLOC_ARRAY(rhash, hsize))
+ xdf->hbits = xdl_hashbits((unsigned int) narec);
+ if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
goto abort;
- nrec = 0;
+ xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
prev = cur;
hav = xdl_hash_record(&cur, top, xpp->flags);
- if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+ if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
if (!(crec = xdl_cha_alloc(&xdf->rcha)))
goto abort;
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- recs[nrec++] = crec;
- if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+ xdf->recs[xdf->nrec++] = crec;
+ if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
goto abort;
}
}
- if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
- if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+ if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
goto abort;
}
- xdf->nrec = nrec;
- xdf->recs = recs;
- xdf->hbits = hbits;
- xdf->rhash = rhash;
- xdf->rchg = rchg + 1;
- xdf->rindex = rindex;
+ xdf->rchg += 1;
xdf->nreff = 0;
- xdf->ha = ha;
xdf->dstart = 0;
- xdf->dend = nrec - 1;
+ xdf->dend = xdf->nrec - 1;
return 0;
abort:
- xdl_free(ha);
- xdl_free(rindex);
- xdl_free(rchg);
- xdl_free(rhash);
- xdl_free(recs);
- xdl_cha_free(&xdf->rcha);
+ xdl_free_ctx(xdf);
return -1;
}
-static void xdl_free_ctx(xdfile_t *xdf) {
-
- xdl_free(xdf->rhash);
- xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
- xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
-}
-
-
void xdl_free_env(xdfenv_t *xe) {
xdl_free_ctx(&xe->xdf2);
--
gitgitgadget
* [PATCH v6 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 15 ++-------------
xdiff/xtypes.h | 3 ---
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 96134c9fbf..3576415c85 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
}
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
- unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
char const *line;
xdlclass_t *rcrec;
@@ -126,17 +125,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
rec->ha = (unsigned long) rcrec->idx;
- hi = (long) XDL_HASHLONG(rec->ha, hbits);
- rec->next = rhash[hi];
- rhash[hi] = rec;
-
return 0;
}
static void xdl_free_ctx(xdfile_t *xdf)
{
- xdl_free(xdf->rhash);
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->ha);
@@ -155,7 +149,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
- xdf->rhash = NULL;
xdf->recs = NULL;
if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -163,10 +156,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
- xdf->hbits = xdl_hashbits((unsigned int) narec);
- if (!XDL_CALLOC_ARRAY(xdf->rhash, 1 << xdf->hbits))
- goto abort;
-
xdf->nrec = 0;
if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
for (top = blk + bsize; cur < top; ) {
@@ -180,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
crec->size = (long) (cur - prev);
crec->ha = hav;
xdf->recs[xdf->nrec++] = crec;
- if (xdl_classify_record(pass, cf, xdf->rhash, xdf->hbits, crec) < 0)
+ if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436e..8b8467360e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
} chastore_t;
typedef struct s_xrecord {
- struct s_xrecord *next;
char const *ptr;
long size;
unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
typedef struct s_xdfile {
chastore_t rcha;
long nrec;
- unsigned int hbits;
- xrecord_t **rhash;
long dstart, dend;
xrecord_t **recs;
char *rchg;
--
gitgitgadget
* [PATCH v6 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When xrecord_t was a linked list and recs didn't exist, I assume this
function walked the list until it found the right record. Accessing
a contiguous array is so trivial that this function is now superfluous.
Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
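To illustrate the contrast assumed above, here is a sketch with a hypothetical struct rec (this is not historical xdiff code): the list form needs a helper that walks to the record, while the array form is a single index expression.

```c
#include <stddef.h>

/* Hypothetical record type, loosely modelled on xrecord_t. */
struct rec {
	struct rec *next;   /* only needed for the list form */
	const char *ptr;
	long size;
};

/* List form: reaching record ri costs O(ri) pointer hops. */
static const struct rec *list_get(const struct rec *head, long ri)
{
	while (head && ri-- > 0)
		head = head->next;
	return head;   /* NULL if the list is shorter than ri + 1 */
}

/* Array form: the lookup is one index, so a helper adds nothing. */
static long array_size(struct rec *const *recs, long ri)
{
	return recs[ri]->size;
}
```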
xdiff/xemit.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb40..40fc8154f3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -22,23 +22,14 @@
#include "xinclude.h"
-static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
-
- *rec = xdf->recs[ri]->ptr;
-
- return xdf->recs[ri]->size;
-}
-
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
long size, psize = strlen(pre);
- char const *rec;
-
- size = xdl_get_rec(xdf, ri, &rec);
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+ char const *rec = xdf->recs[ri]->ptr;
+ size = xdf->recs[ri]->size;
+ if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
return -1;
- }
return 0;
}
@@ -120,8 +111,8 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
if (!xecfg->find_func)
return def_ff(rec, len, buf, sz);
return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
@@ -160,8 +151,8 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec;
- long len = xdl_get_rec(xdf, ri, &rec);
+ const char *rec = xdf->recs[ri]->ptr;
+ long len = xdf->recs[ri]->size;
while (len > 0 && XDL_ISSPACE(*rec)) {
rec++;
--
gitgitgadget
* [PATCH v6 05/12] xdiff: delete local variables that alias fields in xrecord_t
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Use a local xrecord_t pointer in the functions in xdiff/xemit.c instead
of separate local variables that alias its fields. Most places reference
the fields of this struct directly, so doing the same here makes the file
more consistent with the rest of the code.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
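A standalone sketch of the index-based emptiness check, assuming a hypothetical struct rec that carries only the ptr/size pair and using isspace() as a stand-in for XDL_ISSPACE: the record is never modified and no aliasing locals are needed.

```c
#include <ctype.h>

/* Hypothetical record, mirroring only the ptr/size pair of xrecord_t. */
struct rec {
	const char *ptr;
	long size;
};

/* Returns 1 when the record holds only whitespace (or nothing at all). */
static int rec_is_empty(const struct rec *rec)
{
	long i = 0;

	/* The cast keeps isspace() well defined for negative char values. */
	while (i < rec->size && isspace((unsigned char)rec->ptr[i]))
		i++;
	return i == rec->size;
}
```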
xdiff/xemit.c | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 40fc8154f3..2161ac3cd0 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -23,12 +23,11 @@
#include "xinclude.h"
-static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
- long size, psize = strlen(pre);
- char const *rec = xdf->recs[ri]->ptr;
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
+{
+ xrecord_t *rec = xdf->recs[ri];
- size = xdf->recs[ri]->size;
- if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0)
+ if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
return 0;
@@ -111,11 +110,11 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+
if (!xecfg->find_func)
- return def_ff(rec, len, buf, sz);
- return xecfg->find_func(rec, len, buf, sz, xecfg->find_func_priv);
+ return def_ff(rec->ptr, rec->size, buf, sz);
+ return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
}
static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -151,14 +150,12 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- const char *rec = xdf->recs[ri]->ptr;
- long len = xdf->recs[ri]->size;
+ xrecord_t *rec = xdf->recs[ri];
+ long i = 0;
- while (len > 0 && XDL_ISSPACE(*rec)) {
- rec++;
- len--;
- }
- return !len;
+ for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
+
+ return i == rec->size;
}
int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
--
gitgitgadget
* [PATCH v6 06/12] xdiff: delete struct diffdata_t
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
I think this struct existed before xdfile_t, and was kept for backward
compatibility reasons. I think xdiffi should have been refactored to
use the new (xdfile_t) struct, but it was easier to alias it instead.
The local variables rchg* and rindex* don't shorten the lines by much,
nor do they really need to be there to make the code more readable.
Delete them.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 32 ++++++++------------------------
xdiff/xdiffi.h | 11 ++---------
2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfb..bbf0161f84 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -257,10 +257,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
* sub-boxes by calling the box splitting function. Note that the real job
* (marking changed lines) is done in the two boundary reaching checks.
*/
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+ unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,17 +273,11 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
* be obviously changed.
*/
if (off1 == lim1) {
- char *rchg2 = dd2->rchg;
- long *rindex2 = dd2->rindex;
-
for (; off2 < lim2; off2++)
- rchg2[rindex2[off2]] = 1;
+ xdf2->rchg[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
- char *rchg1 = dd1->rchg;
- long *rindex1 = dd1->rindex;
-
for (; off1 < lim1; off1++)
- rchg1[rindex1[off1]] = 1;
+ xdf1->rchg[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -300,9 +294,9 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
/*
* ... et Impera.
*/
- if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+ if (xdl_recs_cmp(xdf1, off1, spl.i1, xdf2, off2, spl.i2,
kvdf, kvdb, spl.min_lo, xenv) < 0 ||
- xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+ xdl_recs_cmp(xdf1, spl.i1, lim1, xdf2, spl.i2, lim2,
kvdf, kvdb, spl.min_hi, xenv) < 0) {
return -1;
@@ -318,7 +312,6 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
long ndiags;
long *kvd, *kvdf, *kvdb;
xdalgoenv_t xenv;
- diffdata_t dd1, dd2;
int res;
if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0)
@@ -357,16 +350,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xenv.snake_cnt = XDL_SNAKE_CNT;
xenv.heur_min = XDL_HEUR_MIN_COST;
- dd1.nrec = xe->xdf1.nreff;
- dd1.ha = xe->xdf1.ha;
- dd1.rchg = xe->xdf1.rchg;
- dd1.rindex = xe->xdf1.rindex;
- dd2.nrec = xe->xdf2.nreff;
- dd2.ha = xe->xdf2.ha;
- dd2.rchg = xe->xdf2.rchg;
- dd2.rindex = xe->xdf2.rindex;
-
- res = xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+ res = xdl_recs_cmp(&xe->xdf1, 0, xe->xdf1.nreff, &xe->xdf2, 0, xe->xdf2.nreff,
kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0,
&xenv);
xdl_free(kvd);
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4..49e52c67f9 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -24,13 +24,6 @@
#define XDIFFI_H
-typedef struct s_diffdata {
- long nrec;
- unsigned long const *ha;
- long *rindex;
- char *rchg;
-} diffdata_t;
-
typedef struct s_xdalgoenv {
long mxcost;
long snake_cnt;
@@ -46,8 +39,8 @@ typedef struct s_xdchange {
-int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
- diffdata_t *dd2, long off2, long lim2,
+int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
xdfenv_t *xe);
--
gitgitgadget
* [PATCH v6 07/12] xdiff: delete redundant array xdfile_t.ha
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]->ha
Deleting ha makes the code about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
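A small sketch of the invariant, using hypothetical, trimmed-down versions of xrecord_t and xdfile_t: the cached ha[] array can be replaced by an accessor that goes through rindex and recs, at the cost of a chain of dependent loads instead of a single one (presumably, though not measured here, the source of the slowdown).

```c
/* Hypothetical, trimmed-down versions of xrecord_t and xdfile_t. */
struct rec {
	unsigned long ha;
};

struct file {
	struct rec **recs;   /* all records of the file */
	long *rindex;        /* indices of the records kept for the diff */
	long nreff;          /* number of valid entries in rindex */
};

/* Equivalent to reading a cached ha[i], valid for 0 <= i < f->nreff. */
static unsigned long get_hash(const struct file *f, long i)
{
	return f->recs[f->rindex[i]]->ha;
}
```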
xdiff/xdiffi.c | 24 ++++++++++++++----------
xdiff/xprepare.c | 12 ++----------
xdiff/xtypes.h | 1 -
3 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bbf0161f84..11cd090b53 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,6 +22,11 @@
#include "xinclude.h"
+static unsigned long get_hash(xdfile_t *xdf, long index)
+{
+ return xdf->recs[xdf->rindex[index]]->ha;
+}
+
#define XDL_MAX_COST_MIN 256
#define XDL_HEUR_MIN_COST 256
#define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1)
@@ -42,8 +47,8 @@ typedef struct s_xdpsplit {
* using this algorithm, so a little bit of heuristic is needed to cut the
* search and to return a suboptimal point.
*/
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
- unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(xdfile_t *xdf1, long off1, long lim1,
+ xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
xdalgoenv_t *xenv) {
long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -87,7 +92,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdf[d + 1];
prev1 = i1;
i2 = i1 - d;
- for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+ for (; i1 < lim1 && i2 < lim2 && get_hash(xdf1, i1) == get_hash(xdf2, i2); i1++, i2++);
if (i1 - prev1 > xenv->snake_cnt)
got_snake = 1;
kvdf[d] = i1;
@@ -124,7 +129,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
i1 = kvdb[d + 1] - 1;
prev1 = i1;
i2 = i1 - d;
- for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+ for (; i1 > off1 && i2 > off2 && get_hash(xdf1, i1 - 1) == get_hash(xdf2, i2 - 1); i1--, i2--);
if (prev1 - i1 > xenv->snake_cnt)
got_snake = 1;
kvdb[d] = i1;
@@ -159,7 +164,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
- for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+ for (k = 1; get_hash(xdf1, i1 - k) == get_hash(xdf2, i2 - k); k++)
if (k == xenv->snake_cnt) {
best = v;
spl->i1 = i1;
@@ -183,7 +188,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
if (v > XDL_K_HEUR * ec && v > best &&
off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
- for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+ for (k = 0; get_hash(xdf1, i1 + k) == get_hash(xdf2, i2 + k); k++)
if (k == xenv->snake_cnt - 1) {
best = v;
spl->i1 = i1;
@@ -260,13 +265,12 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
xdfile_t *xdf2, long off2, long lim2,
long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
- unsigned long const *ha1 = xdf1->ha, *ha2 = xdf2->ha;
/*
* Shrink the box by walking through each diagonal snake (SW and NE).
*/
- for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
- for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, off1) == get_hash(xdf2, off2); off1++, off2++);
+ for (; off1 < lim1 && off2 < lim2 && get_hash(xdf1, lim1 - 1) == get_hash(xdf2, lim2 - 1); lim1--, lim2--);
/*
* If one dimension is empty, then all records on the other one must
@@ -285,7 +289,7 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
/*
* Divide ...
*/
- if (xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+ if (xdl_split(xdf1, off1, lim1, xdf2, off2, lim2, kvdf, kvdb,
need_min, &spl, xenv) < 0) {
return -1;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 3576415c85..22c44f0683 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -133,7 +133,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
- xdl_free(xdf->ha);
xdl_free(xdf->recs);
xdl_cha_free(&xdf->rcha);
}
@@ -146,7 +145,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
char const *blk, *cur, *top, *prev;
xrecord_t *crec;
- xdf->ha = NULL;
xdf->rindex = NULL;
xdf->rchg = NULL;
xdf->recs = NULL;
@@ -181,8 +179,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
(XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
goto abort;
- if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
- goto abort;
}
xdf->rchg += 1;
@@ -300,9 +296,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf1->dend; i++, recs++) {
if (dis1[i] == 1 ||
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
- xdf1->rindex[nreff] = i;
- xdf1->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf1->rindex[nreff++] = i;
} else
xdf1->rchg[i] = 1;
}
@@ -312,9 +306,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
i <= xdf2->dend; i++, recs++) {
if (dis2[i] == 1 ||
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
- xdf2->rindex[nreff] = i;
- xdf2->ha[nreff] = (*recs)->ha;
- nreff++;
+ xdf2->rindex[nreff++] = i;
} else
xdf2->rchg[i] = 1;
}
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360e..85848f1685 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -52,7 +52,6 @@ typedef struct s_xdfile {
char *rchg;
long *rindex;
long nreff;
- unsigned long *ha;
} xdfile_t;
typedef struct s_xdfenv {
--
gitgitgadget
* [PATCH v6 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The fields from xdlclass_t are aliases of xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
xdlclass_t carries a copy of the data in xrecord_t, but instead of
embedding xrecord_t it duplicates the individual fields. A future
commit will change the types used in xrecord_t, so embed it in
xdlclass_t first; that way we don't have to remember to change the
types here as well.
Best-viewed-with: --color-words
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
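A sketch of why embedding helps, with hypothetical stand-ins for xrecord_t and xdlclass_t: a single struct assignment copies every field, so a later change to the field types of the record needs no matching edit in the classifier entry.

```c
/* Hypothetical stand-ins for xrecord_t and xdlclass_t. */
struct rec {
	const char *ptr;
	long size;
	unsigned long ha;
};

struct rec_class {
	struct rec rec;   /* embedded copy; follows any change to struct rec */
	long idx;
};

/* One struct assignment copies every field, whatever its type is. */
static void class_set(struct rec_class *c, const struct rec *r, long idx)
{
	c->rec = *r;
	c->idx = idx;
}
```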
xdiff/xprepare.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 22c44f0683..e6e2c0e1c0 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,9 +32,7 @@
typedef struct s_xdlclass {
struct s_xdlclass *next;
- unsigned long ha;
- char const *line;
- long size;
+ xrecord_t rec;
long idx;
long len1, len2;
} xdlclass_t;
@@ -93,14 +91,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
long hi;
- char const *line;
xdlclass_t *rcrec;
- line = rec->ptr;
hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
- if (rcrec->ha == rec->ha &&
- xdl_recmatch(rcrec->line, rcrec->size,
+ if (rcrec->rec.ha == rec->ha &&
+ xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
rec->ptr, rec->size, cf->flags))
break;
@@ -113,9 +109,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
return -1;
cf->rcrecs[rcrec->idx] = rcrec;
- rcrec->line = line;
- rcrec->size = rec->size;
- rcrec->ha = rec->ha;
+ rcrec->rec = *rec;
rcrec->len1 = rcrec->len2 = 0;
rcrec->next = cf->rchash[hi];
cf->rchash[hi] = rcrec;
--
gitgitgadget
* [PATCH v6 09/12] xdiff: delete chastore from xdfile_t
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
xdfile_t currently uses chastore_t, which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added, I think xrecord_t.next should have
been removed, but that was overlooked. This dual data structure setup
makes the code somewhat confusing.
Additionally the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
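To illustrate the layout change, here is a minimal sketch, not the xdiff
code itself: the records live directly in one contiguous array that is
grown with realloc(), instead of being allocated from an arena and
referenced through an array of pointers. The names struct rec,
struct rec_array, rec_array_push() and rec_array_free() are hypothetical;
xdiff itself uses XDL_ALLOC_GROW for the growth step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for xrecord_t after the cleanup. */
struct rec {
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Hypothetical growable array of records stored by value. */
struct rec_array {
	struct rec *recs;
	size_t nr, alloc;
};

/*
 * Append one record, doubling the capacity when the array is full.
 * Returns 0 on success and -1 on allocation failure. Overflow checks
 * on the size computation are omitted to keep the sketch short.
 */
static int rec_array_push(struct rec_array *a, char const *ptr,
			  long size, unsigned long ha)
{
	assert(a != NULL);
	if (a->nr == a->alloc) {
		size_t n = a->alloc ? a->alloc * 2 : 16;
		struct rec *p = realloc(a->recs, n * sizeof(*p));
		if (!p)
			return -1;
		a->recs = p;
		a->alloc = n;
	}
	a->recs[a->nr].ptr = ptr;
	a->recs[a->nr].size = size;
	a->recs[a->nr].ha = ha;
	a->nr++;
	return 0;
}

/* Release the array and reset it to the empty state. */
static void rec_array_free(struct rec_array *a)
{
	free(a->recs);
	a->recs = NULL;
	a->nr = a->alloc = 0;
}
```

One consequence of this layout is that a grow may move the whole array,
so a pointer to an element is only valid until the next push, while an
index stays valid. That is presumably why it matters that xdlclass_t
keeps a copy of the record (rcrec->rec = *rec) rather than a pointer
to it.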
Performance impact of deleting fields from xdfile_t:
Deleting ha makes the benchmark about 5% slower.
Deleting rcha (the chastore) makes it about 5% faster.
Deleting both makes it about 1% slower.
Delete ha, but keep cha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete cha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
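The percentages quoted for these runs can be reproduced from the
reported means. The helper below is only an illustration of that
arithmetic; relative_change_percent() is a hypothetical name and is not
part of hyperfine or git.

```c
#include <assert.h>

/*
 * Relative change of `candidate` against `baseline`, in percent.
 * A positive result means the candidate build is slower.
 */
static double relative_change_percent(double baseline, double candidate)
{
	assert(baseline > 0.0);
	return (candidate - baseline) / baseline * 100.0;
}
```

Plugging in the means above, each compared against the baseline measured
in the same hyperfine run: deleting ha gives (1.339 - 1.269) / 1.269,
about +5.5%; deleting the chastore gives (1.232 - 1.290) / 1.290, about
-4.5%; deleting both gives (1.306 - 1.291) / 1.291, about +1.2%.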
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 24 ++++++++++----------
xdiff/xemit.c | 6 ++---
xdiff/xhistogram.c | 2 +-
xdiff/xmerge.c | 56 +++++++++++++++++++++++-----------------------
xdiff/xpatience.c | 10 ++++-----
xdiff/xprepare.c | 19 ++++++----------
xdiff/xtypes.h | 3 +--
xdiff/xutils.c | 12 +++++-----
8 files changed, 63 insertions(+), 69 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 11cd090b53..a66125d44a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
static unsigned long get_hash(xdfile_t *xdf, long index)
{
- return xdf->recs[xdf->rindex[index]]->ha;
+ return xdf->recs[xdf->rindex[index]].ha;
}
#define XDL_MAX_COST_MIN 256
@@ -489,13 +489,13 @@ static void measure_split(const xdfile_t *xdf, long split,
m->indent = -1;
} else {
m->end_of_file = 0;
- m->indent = get_indent(xdf->recs[split]);
+ m->indent = get_indent(&xdf->recs[split]);
}
m->pre_blank = 0;
m->pre_indent = -1;
for (i = split - 1; i >= 0; i--) {
- m->pre_indent = get_indent(xdf->recs[i]);
+ m->pre_indent = get_indent(&xdf->recs[i]);
if (m->pre_indent != -1)
break;
m->pre_blank += 1;
@@ -508,7 +508,7 @@ static void measure_split(const xdfile_t *xdf, long split,
m->post_blank = 0;
m->post_indent = -1;
for (i = split + 1; i < xdf->nrec; i++) {
- m->post_indent = get_indent(xdf->recs[i]);
+ m->post_indent = get_indent(&xdf->recs[i]);
if (m->post_indent != -1)
break;
m->post_blank += 1;
@@ -752,7 +752,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
- recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+ recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
xdf->rchg[g->start++] = 0;
xdf->rchg[g->end++] = 1;
@@ -773,7 +773,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
- recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+ recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
xdf->rchg[--g->start] = 1;
xdf->rchg[--g->end] = 0;
@@ -988,16 +988,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
for (xch = xscr; xch; xch = xch->next) {
int ignore = 1;
- xrecord_t **rec;
+ xrecord_t *rec;
long i;
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+ ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
xch->ignore = ignore;
}
@@ -1021,7 +1021,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
xdchange_t *xch;
for (xch = xscr; xch; xch = xch->next) {
- xrecord_t **rec;
+ xrecord_t *rec;
int ignore = 1;
long i;
@@ -1033,11 +1033,11 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
rec = &xe->xdf1.recs[xch->i1];
for (i = 0; i < xch->chg1 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
rec = &xe->xdf2.recs[xch->i2];
for (i = 0; i < xch->chg2 && ignore; i++)
- ignore = record_matches_regex(rec[i], xpp);
+ ignore = record_matches_regex(&rec[i], xpp);
xch->ignore = ignore;
}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 2161ac3cd0..b2f1f30cd3 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -25,7 +25,7 @@
static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
return -1;
@@ -110,7 +110,7 @@ static long def_ff(const char *rec, long len, char *buf, long sz)
static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
char *buf, long sz)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
if (!xecfg->find_func)
return def_ff(rec->ptr, rec->size, buf, sz);
@@ -150,7 +150,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
static int is_empty_rec(xdfile_t *xdf, long ri)
{
- xrecord_t *rec = xdf->recs[ri];
+ xrecord_t *rec = &xdf->recs[ri];
long i = 0;
for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc..4d857e8ae2 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
((LINE_MAP(index, ptr))->cnt)
#define REC(env, s, l) \
- (env->xdf##s.recs[l - 1])
+ (&env->xdf##s.recs[l - 1])
static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
{
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b..fd600cbb5d 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
int line_count, long flags)
{
int i;
- xrecord_t **rec1 = xe1->xdf2.recs + i1;
- xrecord_t **rec2 = xe2->xdf2.recs + i2;
+ xrecord_t *rec1 = xe1->xdf2.recs + i1;
+ xrecord_t *rec2 = xe2->xdf2.recs + i2;
for (i = 0; i < line_count; i++) {
- int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
- rec2[i]->ptr, rec2[i]->size, flags);
+ int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
+ rec2[i].ptr, rec2[i].size, flags);
if (!result)
return -1;
}
@@ -111,7 +111,7 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
{
- xrecord_t **recs;
+ xrecord_t *recs;
int size = 0;
recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
@@ -119,12 +119,12 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
if (count < 1)
return 0;
- for (i = 0; i < count; size += recs[i++]->size)
+ for (i = 0; i < count; size += recs[i++].size)
if (dest)
- memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+ memcpy(dest + size, recs[i].ptr, recs[i].size);
if (add_nl) {
- i = recs[count - 1]->size;
- if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+ i = recs[count - 1].size;
+ if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
if (needs_cr) {
if (dest)
dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
if (i < file->nrec - 1)
/* All lines before the last *must* end in LF */
- return (size = file->recs[i]->size) > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ return (size = file->recs[i].size) > 1 &&
+ file->recs[i].ptr[size - 2] == '\r';
if (!file->nrec)
/* Cannot determine eol style from empty file */
return -1;
- if ((size = file->recs[i]->size) &&
- file->recs[i]->ptr[size - 1] == '\n')
+ if ((size = file->recs[i].size) &&
+ file->recs[i].ptr[size - 1] == '\n')
/* Last line; ends in LF; Is it CR/LF? */
return size > 1 &&
- file->recs[i]->ptr[size - 2] == '\r';
+ file->recs[i].ptr[size - 2] == '\r';
if (!i)
/* The only line has no eol */
return -1;
/* Determine eol from second-to-last line */
- return (size = file->recs[i - 1]->size) > 1 &&
- file->recs[i - 1]->ptr[size - 2] == '\r';
+ return (size = file->recs[i - 1].size) > 1 &&
+ file->recs[i - 1].ptr[size - 2] == '\r';
}
static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
xpparam_t const *xpp)
{
- xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+ xrecord_t *rec1 = xe1->xdf2.recs, *rec2 = xe2->xdf2.recs;
for (; m; m = m->next) {
/* let's handle just the conflicts */
if (m->mode)
continue;
while(m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+ recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
m->chg1--;
m->chg2--;
m->i1++;
m->i2++;
}
while (m->chg1 && m->chg2 &&
- recmatch(rec1[m->i1 + m->chg1 - 1],
- rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+ recmatch(&rec1[m->i1 + m->chg1 - 1],
+ &rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
m->chg1--;
m->chg2--;
}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
* This probably does not work outside git, since
* we have a very simple mmfile structure.
*/
- t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
- t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
- + xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
- t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
- t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
- + xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+ t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
+ t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+ + xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
+ t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
+ t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+ + xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
return -1;
if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
{
for (; chg; chg--, i++)
- if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
- xe->xdf2.recs[i]->size))
+ if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+ xe->xdf2.recs[i].size))
return 1;
return 0;
}
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d19..bf69a58527 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
int pass)
{
- xrecord_t **records = pass == 1 ?
+ xrecord_t *records = pass == 1 ?
map->env->xdf1.recs : map->env->xdf2.recs;
- xrecord_t *record = records[line - 1];
+ xrecord_t *record = &records[line - 1];
/*
* After xdl_prepare_env() (or more precisely, due to
* xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
- map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+ map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -246,8 +246,8 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
static int match(struct hashmap *map, int line1, int line2)
{
- xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
- xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
+ xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
+ xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
return record1->ha == record2->ha;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e6e2c0e1c0..27c5a4d636 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -128,7 +128,6 @@ static void xdl_free_ctx(xdfile_t *xdf)
xdl_free(xdf->rindex);
xdl_free(xdf->rchg - 1);
xdl_free(xdf->recs);
- xdl_cha_free(&xdf->rcha);
}
@@ -143,8 +142,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xdf->rchg = NULL;
xdf->recs = NULL;
- if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
- goto abort;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
goto abort;
@@ -155,12 +152,10 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
hav = xdl_hash_record(&cur, top, xpp->flags);
if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
goto abort;
- if (!(crec = xdl_cha_alloc(&xdf->rcha)))
- goto abort;
+ crec = &xdf->recs[xdf->nrec++];
crec->ptr = prev;
crec->size = (long) (cur - prev);
crec->ha = hav;
- xdf->recs[xdf->nrec++] = crec;
if (xdl_classify_record(pass, cf, crec) < 0)
goto abort;
}
@@ -260,7 +255,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
- xrecord_t **recs;
+ xrecord_t *recs;
xdlclass_t *rcrec;
char *dis, *dis1, *dis2;
int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -273,7 +268,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -281,7 +276,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
- rcrec = cf->rcrecs[(*recs)->ha];
+ rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
}
@@ -317,13 +312,13 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
*/
static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
long i, lim;
- xrecord_t **recs1, **recs2;
+ xrecord_t *recs1, *recs2;
recs1 = xdf1->recs;
recs2 = xdf2->recs;
for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
i++, recs1++, recs2++)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dstart = xdf2->dstart = i;
@@ -331,7 +326,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
recs1 = xdf1->recs + xdf1->nrec - 1;
recs2 = xdf2->recs + xdf2->nrec - 1;
for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
- if ((*recs1)->ha != (*recs2)->ha)
+ if (recs1->ha != recs2->ha)
break;
xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 85848f1685..3d26cbf1ec 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -45,10 +45,9 @@ typedef struct s_xrecord {
} xrecord_t;
typedef struct s_xdfile {
- chastore_t rcha;
+ xrecord_t *recs;
long nrec;
long dstart, dend;
- xrecord_t **recs;
char *rchg;
long *rindex;
long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..332982b509 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
mmfile_t subfile1, subfile2;
xdfenv_t env;
- subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
- subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
- diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
- subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
- subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
- diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+ subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
+ subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+ diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
+ subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
+ subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+ diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
--
gitgitgadget
* [PATCH v6 10/12] xdiff: rename rchg -> changed in xdfile_t
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
` (8 preceding siblings ...)
2025-09-26 22:41 ` [PATCH v6 09/12] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-26 22:41 ` Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 11/12] xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c Ezekiel Newren via GitGitGadget
` (2 subsequent siblings)
12 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The field rchg (now 'changed') records whether a line in a file is
changed. A later commit will change its type from 'char' to 'bool'
to make its purpose even clearer.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 30 +++++++++++++++---------------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 12 ++++++------
xdiff/xtypes.h | 2 +-
xdiff/xutils.c | 4 ++--
6 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index a66125d44a..bd5b31c664 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->rchg[xdf2->rindex[off2]] = 1;
+ xdf2->changed[xdf2->rindex[off2]] = 1;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->rchg[xdf1->rindex[off1]] = 1;
+ xdf1->changed[xdf1->rindex[off1]] = 1;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -708,7 +708,7 @@ struct xdlgroup {
static void group_init(xdfile_t *xdf, struct xdlgroup *g)
{
g->start = g->end = 0;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
}
@@ -722,7 +722,7 @@ static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->start = g->end + 1;
- for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+ for (g->end = g->start; xdf->changed[g->end]; g->end++)
;
return 0;
@@ -738,7 +738,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
return -1;
g->end = g->start - 1;
- for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+ for (g->start = g->end; xdf->changed[g->start - 1]; g->start--)
;
return 0;
@@ -753,10 +753,10 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->rchg[g->start++] = 0;
- xdf->rchg[g->end++] = 1;
+ xdf->changed[g->start++] = 0;
+ xdf->changed[g->end++] = 1;
- while (xdf->rchg[g->end])
+ while (xdf->changed[g->end])
g->end++;
return 0;
@@ -774,10 +774,10 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->rchg[--g->start] = 1;
- xdf->rchg[--g->end] = 0;
+ xdf->changed[--g->start] = 1;
+ xdf->changed[--g->end] = 0;
- while (xdf->rchg[g->start - 1])
+ while (xdf->changed[g->start - 1])
g->start--;
return 0;
@@ -932,16 +932,16 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
+ char *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
long i1, i2, l1, l2;
/*
* Trivial. Collects "groups" of changes and creates an edit script.
*/
for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
- if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
- for (l1 = i1; rchg1[i1 - 1]; i1--);
- for (l2 = i2; rchg2[i2 - 1]; i2--);
+ if (changed1[i1 - 1] || changed2[i2 - 1]) {
+ for (l1 = i1; changed1[i1 - 1]; i1--);
+ for (l2 = i2; changed2[i2 - 1]; i2--);
if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
xdl_free_script(cscr);
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 4d857e8ae2..15ca15f6b0 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
while (count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bf69a58527..14092ffb86 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.rchg[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = 1;
while(count2--)
- env->xdf2.rchg[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = 1;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 27c5a4d636..b9b19c36de 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -126,7 +126,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
static void xdl_free_ctx(xdfile_t *xdf)
{
xdl_free(xdf->rindex);
- xdl_free(xdf->rchg - 1);
+ xdl_free(xdf->changed - 1);
xdl_free(xdf->recs);
}
@@ -139,7 +139,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
xrecord_t *crec;
xdf->rindex = NULL;
- xdf->rchg = NULL;
+ xdf->changed = NULL;
xdf->recs = NULL;
if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
@@ -161,7 +161,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
}
}
- if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
+ if (!XDL_CALLOC_ARRAY(xdf->changed, xdf->nrec + 2))
goto abort;
if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
@@ -170,7 +170,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
goto abort;
}
- xdf->rchg += 1;
+ xdf->changed += 1;
xdf->nreff = 0;
xdf->dstart = 0;
xdf->dend = xdf->nrec - 1;
@@ -287,7 +287,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
} else
- xdf1->rchg[i] = 1;
+ xdf1->changed[i] = 1;
}
xdf1->nreff = nreff;
@@ -297,7 +297,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
(dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
} else
- xdf2->rchg[i] = 1;
+ xdf2->changed[i] = 1;
}
xdf2->nreff = nreff;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 3d26cbf1ec..c4b5d2d8fa 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,7 @@ typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
- char *rchg;
+ char *changed;
long *rindex;
long nreff;
} xdfile_t;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 332982b509..ed65c222e6 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -425,8 +425,8 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
return -1;
- memcpy(diff_env->xdf1.rchg + line1 - 1, env.xdf1.rchg, count1);
- memcpy(diff_env->xdf2.rchg + line2 - 1, env.xdf2.rchg, count2);
+ memcpy(diff_env->xdf1.changed + line1 - 1, env.xdf1.changed, count1);
+ memcpy(diff_env->xdf2.changed + line2 - 1, env.xdf2.changed, count2);
xdl_free_env(&env);
--
gitgitgadget
* [PATCH v6 11/12] xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
` (9 preceding siblings ...)
2025-09-26 22:41 ` [PATCH v6 10/12] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-09-26 22:41 ` Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 12/12] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
2025-10-03 13:47 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Phillip Wood
12 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
This commit is refactor-only; no behavior is changed. A future commit
will use bool literals for changed[i].
The functions xdl_clean_mmatch() and xdl_cleanup_records() will be
cleaned up more in a future patch series. The changes to
xdl_cleanup_records() in this patch are just to make it clear why
`char rchg` is refactored to `bool changed`.
Rename dis* to action* and replace numeric literals with macros.
The old names came from when dis* (which I think was short for discard)
was treated like a boolean, but over time it grew into a ternary state
machine. The result was confusing because dis* and rchg* both used 0/1
values with different meanings.
The new names and macros make the states explicit. nm is short for
number of matches, and mlim is a heuristic limit:
    nm == 0        -> action[i] = DISCARD     -> changed[i] = true
    0 < nm < mlim  -> action[i] = KEEP        -> changed[i] = false
    nm >= mlim     -> action[i] = INVESTIGATE -> changed[i] = xdl_clean_mmatch()
When need_min is true, only DISCARD and KEEP occur because the limit
is effectively infinite.
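The mapping above can be written out as a small standalone function.
This is a sketch for illustration only: classify_line() is a
hypothetical name, while the three macros and the condition mirror the
ternary expression used for action1[i] and action2[i] in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define DISCARD 0
#define KEEP 1
#define INVESTIGATE 2

/*
 * nm is the number of matches the line has in the other file and
 * mlim is the heuristic limit. With need_min set, the INVESTIGATE
 * state is never produced, so the limit is effectively infinite.
 */
static int classify_line(long nm, long mlim, bool need_min)
{
	assert(nm >= 0);
	if (nm == 0)
		return DISCARD;
	if (nm >= mlim && !need_min)
		return INVESTIGATE;
	return KEEP;
}
```

For example, with mlim == 10, a line with 5 matches is KEEP and a line
with 10 matches is INVESTIGATE, unless need_min is set, in which case
both are KEEP.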
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xprepare.c | 106 ++++++++++++++++++++++++++++++-----------------
1 file changed, 69 insertions(+), 37 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index b9b19c36de..55e3b50ce6 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -29,6 +29,9 @@
#define XDL_GUESS_NLINES1 256
#define XDL_GUESS_NLINES2 20
+#define DISCARD 0
+#define KEEP 1
+#define INVESTIGATE 2
typedef struct s_xdlclass {
struct s_xdlclass *next;
@@ -190,15 +193,15 @@ void xdl_free_env(xdfenv_t *xe) {
}
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
+static bool xdl_clean_mmatch(uint8_t const *action, long i, long s, long e) {
long r, rdis0, rpdis0, rdis1, rpdis1;
/*
- * Limits the window the is examined during the similar-lines
- * scan. The loops below stops when dis[i - r] == 1 (line that
- * has no match), but there are corner cases where the loop
- * proceed all the way to the extremities by causing huge
- * performance penalties in case of big files.
+ * Limits the window that is examined during the similar-lines
+ * scan. The loops below stop when action[i - r] == KEEP (a line
+ * that has matches and is kept), but there are corner cases where
+ * the loops proceed all the way to the extremities, causing huge
+ * performance penalties in case of big files.
*/
if (i - s > XDL_SIMSCAN_WINDOW)
s = i - XDL_SIMSCAN_WINDOW;
@@ -207,40 +210,47 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
/*
* Scans the lines before 'i' to find a run of lines that either
- * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
- * Note that we always call this function with dis[i] > 1, so the
- * current line (i) is already a multimatch line.
+ * have no match (action[j] == DISCARD) or have multiple matches
+ * (action[j] == INVESTIGATE). Note that we always call this
+ * function with action[i] == INVESTIGATE, so the current line
+ * (i) is already a multimatch line.
*/
for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
- if (!dis[i - r])
+ if (action[i - r] == DISCARD)
rdis0++;
- else if (dis[i - r] == 2)
+ else if (action[i - r] == INVESTIGATE)
rpdis0++;
- else
+ else if (action[i - r] == KEEP)
break;
+ else
+ BUG("Illegal value for action[i - r]");
}
/*
- * If the run before the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
- * We want to discard multimatch lines only when they appear in the
- * middle of runs with nomatch lines (dis[j] == 0).
+ * If the run before the line 'i' found only multimatch lines,
+ * we return false and hence we don't make the current line (i)
+ * discarded. We want to discard multimatch lines only when
+ * they appear in the middle of runs with nomatch lines
+ * (action[j] == DISCARD).
*/
if (rdis0 == 0)
return 0;
for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
- if (!dis[i + r])
+ if (action[i + r] == DISCARD)
rdis1++;
- else if (dis[i + r] == 2)
+ else if (action[i + r] == INVESTIGATE)
rpdis1++;
- else
+ else if (action[i + r] == KEEP)
break;
+ else
+ BUG("Illegal value for action[i + r]");
}
/*
- * If the run after the line 'i' found only multimatch lines, we
- * return 0 and hence we don't make the current line (i) discarded.
+ * If the run after the line 'i' found only multimatch lines,
+ * we return false and hence we don't make the current line (i)
+ * discarded.
*/
if (rdis1 == 0)
- return 0;
+ return false;
rdis1 += rdis0;
rpdis1 += rpdis0;
@@ -251,26 +261,38 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
/*
* Try to reduce the problem complexity, discard records that have no
* matches on the other file. Also, lines that have multiple matches
- * might be potentially discarded if they happear in a run of discardable.
+ * might be potentially discarded if they appear in a run of discardable.
*/
static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
long i, nm, nreff, mlim;
xrecord_t *recs;
xdlclass_t *rcrec;
- char *dis, *dis1, *dis2;
- int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
+ uint8_t *action1 = NULL, *action2 = NULL;
+ bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
+ int ret = 0;
- if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
- return -1;
- dis1 = dis;
- dis2 = dis1 + xdf1->nrec + 1;
+ /*
+ * Create temporary arrays that will help us decide if
+ * changed[i] should remain 0 or become 1.
+ */
+ if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
+ ret = -1;
+ goto cleanup;
+ }
+ if (!XDL_CALLOC_ARRAY(action2, xdf2->nrec + 1)) {
+ ret = -1;
+ goto cleanup;
+ }
+ /*
+ * Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
+ */
if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
mlim = XDL_MAX_EQLIMIT;
for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len2 : 0;
- dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
}
if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
@@ -278,32 +300,42 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
rcrec = cf->rcrecs[recs->ha];
nm = rcrec ? rcrec->len1 : 0;
- dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
+ action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
}
+ /*
+ * Use temporary arrays to decide if changed[i] should remain
+ * 0 or become 1.
+ */
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
- if (dis1[i] == 1 ||
- (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+ if (action1[i] == KEEP ||
+ (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
+ /* changed[i] remains 0, i.e. keep */
} else
xdf1->changed[i] = 1;
+ /* i.e. discard */
}
xdf1->nreff = nreff;
for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
i <= xdf2->dend; i++, recs++) {
- if (dis2[i] == 1 ||
- (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+ if (action2[i] == KEEP ||
+ (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
+ /* changed[i] remains 0, i.e. keep */
} else
xdf2->changed[i] = 1;
+ /* i.e. discard */
}
xdf2->nreff = nreff;
- xdl_free(dis);
+cleanup:
+ xdl_free(action1);
+ xdl_free(action2);
- return 0;
+ return ret;
}
--
gitgitgadget
* [PATCH v6 12/12] xdiff: change type of xdfile_t.changed from char to bool
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
` (10 preceding siblings ...)
2025-09-26 22:41 ` [PATCH v6 11/12] xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c Ezekiel Newren via GitGitGadget
@ 2025-09-26 22:41 ` Ezekiel Newren via GitGitGadget
2025-10-03 13:47 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Phillip Wood
12 siblings, 0 replies; 158+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-09-26 22:41 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Phillip Wood, Ben Knoble, Jeff King,
Ezekiel Newren, Ezekiel Newren
From: Ezekiel Newren <ezekielnewren@gmail.com>
The only possible values for 'changed' are 1 and 0, which map exactly
to a bool type. It might not look like this because action1 and action2
(which used to be dis1 and dis2) were also of type char and were
assigned numerical values within a few lines of 'changed' (what used to
be rchg).
Using DISCARD/KEEP/INVESTIGATE for action1[i]/action2[j], and true/false
for changed[k] makes it clear to future readers that these are
logically separate concepts.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
xdiff/xdiffi.c | 14 +++++++-------
xdiff/xhistogram.c | 8 ++++----
xdiff/xpatience.c | 8 ++++----
xdiff/xprepare.c | 12 ++++++------
xdiff/xtypes.h | 2 +-
5 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bd5b31c664..6f3998ee54 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
*/
if (off1 == lim1) {
for (; off2 < lim2; off2++)
- xdf2->changed[xdf2->rindex[off2]] = 1;
+ xdf2->changed[xdf2->rindex[off2]] = true;
} else if (off2 == lim2) {
for (; off1 < lim1; off1++)
- xdf1->changed[xdf1->rindex[off1]] = 1;
+ xdf1->changed[xdf1->rindex[off1]] = true;
} else {
xdpsplit_t spl;
spl.i1 = spl.i2 = 0;
@@ -753,8 +753,8 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->end < xdf->nrec &&
recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
- xdf->changed[g->start++] = 0;
- xdf->changed[g->end++] = 1;
+ xdf->changed[g->start++] = false;
+ xdf->changed[g->end++] = true;
while (xdf->changed[g->end])
g->end++;
@@ -774,8 +774,8 @@ static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
{
if (g->start > 0 &&
recs_match(&xdf->recs[g->start - 1], &xdf->recs[g->end - 1])) {
- xdf->changed[--g->start] = 1;
- xdf->changed[--g->end] = 0;
+ xdf->changed[--g->start] = true;
+ xdf->changed[--g->end] = false;
while (xdf->changed[g->start - 1])
g->start--;
@@ -932,7 +932,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
xdchange_t *cscr = NULL, *xch;
- char *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
+ bool *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
long i1, i2, l1, l2;
/*
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 15ca15f6b0..6dc450b1fe 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -318,11 +318,11 @@ redo:
if (!count1) {
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -335,9 +335,9 @@ redo:
else {
if (lcs.begin1 == 0 && lcs.begin2 == 0) {
while (count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
while (count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
result = 0;
} else {
result = histogram_diff(xpp, env,
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 14092ffb86..669b653580 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -331,11 +331,11 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* trivial case: one side is empty */
if (!count1) {
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
return 0;
} else if (!count2) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
return 0;
}
@@ -347,9 +347,9 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
/* are there any matching lines at all? */
if (!map.has_matches) {
while(count1--)
- env->xdf1.changed[line1++ - 1] = 1;
+ env->xdf1.changed[line1++ - 1] = true;
while(count2--)
- env->xdf2.changed[line2++ - 1] = 1;
+ env->xdf2.changed[line2++ - 1] = true;
xdl_free(map.entries);
return 0;
}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 55e3b50ce6..192334f1b7 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -273,7 +273,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
* Create temporary arrays that will help us decide if
- * changed[i] should remain 0 or become 1.
+ * changed[i] should remain false, or become true.
*/
if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
ret = -1;
@@ -305,16 +305,16 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
* Use temporary arrays to decide if changed[i] should remain
- * 0 or become 1.
+ * false, or become true.
*/
for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
i <= xdf1->dend; i++, recs++) {
if (action1[i] == KEEP ||
(action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
xdf1->rindex[nreff++] = i;
- /* changed[i] remains 0, i.e. keep */
+ /* changed[i] remains false, i.e. keep */
} else
- xdf1->changed[i] = 1;
+ xdf1->changed[i] = true;
/* i.e. discard */
}
xdf1->nreff = nreff;
@@ -324,9 +324,9 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
if (action2[i] == KEEP ||
(action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
xdf2->rindex[nreff++] = i;
- /* changed[i] remains 0, i.e. keep */
+ /* changed[i] remains false, i.e. keep */
} else
- xdf2->changed[i] = 1;
+ xdf2->changed[i] = true;
/* i.e. discard */
}
xdf2->nreff = nreff;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index c4b5d2d8fa..f145abba3e 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,7 @@ typedef struct s_xdfile {
xrecord_t *recs;
long nrec;
long dstart, dend;
- char *changed;
+ bool *changed;
long *rindex;
long nreff;
} xdfile_t;
--
gitgitgadget
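The distinction this patch draws, a ternary per-line action versus a
boolean changed flag, can be sketched in isolation. The following is a
minimal illustration only, not code from xdiff: classify() and
resolve_changed() are hypothetical helpers, and the mmatch_says_discard
parameter merely stands in for the result of xdl_clean_mmatch(), which
is not modelled here. The DISCARD/KEEP/INVESTIGATE values match the
macros added in xprepare.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ternary per-line states, same values as the macros in xprepare.c. */
#define DISCARD 0
#define KEEP 1
#define INVESTIGATE 2

/*
 * Hypothetical helper (not part of xdiff): classify one line from its
 * number of matches in the other file (nm) and the heuristic limit
 * (mlim), mirroring the ternary expression in xdl_cleanup_records().
 */
static uint8_t classify(long nm, long mlim, bool need_min)
{
	if (nm == 0)
		return DISCARD;
	if (nm >= mlim && !need_min)
		return INVESTIGATE;
	return KEEP;
}

/*
 * Hypothetical helper: resolve an action to the boolean changed flag.
 * mmatch_says_discard is an assumed stand-in for the return value of
 * xdl_clean_mmatch().
 */
static bool resolve_changed(uint8_t action, bool mmatch_says_discard)
{
	switch (action) {
	case DISCARD:
		return true;
	case KEEP:
		return false;
	case INVESTIGATE:
		return mmatch_says_discard;
	default:
		/* plays the role of BUG() in the real code */
		assert(!"illegal action value");
		return false;
	}
}
```

With need_min set, classify() never returns INVESTIGATE, so only the
first two cases of resolve_changed() are reachable. Keeping the action
as a small integer and the result as a bool is what separates the two
concepts at the type level.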
* Re: [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit
2025-09-23 21:24 ` [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
@ 2025-09-30 13:31 ` Kristoffer Haugsbakk
2025-09-30 19:35 ` Ezekiel Newren
0 siblings, 1 reply; 158+ messages in thread
From: Kristoffer Haugsbakk @ 2025-09-30 13:31 UTC (permalink / raw)
To: Josh Soref, git
Cc: Elijah Newren, Phillip Wood, D. Ben Knoble, Jeff King,
Ezekiel Newren
On Tue, Sep 23, 2025, at 23:24, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> When xrecord_t was a linked list, and recs didn't exist, I assume this
> function walked the list until it found the right record. Accessing
> a contiguous array is so trival that this function is now superfluous.
s/trival/trivial/
> Delete it.
>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
* Re: [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit
2025-09-30 13:31 ` Kristoffer Haugsbakk
@ 2025-09-30 19:35 ` Ezekiel Newren
2025-09-30 20:05 ` Junio C Hamano
0 siblings, 1 reply; 158+ messages in thread
From: Ezekiel Newren @ 2025-09-30 19:35 UTC (permalink / raw)
To: Kristoffer Haugsbakk
Cc: Josh Soref, git, Elijah Newren, Phillip Wood, D. Ben Knoble,
Jeff King
On Tue, Sep 30, 2025 at 7:31 AM Kristoffer Haugsbakk
<kristofferhaugsbakk@fastmail.com> wrote:
> > When xrecord_t was a linked list, and recs didn't exist, I assume this
> > function walked the list until it found the right record. Accessing
> > a contiguous array is so trival that this function is now superfluous.
>
> s/trival/trivial/
I think that other than this typo this patch series is ready to be
merged in. I would prefer that Junio fix this typo, so I don't spam
the mailing list with such a small change.
* Re: [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit
2025-09-30 19:35 ` Ezekiel Newren
@ 2025-09-30 20:05 ` Junio C Hamano
0 siblings, 0 replies; 158+ messages in thread
From: Junio C Hamano @ 2025-09-30 20:05 UTC (permalink / raw)
To: Ezekiel Newren
Cc: Kristoffer Haugsbakk, Josh Soref, git, Elijah Newren,
Phillip Wood, D. Ben Knoble, Jeff King
Ezekiel Newren <ezekielnewren@gmail.com> writes:
> On Tue, Sep 30, 2025 at 7:31 AM Kristoffer Haugsbakk
> <kristofferhaugsbakk@fastmail.com> wrote:
>> > When xrecord_t was a linked list, and recs didn't exist, I assume this
>> > function walked the list until it found the right record. Accessing
>> > a contiguous array is so trival that this function is now superfluous.
>>
>> s/trival/trivial/
>
> I think that other than this typo this patch series is ready to be
> merged in. I would prefer that Junio fix this typo, so I don't spam
> the mailing list with such a small change.
If it is only this instance, I can "rebase -i" it away, sure.
Thanks.
* Re: [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff.
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
` (11 preceding siblings ...)
2025-09-26 22:41 ` [PATCH v6 12/12] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
@ 2025-10-03 13:47 ` Phillip Wood
12 siblings, 0 replies; 158+ messages in thread
From: Phillip Wood @ 2025-10-03 13:47 UTC (permalink / raw)
To: Ezekiel Newren via GitGitGadget, git
Cc: Elijah Newren, Ben Knoble, Jeff King, Ezekiel Newren
Hi Ezekiel
On 26/09/2025 23:41, Ezekiel Newren via GitGitGadget wrote:
> Changes since v5.
>
> * Address review feedback on commit messages.
> * Drop commit "xdiff: delete rchg aliasing"
> * Use DISCARD/KEEP/INVESTIGATE instead of NONE/SOME/TOO_MANY
> * Fix the word wrapping in the comments of xprepare.c
Thanks for expanding the commit messages. I think the range-diff looks
good. There's a typo in patch 12 (see below), but it's not worth
re-rolling just for that.
> Range-diff vs v5:
> [...]
> 12: 08a0fceb72 ! 11: f08782a977 xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
> @@ Metadata
> Author: Ezekiel Newren <ezekielnewren@gmail.com>
>
> ## Commit message ##
> - xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c
> + xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c
>
> - Rename dis1, dis2 to matches1, matches2.
> + This commit is refactor-only; no behavior is changed. A future commit
> + will use bool literals for changed[i].
>
> - Define macros NONE(0), SOME(1), TOO_MANY(2) as the enum values for
> - matches1 and matches2. These states will influence whether changed[i]
> - is set to 1 or kept as 0.
> + The functions xdl_clean_mmatch() and xdl_cleanup_records() will be
> + cleaned up more in a future patch series. The changes to
> + xdl_cleanup_records(), in this patch, is just to make it clear why
Not worth a re-roll on its own: s/is/are/
Thanks for working on this.
Phillip
> + `char rchg` is refactored to `bool changed`.
> +
> + Rename dis* to action* and replace literal numericals with macros.
> + The old names came from when dis* (which I think was short for discard)
> + was treated like a boolean, but over time it grew into a ternary state
> + machine. The result was confusing because dis* and rchg* both used 0/1
> + values with different meanings.
> +
> + The new names and macros make the states explicit. nm is short for
> + number of matches, and mlim is a heuristic limit:
> +
> + nm == 0 -> action[i] = DISCARD -> changed[i] = true
> + 0 < nm < mlim -> action[i] = KEEP -> changed[i] = false
> + nm >= mlim -> action[i] = INVESTIGATE -> changed[i] = xdl_clean_mmatch()
> +
> + When need_min is true, only DISCARD and KEEP occur because the limit
> + is effectively infinite.
>
> Best-viewed-with: --color-words
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> @@ xdiff/xprepare.c
> #define XDL_GUESS_NLINES1 256
> #define XDL_GUESS_NLINES2 20
>
> -+#define NONE 0
> -+#define SOME 1
> -+#define TOO_MANY 2
> ++#define DISCARD 0
> ++#define KEEP 1
> ++#define INVESTIGATE 2
>
> typedef struct s_xdlclass {
> struct s_xdlclass *next;
> @@ xdiff/xprepare.c: void xdl_free_env(xdfenv_t *xe) {
>
>
> -static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
> -+static bool xdl_clean_mmatch(uint8_t const *matches, long i, long s, long e) {
> ++static bool xdl_clean_mmatch(uint8_t const *action, long i, long s, long e) {
> long r, rdis0, rpdis0, rdis1, rpdis1;
>
> /*
> - * Limits the window the is examined during the similar-lines
> - * scan. The loops below stops when dis[i - r] == 1 (line that
> +- * has no match), but there are corner cases where the loop
> +- * proceed all the way to the extremities by causing huge
> +- * performance penalties in case of big files.
> + * Limits the window that is examined during the similar-lines
> -+ * scan. The loops below stops when matches[i - r] == SOME (line that
> - * has no match), but there are corner cases where the loop
> - * proceed all the way to the extremities by causing huge
> - * performance penalties in case of big files.
> ++ * scan. The loops below stops when action[i - r] == KEEP
> ++ * (line that has no match), but there are corner cases where
> ++ * the loop proceed all the way to the extremities by causing
> ++ * huge performance penalties in case of big files.
> + */
> + if (i - s > XDL_SIMSCAN_WINDOW)
> + s = i - XDL_SIMSCAN_WINDOW;
> @@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
>
> /*
> * Scans the lines before 'i' to find a run of lines that either
> - * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
> - * Note that we always call this function with dis[i] > 1, so the
> -+ * have no match (matches[j] == NONE) or have multiple matches (matches[j] == TOO_MANY).
> -+ * Note that we always call this function with matches[i] == TOO_MANY, so the
> - * current line (i) is already a multimatch line.
> +- * current line (i) is already a multimatch line.
> ++ * have no match (action[j] == DISCARD) or have multiple matches
> ++ * (action[j] == INVESTIGATE). Note that we always call this
> ++ * function with action[i] == INVESTIGATE, so the current line
> ++ * (i) is already a multimatch line.
> */
> for (r = 1, rdis0 = 0, rpdis0 = 1; (i - r) >= s; r++) {
> - if (!dis[i - r])
> -+ if (matches[i - r] == NONE)
> ++ if (action[i - r] == DISCARD)
> rdis0++;
> - else if (dis[i - r] == 2)
> -+ else if (matches[i - r] == TOO_MANY)
> ++ else if (action[i - r] == INVESTIGATE)
> rpdis0++;
> - else
> -+ else if (matches[i - r] == SOME)
> ++ else if (action[i - r] == KEEP)
> break;
> + else
> -+ BUG("Illegal value for matches[i - r]");
> ++ BUG("Illegal value for action[i - r]");
> }
> /*
> - * If the run before the line 'i' found only multimatch lines, we
> +- * If the run before the line 'i' found only multimatch lines, we
> - * return 0 and hence we don't make the current line (i) discarded.
> -+ * return false and hence we don't make the current line (i) discarded.
> - * We want to discard multimatch lines only when they appear in the
> +- * We want to discard multimatch lines only when they appear in the
> - * middle of runs with nomatch lines (dis[j] == 0).
> -+ * middle of runs with nomatch lines (matches[j] == NONE).
> ++ * If the run before the line 'i' found only multimatch lines,
> ++ * we return false and hence we don't make the current line (i)
> ++ * discarded. We want to discard multimatch lines only when
> ++ * they appear in the middle of runs with nomatch lines
> ++ * (action[j] == DISCARD).
> */
> if (rdis0 == 0)
> return 0;
> for (r = 1, rdis1 = 0, rpdis1 = 1; (i + r) <= e; r++) {
> - if (!dis[i + r])
> -+ if (matches[i + r] == NONE)
> ++ if (action[i + r] == DISCARD)
> rdis1++;
> - else if (dis[i + r] == 2)
> -+ else if (matches[i + r] == TOO_MANY)
> ++ else if (action[i + r] == INVESTIGATE)
> rpdis1++;
> - else
> -+ else if (matches[i + r] == SOME)
> ++ else if (action[i + r] == KEEP)
> break;
> + else
> -+ BUG("Illegal value for matches[i + r]");
> ++ BUG("Illegal value for action[i + r]");
> }
> /*
> - * If the run after the line 'i' found only multimatch lines, we
> +- * If the run after the line 'i' found only multimatch lines, we
> - * return 0 and hence we don't make the current line (i) discarded.
> -+ * return false and hence we don't make the current line (i) discarded.
> ++ * If the run after the line 'i' found only multimatch lines,
> ++ * we return false and hence we don't make the current line (i)
> ++ * discarded.
> */
> if (rdis1 == 0)
> - return 0;
> @@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, l
> xdlclass_t *rcrec;
> - char *dis, *dis1, *dis2;
> - int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
> -+ uint8_t *matches1, *matches2;
> -+ int status = 0;
> ++ uint8_t *action1 = NULL, *action2 = NULL;
> + bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
> ++ int ret = 0;
>
> - if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
> - return -1;
> - dis1 = dis;
> - dis2 = dis1 + xdf1->nrec + 1;
> -+ matches1 = NULL;
> -+ matches2 = NULL;
> -+
> + /*
> + * Create temporary arrays that will help us decide if
> + * changed[i] should remain 0 or become 1.
> + */
> -+ if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
> -+ status = -1;
> ++ if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
> ++ ret = -1;
> + goto cleanup;
> + }
> -+ if (!XDL_CALLOC_ARRAY(matches2, xdf2->nrec + 1)) {
> -+ status = -1;
> ++ if (!XDL_CALLOC_ARRAY(action2, xdf2->nrec + 1)) {
> ++ ret = -1;
> + goto cleanup;
> + }
>
> + /*
> -+ * Initialize temporary arrays with NONE, SOME, or TOO_MANY.
> ++ * Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
> + */
> if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
> mlim = XDL_MAX_EQLIMIT;
> @@ xdiff/xprepare.c: static int xdl_clean_mmatch(char const *dis, long i, long s, l
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len2 : 0;
> - dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> -+ matches1[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
> ++ action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
> }
>
> if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
> rcrec = cf->rcrecs[recs->ha];
> nm = rcrec ? rcrec->len1 : 0;
> - dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
> -+ matches2[i] = (nm == 0) ? NONE: (nm >= mlim && !need_min) ? TOO_MANY: SOME;
> ++ action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
> }
>
> + /*
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
> i <= xdf1->dend; i++, recs++) {
> - if (dis1[i] == 1 ||
> - (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
> -+ if (matches1[i] == SOME ||
> -+ (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
> ++ if (action1[i] == KEEP ||
> ++ (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> -+ /* changed[i] remains 0 */
> ++ /* changed[i] remains 0, i.e. keep */
> } else
> xdf1->changed[i] = 1;
> ++ /* i.e. discard */
> }
> -@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> + xdf1->nreff = nreff;
>
> for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
> i <= xdf2->dend; i++, recs++) {
> - if (dis2[i] == 1 ||
> - (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
> -+ if (matches2[i] == SOME ||
> -+ (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
> ++ if (action2[i] == KEEP ||
> ++ (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> -+ /* changed[i] remains 0 */
> ++ /* changed[i] remains 0, i.e. keep */
> } else
> xdf2->changed[i] = 1;
> ++ /* i.e. discard */
> }
> xdf2->nreff = nreff;
>
> - xdl_free(dis);
> +cleanup:
> -+ xdl_free(matches1);
> -+ xdl_free(matches2);
> ++ xdl_free(action1);
> ++ xdl_free(action2);
>
> - return 0;
> -+ return status;
> ++ return ret;
> }
>
>
> 13: 975e845bfa ! 12: 83e1ace5bd xdiff: change type of xdfile_t.changed from char to bool
> @@ Commit message
> xdiff: change type of xdfile_t.changed from char to bool
>
> The only values possible for 'changed' is 1 and 0, which exactly maps
> - to a bool type. It might not look like this is the case because
> - matches1 and matches2 (which use to be dis1, and dis2) were also char
> - and were assigned numerical values within a few lines of 'changed'
> - (what used to be rchg).
> + to a bool type. It might not look like this because action1 and action2
> + (which use to be dis1, and dis2) were also of type char and were
> + assigned numerical values within a few lines of 'changed' (what used to
> + be rchg).
>
> - Using NONE, SOME, TOO_MANY for matches1[i]/matches2[j], and true/false
> + Using DISCARD/KEEP/INVESTIGATE for action1[i]/action2[j], and true/false
> for changed[k] makes it clear to future readers that these are
> logically separate concepts.
>
> @@ xdiff/xdiffi.c: static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
>
> while (xdf->changed[g->start - 1])
> g->start--;
> +@@ xdiff/xdiffi.c: int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
> +
> + int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
> + xdchange_t *cscr = NULL, *xch;
> +- char *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
> ++ bool *changed1 = xe->xdf1.changed, *changed2 = xe->xdf2.changed;
> + long i1, i2, l1, l2;
> +
> + /*
>
> ## xdiff/xhistogram.c ##
> @@ xdiff/xhistogram.c: redo:
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
> - * changed[i] should remain 0 or become 1.
> + * changed[i] should remain false, or become true.
> */
> - if (!XDL_CALLOC_ARRAY(matches1, xdf1->nrec + 1)) {
> - status = -1;
> + if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
> + ret = -1;
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
>
> /*
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *
> */
> for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
> i <= xdf1->dend; i++, recs++) {
> - if (matches1[i] == SOME ||
> - (matches1[i] == TOO_MANY && !xdl_clean_mmatch(matches1, i, xdf1->dstart, xdf1->dend))) {
> + if (action1[i] == KEEP ||
> + (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
> xdf1->rindex[nreff++] = i;
> -- /* changed[i] remains 0 */
> -+ /* changed[i] remains false */
> +- /* changed[i] remains 0, i.e. keep */
> ++ /* changed[i] remains false, i.e. keep */
> } else
> - xdf1->changed[i] = 1;
> + xdf1->changed[i] = true;
> + /* i.e. discard */
> }
> xdf1->nreff = nreff;
> -
> @@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
> - if (matches2[i] == SOME ||
> - (matches2[i] == TOO_MANY && !xdl_clean_mmatch(matches2, i, xdf2->dstart, xdf2->dend))) {
> + if (action2[i] == KEEP ||
> + (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
> xdf2->rindex[nreff++] = i;
> -- /* changed[i] remains 0 */
> -+ /* changed[i] remains false */
> +- /* changed[i] remains 0, i.e. keep */
> ++ /* changed[i] remains false, i.e. keep */
> } else
> - xdf2->changed[i] = 1;
> + xdf2->changed[i] = true;
> + /* i.e. discard */
> }
> xdf2->nreff = nreff;
> -
>
> ## xdiff/xtypes.h ##
> @@ xdiff/xtypes.h: typedef struct s_xdfile {
>
end of thread, other threads:[~2025-10-03 13:47 UTC | newest]
Thread overview: 158+ messages
2025-09-07 19:45 [PATCH 00/17] Use rust types in xdiff Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 01/17] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-09 8:55 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 02/17] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 03/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 04/17] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 05/17] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-09 8:56 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 06/17] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-09 8:57 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 07/17] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-09 8:57 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 08/17] xdiff: delete chastore from xdfile_t, view with --color-words Ezekiel Newren via GitGitGadget
2025-09-09 8:58 ` Elijah Newren
2025-09-09 13:50 ` Phillip Wood
2025-09-09 20:33 ` Junio C Hamano
2025-09-10 22:02 ` Ben Knoble
2025-09-07 19:45 ` [PATCH 09/17] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
2025-09-09 8:58 ` Elijah Newren
2025-09-07 19:45 ` [PATCH 10/17] compat/rust_types.h: define rust primitive types Ezekiel Newren via GitGitGadget
2025-09-08 15:08 ` Junio C Hamano
2025-09-08 16:15 ` Ezekiel Newren
2025-09-07 19:45 ` [PATCH 11/17] xdiff: include compat/rust_types.h Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 12/17] xdiff: make xrecord_t.ptr a u8 instead of char Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 13/17] xdiff: make xrecord_t.size a usize instead of long Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 14/17] xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 15/17] xdiff: make xdfile_t.nrec a usize instead of long Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 16/17] xdiff: make xdfile_t.nreff " Ezekiel Newren via GitGitGadget
2025-09-07 19:45 ` [PATCH 17/17] xdiff: change the types of dstart, dend, rchg, and rindex in xdfile_t Ezekiel Newren via GitGitGadget
2025-09-16 21:56 ` [PATCH 00/17] Use rust types in xdiff Junio C Hamano
2025-09-16 22:01 ` Ezekiel Newren
2025-09-17 2:16 ` Elijah Newren
2025-09-17 13:53 ` Junio C Hamano
2025-09-17 6:22 ` Junio C Hamano
2025-09-18 23:56 ` [PATCH v2 00/10] " Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
2025-09-18 23:56 ` [PATCH v2 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
2025-09-19 0:33 ` [PATCH v2 00/10] Use rust types in xdiff Junio C Hamano
2025-09-19 0:41 ` Ezekiel Newren
2025-09-19 15:15 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t " Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 01/10] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-20 17:16 ` Junio C Hamano
2025-09-20 17:41 ` Ezekiel Newren
2025-09-20 18:31 ` Elijah Newren
2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:43 ` Junio C Hamano
2025-09-20 17:46 ` Ben Knoble
2025-09-20 18:46 ` Jeff King
2025-09-20 22:25 ` Ben Knoble
2025-09-20 22:52 ` Junio C Hamano
2025-09-20 23:15 ` Jeff King
2025-09-19 15:16 ` [PATCH v3 02/10] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-20 17:36 ` Junio C Hamano
2025-09-19 15:16 ` [PATCH v3 03/10] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 04/10] xdiff: delete xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-20 17:48 ` Junio C Hamano
2025-09-21 13:06 ` Phillip Wood
2025-09-21 15:07 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 05/10] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-21 13:06 ` Phillip Wood
2025-09-21 16:03 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 06/10] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 07/10] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-21 13:06 ` Phillip Wood
2025-09-21 16:07 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 08/10] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
2025-09-19 15:16 ` [PATCH v3 09/10] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
2025-09-21 13:07 ` Phillip Wood
2025-09-21 16:37 ` Ezekiel Newren
2025-09-19 15:16 ` [PATCH v3 10/10] xdiff: treat xdfile_t.rchg like an enum Ezekiel Newren via GitGitGadget
2025-09-21 0:00 ` Junio C Hamano
2025-09-21 0:38 ` Ezekiel Newren
2025-09-21 9:19 ` Phillip Wood
2025-09-21 16:11 ` Ezekiel Newren
2025-09-19 23:30 ` [PATCH v3 00/10] Cleanup xdfile_t and xrecord_t in xdiff Elijah Newren
2025-09-19 23:37 ` Ezekiel Newren
2025-09-22 19:51 ` [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 02/12] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 05/12] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 06/12] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 07/12] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 09/12] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 10/12] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 11/12] xdiff: use bool literals for xdfile_t.rchg Ezekiel Newren via GitGitGadget
2025-09-22 19:51 ` [PATCH v4 12/12] xdiff: refactor 'char *rchg' to 'bool *changed' in xdfile_t Ezekiel Newren via GitGitGadget
2025-09-22 22:39 ` [PATCH v4 00/12] Cleanup xdfile_t and xrecord_t in xdiff Junio C Hamano
2025-09-23 0:13 ` Ezekiel Newren
2025-09-23 1:06 ` Junio C Hamano
2025-09-23 1:30 ` Ezekiel Newren
2025-09-23 14:12 ` Junio C Hamano
2025-09-23 16:50 ` Ezekiel Newren
2025-09-23 21:24 ` [PATCH v5 00/13] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 01/13] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 02/13] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 03/13] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 04/13] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-30 13:31 ` Kristoffer Haugsbakk
2025-09-30 19:35 ` Ezekiel Newren
2025-09-30 20:05 ` Junio C Hamano
2025-09-23 21:24 ` [PATCH v5 05/13] xdiff: delete superfluous local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-24 14:52 ` Ezekiel Newren
2025-09-23 21:24 ` [PATCH v5 06/13] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 07/13] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 08/13] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 09/13] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
2025-09-23 21:24 ` [PATCH v5 10/13] xdiff: delete rchg aliasing Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-24 15:01 ` Ezekiel Newren
2025-09-24 15:34 ` Junio C Hamano
2025-09-24 15:58 ` Ezekiel Newren
2025-09-24 21:31 ` Junio C Hamano
2025-09-24 22:46 ` Ezekiel Newren
2025-09-25 7:09 ` Junio C Hamano
2025-09-25 22:02 ` Ezekiel Newren
2025-09-23 21:24 ` [PATCH v5 11/13] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
2025-09-24 10:22 ` Phillip Wood
2025-09-24 15:10 ` Ezekiel Newren
2025-09-24 15:18 ` Phillip Wood
2025-09-23 21:24 ` [PATCH v5 12/13] xdiff: use enum macros NONE(0), SOME(1), TOO_MANY(2) in xprepare.c Ezekiel Newren via GitGitGadget
2025-09-24 10:21 ` Phillip Wood
2025-09-24 14:46 ` Ezekiel Newren
2025-09-24 15:18 ` Phillip Wood
2025-09-24 17:29 ` Junio C Hamano
2025-09-25 18:40 ` Ezekiel Newren
2025-09-26 2:29 ` Ezekiel Newren
2025-09-23 21:24 ` [PATCH v5 13/13] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
2025-09-24 10:21 ` Phillip Wood
2025-09-24 15:14 ` Ezekiel Newren
2025-09-26 22:41 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 01/12] xdiff: delete static forward declarations in xprepare Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 02/12] xdiff: delete local variables and initialize/free xdfile_t directly Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 03/12] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 04/12] xdiff: delete superfluous function xdl_get_rec() in xemit Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 05/12] xdiff: delete local variables that alias fields in xrecord_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 06/12] xdiff: delete struct diffdata_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 07/12] xdiff: delete redundant array xdfile_t.ha Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 08/12] xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 09/12] xdiff: delete chastore from xdfile_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 10/12] xdiff: rename rchg -> changed in xdfile_t Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 11/12] xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c Ezekiel Newren via GitGitGadget
2025-09-26 22:41 ` [PATCH v6 12/12] xdiff: change type of xdfile_t.changed from char to bool Ezekiel Newren via GitGitGadget
2025-10-03 13:47 ` [PATCH v6 00/12] Cleanup xdfile_t and xrecord_t in xdiff Phillip Wood