git.vger.kernel.org archive mirror
* [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
@ 2025-07-17 20:32 Ezekiel Newren via GitGitGadget
  2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
                   ` (12 more replies)
  0 siblings, 13 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren

This series accelerates xdiff by 5-19%.

It also introduces Rust as a hard dependency.

…and it doesn’t yet pass a couple of the GitHub workflows; hints from
Windows experts and opinions on ambiguous primitives would be appreciated
(see below).

This is just the beginning of many patches that I have to convert portions
of, maybe eventually all of, xdiff to Rust. While working on that
conversion, I found several ways to clarify the code, along with some
optimizations.

So...

This obviously raises the question of whether we are ready to accept a hard
dependency on Rust. Previous discussions on the mailing list and at Git
Merge 2024 have not answered that question. If not now, will we be willing
to accept such a hard dependency later? And what route do we want to take to
get there?

About the optimizations in this series:

1. xdiff currently uses DJB2a for hashing (even though it is not explicitly named as such). This is an older hashing algorithm, and modern alternatives are superior. I chose xxhash because it’s faster, more collision resistant, and designed to be a standard. Other hash algorithms such as aHash, MurmurHash, SipHash, and FNV-1a were considered, but my local testing suggested that xxhash was the best choice for use in xdiff.

2. In support of switching to xxhash, parsing and hashing were split into separate steps. And it turns out that memchr() is faster for parsing than character-by-character iteration.
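
For readers who do not know DJB2a by name, the following is a minimal sketch of the general algorithm: start from 5381, then multiply by 33 and XOR in each byte. The function name djb2a_hash is hypothetical; this is not xdiff's actual routine, which per this series also parses the line while hashing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * DJB2a: hash = (hash * 33) XOR byte, seeded with 5381.
 * Illustrative only; djb2a_hash is a hypothetical name and is
 * not part of xdiff.
 */
static uint64_t djb2a_hash(const uint8_t *data, size_t len)
{
	uint64_t hash = 5381;

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		hash += hash << 5;	/* hash * 33 */
		hash ^= data[i];	/* mix in the next byte */
	}
	return hash;
}
```

Each byte depends on the previous result, so the loop cannot consume several bytes per step the way a block-oriented hash can.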


About the workflow builds/tests that aren’t working with this series:

1. Windows fails to build. I don’t know which Rust toolchain is correct for this, or whether multiple are needed. Example failed build: https://github.com/git/git/actions/runs/16353209191

2. The i386/ubuntu:focal job builds, but fails the tests. The kernel reports the bitness as 64 despite the container being 32-bit. I believe the issue is that C uses ambiguous primitives (which differ in size between platforms). New code should use unambiguous primitives from Rust (u32, u64, etc.) rather than perpetuating ambiguous primitive types. Since the current xdiff API hardcodes the ambiguous types, though, those places will need to be migrated to unambiguous primitives. Much of the C code needs a slight refactor to be compatible with the Rust FFI, which usually requires converting ambiguous types to unambiguous ones. What does this community think of this approach?
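
The concern about ambiguous primitives can be sketched in C. This is an illustration under assumptions, not code from the series: the helper fits_in_unsigned_long is hypothetical, and it only shows that a 64-bit value may or may not survive storage in an `unsigned long` depending on the platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Fixed-width types have the same bit width wherever they exist. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");

/*
 * Returns 1 if a 64-bit value survives a round trip through
 * `unsigned long`, 0 if it is truncated. Where long is 32 bits,
 * large values are truncated; where long is 64 bits, nothing is
 * lost. Hypothetical helper, not part of xdiff.
 */
static int fits_in_unsigned_long(uint64_t value)
{
	unsigned long narrowed = (unsigned long)value;

	return (uint64_t)narrowed == value;
}
```

With u64 on both the C and Rust sides, the question does not arise, because the width is part of the type name.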


My brother (Elijah, cc’ed) has been guiding and reviewing my work here.

Ezekiel Newren (7):
  xdiff: introduce rust
  xdiff/xprepare: remove superfluous forward declarations
  xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  xdiff: make fields of xrecord_t Rust friendly
  xdiff: separate parsing lines from hashing them
  xdiff: conditionally use Rust's implementation of xxhash
  github_workflows: install rust

 .github/workflows/main.yml |   1 +
 .gitignore                 |   1 +
 Makefile                   |  60 +++++++---
 build_rust.sh              |  59 ++++++++++
 ci/install-dependencies.sh |  14 +--
 ci/install-rust.sh         |  33 ++++++
 ci/lib.sh                  |   8 ++
 ci/make-test-artifacts.sh  |   7 ++
 ci/run-build-and-tests.sh  |  10 ++
 git-compat-util.h          |  17 +++
 meson.build                |  40 +++++--
 rust/Cargo.lock            |  21 ++++
 rust/Cargo.toml            |   6 +
 rust/interop/Cargo.toml    |  14 +++
 rust/interop/src/lib.rs    |   0
 rust/xdiff/Cargo.toml      |  16 +++
 rust/xdiff/src/lib.rs      |   7 ++
 xdiff/xdiffi.c             |   8 +-
 xdiff/xemit.c              |   2 +-
 xdiff/xmerge.c             |  14 +--
 xdiff/xpatience.c          |   2 +-
 xdiff/xprepare.c           | 226 ++++++++++++++++++-------------------
 xdiff/xtypes.h             |   9 +-
 xdiff/xutils.c             |   4 +-
 24 files changed, 414 insertions(+), 165 deletions(-)
 create mode 100755 build_rust.sh
 create mode 100644 ci/install-rust.sh
 create mode 100644 rust/Cargo.lock
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs


base-commit: 16bd9f20a403117f2e0d9bcda6c6e621d3763e77
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1980%2Fezekielnewren%2Fxdiff_rust_speedup-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1980/ezekielnewren/xdiff_rust_speedup-v1
Pull-Request: https://github.com/git/git/pull/1980
-- 
gitgitgadget


* [PATCH 1/7] xdiff: introduce rust
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 21:30   ` brian m. carlson
  2025-07-17 22:38   ` Taylor Blau
  2025-07-17 20:32 ` [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Upcoming patches will accelerate and simplify xdiff, while also
porting parts of it to Rust. In preparation, add some stubs and set up
the Rust build. For now, it is easier to let cargo build Rust and
have make or meson merely link against the static library that cargo
builds. In line with ongoing libification efforts, use multiple
crates to allow more modularity on the Rust side. xdiff is the crate
that this series will focus on, but we also introduce the interop
crate for future patch series.

In order to facilitate interoperability between C and Rust, introduce
C definitions for Rust primitive types in git-compat-util.h.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
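A possible way to guard the new typedefs is with compile-time checks, sketched below. This is not part of the patch: the static_assert lines and the helper pack_u32_pair are hypothetical additions, and the usize check assumes size_t is pointer-sized on the target.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* A subset of the patch's typedefs, repeated so this compiles alone. */
typedef uint32_t  u32;
typedef uint64_t  u64;
typedef float     f32;
typedef double    f64;
typedef size_t    usize;
typedef ptrdiff_t isize;

/* Fail the build early if a C type does not match its Rust name. */
static_assert(sizeof(u32) * CHAR_BIT == 32, "u32 must be 32 bits");
static_assert(sizeof(u64) * CHAR_BIT == 64, "u64 must be 64 bits");
static_assert(sizeof(f32) * CHAR_BIT == 32, "f32 assumes a 32-bit float");
static_assert(sizeof(f64) * CHAR_BIT == 64, "f64 assumes a 64-bit double");
static_assert(sizeof(usize) == sizeof(uintptr_t),
	      "usize assumes size_t is pointer-sized");
static_assert(sizeof(isize) == sizeof(usize),
	      "isize and usize must match in size");

/* Hypothetical helper: combine two u32 halves into one u64. */
static u64 pack_u32_pair(u32 hi, u32 lo)
{
	return ((u64)hi << 32) | (u64)lo;
}
```

The float and double checks matter because C does not fix their widths, while Rust's f32 and f64 are always 32 and 64 bits.
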
 Makefile                | 20 +++++++++++++++++++-
 git-compat-util.h       | 17 +++++++++++++++++
 meson.build             | 32 ++++++++++++++++++++++++++++++++
 rust/Cargo.lock         | 14 ++++++++++++++
 rust/Cargo.toml         |  6 ++++++
 rust/interop/Cargo.toml | 14 ++++++++++++++
 rust/interop/src/lib.rs |  0
 rust/xdiff/Cargo.toml   | 15 +++++++++++++++
 rust/xdiff/src/lib.rs   |  0
 9 files changed, 117 insertions(+), 1 deletion(-)
 create mode 100644 rust/Cargo.lock
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs

diff --git a/Makefile b/Makefile
index 70d1543b6b86..db39e6e1c28e 100644
--- a/Makefile
+++ b/Makefile
@@ -919,6 +919,11 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+ifeq ($(DEBUG), 1)
+RUST_LIB = rust/target/debug/libxdiff.a
+else
+RUST_LIB = rust/target/release/libxdiff.a
+endif
 REFTABLE_LIB = reftable/libreftable.a
 
 GENERATED_H += command-list.h
@@ -1392,6 +1397,8 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 EXTLIBS =
 
+GITLIBS += $(RUST_LIB)
+
 GIT_USER_AGENT = git/$(GIT_VERSION)
 
 ifeq ($(wildcard sha1collisiondetection/lib/sha1.h),sha1collisiondetection/lib/sha1.h)
@@ -2925,6 +2932,14 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+.PHONY: $(RUST_LIB)
+$(RUST_LIB):
+ifeq ($(DEBUG), 1)
+	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose
+else
+	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose --release
+endif
+
 $(REFTABLE_LIB): $(REFTABLE_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
@@ -3756,7 +3771,10 @@ cocciclean:
 	$(RM) -r .build/contrib/coccinelle
 	$(RM) contrib/coccinelle/*.cocci.patch
 
-clean: profile-clean coverage-clean cocciclean
+rustclean:
+	cd rust && cargo clean
+
+clean: profile-clean coverage-clean cocciclean rustclean
 	$(RM) -r .build $(UNIT_TEST_BIN)
 	$(RM) GIT-TEST-SUITES
 	$(RM) po/git.pot po/git-core.pot
diff --git a/git-compat-util.h b/git-compat-util.h
index 4678e21c4cb8..82dc99764ac0 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -196,6 +196,23 @@ static inline int is_xplatform_dir_sep(int c)
 #include "compat/msvc.h"
 #endif
 
+/* rust types */
+typedef uint8_t   u8;
+typedef uint16_t  u16;
+typedef uint32_t  u32;
+typedef uint64_t  u64;
+
+typedef int8_t    i8;
+typedef int16_t   i16;
+typedef int32_t   i32;
+typedef int64_t   i64;
+
+typedef float     f32;
+typedef double    f64;
+
+typedef size_t    usize;
+typedef ptrdiff_t isize;
+
 /* used on Mac OS X */
 #ifdef PRECOMPOSE_UNICODE
 #include "compat/precompose_utf8.h"
diff --git a/meson.build b/meson.build
index 596f5ac7110e..2d8da17f6515 100644
--- a/meson.build
+++ b/meson.build
@@ -267,6 +267,36 @@ version_gen_environment.set('GIT_DATE', get_option('build_date'))
 version_gen_environment.set('GIT_USER_AGENT', get_option('user_agent'))
 version_gen_environment.set('GIT_VERSION', get_option('version'))
 
+if get_option('optimization') in ['2', '3', 's', 'z']
+  rust_target = 'release'
+  rust_args = ['--release']
+  rustflags = '-Aunused_imports -Adead_code'
+else
+  rust_target = 'debug'
+  rust_args = []
+  rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
+endif
+
+
+rust_leaf = custom_target('rust_leaf',
+  output: 'libxdiff.a',
+  build_by_default: true,
+  build_always_stale: true,
+  command: ['cargo', 'build',
+            '--manifest-path', meson.project_source_root() / 'rust/Cargo.toml'
+  ] + rust_args,
+  env: {
+    'RUSTFLAGS': rustflags,
+  },
+  install: false,
+)
+
+rust_xdiff_dep = declare_dependency(
+  link_args: ['-L' + meson.project_source_root() / 'rust/target' / rust_target, '-lxdiff'],
+#  include_directories: include_directories('xdiff/include'),  # Adjust if you expose headers
+)
+
+
 compiler = meson.get_compiler('c')
 
 libgit_sources = [
@@ -1677,6 +1707,8 @@ version_def_h = custom_target(
 )
 libgit_sources += version_def_h
 
+libgit_dependencies += rust_xdiff_dep
+
 libgit = declare_dependency(
   link_with: static_library('git',
     sources: libgit_sources,
diff --git a/rust/Cargo.lock b/rust/Cargo.lock
new file mode 100644
index 000000000000..fb1eac690b39
--- /dev/null
+++ b/rust/Cargo.lock
@@ -0,0 +1,14 @@
+# This file is automatically @generated by Cargo.
+# It is not intended for manual editing.
+version = 4
+
+[[package]]
+name = "interop"
+version = "0.1.0"
+
+[[package]]
+name = "xdiff"
+version = "0.1.0"
+dependencies = [
+ "interop",
+]
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
new file mode 100644
index 000000000000..ed3d79d7f827
--- /dev/null
+++ b/rust/Cargo.toml
@@ -0,0 +1,6 @@
+[workspace]
+members = [
+    "xdiff",
+    "interop",
+]
+resolver = "2"
diff --git a/rust/interop/Cargo.toml b/rust/interop/Cargo.toml
new file mode 100644
index 000000000000..045e3b01cfad
--- /dev/null
+++ b/rust/interop/Cargo.toml
@@ -0,0 +1,14 @@
+[package]
+name = "interop"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "interop"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
diff --git a/rust/interop/src/lib.rs b/rust/interop/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
new file mode 100644
index 000000000000..eb7966aada64
--- /dev/null
+++ b/rust/xdiff/Cargo.toml
@@ -0,0 +1,15 @@
+[package]
+name = "xdiff"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "xdiff"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
+interop = { path = "../interop" }
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
-- 
gitgitgadget



* [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
  2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 22:41   ` Taylor Blau
  2025-07-17 20:32 ` [PATCH 3/7] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Move xdl_prepare_env() later in the file to avoid the need
for forward declarations.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
 1 file changed, 50 insertions(+), 66 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2dde..a45c5ee208c8 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
 
 
 
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
-			   xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
 static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
 	cf->flags = flags;
 
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
 }
 
 
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
-		    xdfenv_t *xe) {
-	long enl1, enl2, sample;
-	xdlclassifier_t cf;
-
-	memset(&cf, 0, sizeof(cf));
-
-	/*
-	 * For histogram diff, we can afford a smaller sample size and
-	 * thus a poorer estimate of the number of lines, as the hash
-	 * table (rhash) won't be filled up/grown. The number of lines
-	 * (nrecs) will be updated correctly anyway by
-	 * xdl_prepare_ctx().
-	 */
-	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
-		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
-	enl1 = xdl_guess_lines(mf1, sample) + 1;
-	enl2 = xdl_guess_lines(mf2, sample) + 1;
-
-	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
-		return -1;
-
-	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
-	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
-	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf2);
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	xdl_free_classifier(&cf);
-
-	return 0;
-}
-
-
 void xdl_free_env(xdfenv_t *xe) {
 
 	xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
 
 	return 0;
 }
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		    xdfenv_t *xe) {
+	long enl1, enl2, sample;
+	xdlclassifier_t cf;
+
+	memset(&cf, 0, sizeof(cf));
+
+	/*
+	 * For histogram diff, we can afford a smaller sample size and
+	 * thus a poorer estimate of the number of lines, as the hash
+	 * table (rhash) won't be filled up/grown. The number of lines
+	 * (nrecs) will be updated correctly anyway by
+	 * xdl_prepare_ctx().
+	 */
+	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+	enl1 = xdl_guess_lines(mf1, sample) + 1;
+	enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+		return -1;
+
+	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+
+	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf2);
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	    }
+
+	xdl_free_classifier(&cf);
+
+	return 0;
+}
-- 
gitgitgadget



* [PATCH 3/7] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
  2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
  2025-07-17 20:32 ` [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

xrecord_t.next, xdfile_t.hbits, and xdfile_t.rhash are initialized
but never read by the code. Remove them.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
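As a rough illustration of the per-record effect, the sketch below compares the record layout before and after this patch, using the field names from the diff. The struct names and the helper bytes_saved are hypothetical, and the actual saving depends on the platform's pointer size and padding.

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the patch (field names taken from the diff). */
struct record_before {
	struct record_before *next;
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Layout after the patch: the unused chain pointer is gone. */
struct record_after {
	char const *ptr;
	long size;
	unsigned long ha;
};

static_assert(sizeof(struct record_after) < sizeof(struct record_before),
	      "dropping a field shrinks the record");

/* Hypothetical helper: bytes saved across nrec records. */
static size_t bytes_saved(size_t nrec)
{
	return nrec * (sizeof(struct record_before) -
		       sizeof(struct record_after));
}
```
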
 xdiff/xprepare.c | 24 +++---------------------
 xdiff/xtypes.h   |  3 ---
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208c8..ad356281f939 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
 }
 
 
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
 	long hi;
 	char const *line;
 	xdlclass_t *rcrec;
@@ -126,23 +125,17 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 	rec->ha = (unsigned long) rcrec->idx;
 
-	hi = (long) XDL_HASHLONG(rec->ha, hbits);
-	rec->next = rhash[hi];
-	rhash[hi] = rec;
-
 	return 0;
 }
 
 
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	unsigned int hbits;
-	long nrec, hsize, bsize;
+	long nrec, bsize;
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
 	xrecord_t **recs;
-	xrecord_t **rhash;
 	unsigned long *ha;
 	char *rchg;
 	long *rindex;
@@ -150,7 +143,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	ha = NULL;
 	rindex = NULL;
 	rchg = NULL;
-	rhash = NULL;
 	recs = NULL;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -158,11 +150,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	if (!XDL_ALLOC_ARRAY(recs, narec))
 		goto abort;
 
-	hbits = xdl_hashbits((unsigned int) narec);
-	hsize = 1 << hbits;
-	if (!XDL_CALLOC_ARRAY(rhash, hsize))
-		goto abort;
-
 	nrec = 0;
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
 		for (top = blk + bsize; cur < top; ) {
@@ -176,7 +163,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
-			if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
 		}
 	}
@@ -194,8 +181,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdf->nrec = nrec;
 	xdf->recs = recs;
-	xdf->hbits = hbits;
-	xdf->rhash = rhash;
 	xdf->rchg = rchg + 1;
 	xdf->rindex = rindex;
 	xdf->nreff = 0;
@@ -209,7 +194,6 @@ abort:
 	xdl_free(ha);
 	xdl_free(rindex);
 	xdl_free(rchg);
-	xdl_free(rhash);
 	xdl_free(recs);
 	xdl_cha_free(&xdf->rcha);
 	return -1;
@@ -217,8 +201,6 @@ abort:
 
 
 static void xdl_free_ctx(xdfile_t *xdf) {
-
-	xdl_free(xdf->rhash);
 	xdl_free(xdf->rindex);
 	xdl_free(xdf->rchg - 1);
 	xdl_free(xdf->ha);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436efe..8b8467360ecf 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	struct s_xrecord *next;
 	char const *ptr;
 	long size;
 	unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
 typedef struct s_xdfile {
 	chastore_t rcha;
 	long nrec;
-	unsigned int hbits;
-	xrecord_t **rhash;
 	long dstart, dend;
 	xrecord_t **recs;
 	char *rchg;
-- 
gitgitgadget



* [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2025-07-17 20:32 ` [PATCH 3/7] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 22:46   ` Taylor Blau
                     ` (2 more replies)
  2025-07-17 20:32 ` [PATCH 5/7] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
                   ` (8 subsequent siblings)
  12 siblings, 3 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

A few commits ago, we added definitions for Rust primitive types,
to facilitate interoperability between C and Rust. Switch a
few variables to use these types, which, for now, requires
adding some casts.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
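The reason for casts such as `(long) rec->size` can be sketched as follows. The struct and both helpers are hypothetical stand-ins that only mirror the post-patch field types; the point is that comparing a signed index against an unsigned size converts the index to unsigned unless one side is cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef size_t  usize;

/* Mirrors the post-patch ptr/size field types (ha omitted). */
struct record {
	u8 const *ptr;
	usize size;
};

/* Hypothetical helper: count leading spaces and tabs in a record. */
static long leading_blanks(const struct record *rec)
{
	long i;

	assert(rec != NULL);
	/* Cast so the comparison stays signed, as the patch does. */
	for (i = 0; i < (long)rec->size; i++) {
		char c = (char)rec->ptr[i];

		if (c != ' ' && c != '\t')
			break;
	}
	return i;
}

/* With the cast, a negative index still compares as "less than". */
static int signed_less_than_size(long i, usize size)
{
	return i < (long)size;
}
```

Without the cast, an index of -1 would be converted to a very large unsigned value and the comparison would be false.
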
 xdiff/xdiffi.c    |  8 ++++----
 xdiff/xemit.c     |  2 +-
 xdiff/xmerge.c    | 14 +++++++-------
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  |  6 +++---
 xdiff/xtypes.h    |  6 +++---
 xdiff/xutils.c    |  4 ++--
 7 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfbea..3b364c61f671 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
 	long i;
 	int ret = 0;
 
-	for (i = 0; i < rec->size; i++) {
+	for (i = 0; i < (long) rec->size; i++) {
 		char c = rec->ptr[i];
 
 		if (!XDL_ISSPACE(c))
@@ -1005,11 +1005,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 		rec = &xe->xdf1.recs[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*) rec[i]->ptr, rec[i]->size, flags);
 
 		rec = &xe->xdf2.recs[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*)rec[i]->ptr, rec[i]->size, flags);
 
 		xch->ignore = ignore;
 	}
@@ -1020,7 +1020,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
 	size_t i;
 
 	for (i = 0; i < xpp->ignore_regex_nr; i++)
-		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+		if (!regexec_buf(xpp->ignore_regex[i], (const char*) rec->ptr, rec->size, 1,
 				 &regmatch, 0))
 			return 1;
 
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb4076..bbf7b7f8c862 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,7 +24,7 @@
 
 static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
 
-	*rec = xdf->recs[ri]->ptr;
+	*rec = (char const*) xdf->recs[ri]->ptr;
 
 	return xdf->recs[ri]->size;
 }
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b36..6fa6ea61a208 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t **rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
-			rec2[i]->ptr, rec2[i]->size, flags);
+		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
+			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
 		if (!result)
 			return -1;
 	}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch(rec1->ptr, rec1->size,
-			    rec2->ptr, rec2->size, flags);
+	return xdl_recmatch((char const*) rec1->ptr, rec1->size,
+			    (char const*) rec2->ptr, rec2->size, flags);
 }
 
 /*
@@ -383,10 +383,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		 */
 		t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
 		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
-			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
+			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - (u8 const*) t1.ptr;
 		t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
 		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
-			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - (u8 const*) t2.ptr;
 		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
 			return -1;
 		if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
 static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
-		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
+		if (line_contains_alnum((char const*) xe->xdf2.recs[i]->ptr,
 				xe->xdf2.recs[i]->size))
 			return 1;
 	return 0;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d1937..986a3a3f749a 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 		return;
 	map->entries[index].line1 = line;
 	map->entries[index].hash = record->ha;
-	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.recs[line - 1]->ptr);
 	if (!map->first)
 		map->first = map->entries + index;
 	if (map->last) {
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index ad356281f939..747268e4fdf7 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -96,12 +96,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	char const *line;
 	xdlclass_t *rcrec;
 
-	line = rec->ptr;
+	line = (char const*) rec->ptr;
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->ha == rec->ha &&
 				xdl_recmatch(rcrec->line, rcrec->size,
-					rec->ptr, rec->size, cf->flags))
+					(const char*) rec->ptr, rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -159,7 +159,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 				goto abort;
 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
 				goto abort;
-			crec->ptr = prev;
+			crec->ptr = (u8 const*) prev;
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360ecf..6e5f67ebf380 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,9 +39,9 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	char const *ptr;
-	long size;
-	unsigned long ha;
+	u8 const* ptr;
+	usize size;
+	u64 ha;
 } xrecord_t;
 
 typedef struct s_xdfile {
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87c0..10e4f20b7c31 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -418,10 +418,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 
 	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
 	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
-		diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
+		diff_env->xdf1.recs[line1 + count1 - 2]->size - (u8 const*) subfile1.ptr;
 	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
 	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
-		diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+		diff_env->xdf2.recs[line2 + count2 - 2]->size - (u8 const*) subfile2.ptr;
 	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
 		return -1;
 
-- 
gitgitgadget



* [PATCH 5/7] xdiff: separate parsing lines from hashing them
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (3 preceding siblings ...)
  2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 22:59   ` Taylor Blau
  2025-07-18 13:34   ` Phillip Wood
  2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

We want to use xxhash for faster hashing. To facilitate that
and to simplify the code, separate the concerns of parsing
and hashing into discrete steps. This makes swapping the hash
function much easier. Since xdl_hash_record() both parses and
hashes lines, this requires some slight code restructuring.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
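The two-step shape described above can be sketched in isolation. This is a simplified illustration, not the patch's code: struct line, parse_lines, hash_lines, and byte_sum are hypothetical names, the output array is fixed-size rather than grown, and byte_sum is a stand-in that is neither DJB2a nor xxhash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct line {
	const uint8_t *ptr;
	size_t size;	/* includes the trailing '\n' when present */
	uint64_t ha;
};

typedef uint64_t (*hash_fn)(const uint8_t *data, size_t len);

/* Step 1: split buf into at most max lines using memchr(). */
static size_t parse_lines(const uint8_t *buf, size_t len,
			  struct line *out, size_t max)
{
	size_t n = 0;

	while (len > 0 && n < max) {
		const uint8_t *nl = memchr(buf, '\n', len);
		size_t length = nl ? (size_t)(nl - buf) + 1 : len;

		out[n].ptr = buf;
		out[n].size = length;
		out[n].ha = 0;	/* filled in by the hashing step */
		n++;
		buf += length;
		len -= length;
	}
	return n;
}

/* Step 2: hash every parsed line with whichever function is given. */
static void hash_lines(struct line *lines, size_t n, hash_fn hash)
{
	assert(hash != NULL);
	for (size_t i = 0; i < n; i++)
		lines[i].ha = hash(lines[i].ptr, lines[i].size);
}

/* Stand-in hash for demonstration only: sum of the bytes. */
static uint64_t byte_sum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += data[i];
	return sum;
}
```

Because the parser never looks at the hash, swapping byte_sum for another function changes only the argument passed to hash_lines().
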
 xdiff/xprepare.c | 75 ++++++++++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 747268e4fdf7..c44005e9bbb8 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -129,13 +129,39 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 }
 
 
+static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
+	u8 const* ptr = (u8 const*) mf->ptr;
+	usize len = (usize) mf->size;
+
+	xdf->recs = NULL;
+	xdf->nrec = 0;
+	XDL_ALLOC_ARRAY(xdf->recs, narec);
+
+	while (len > 0) {
+		xrecord_t *rec = NULL;
+		usize length;
+		u8 const* result = memchr(ptr, '\n', len);
+		if (result) {
+			length = result - ptr + 1;
+		} else {
+			length = len;
+		}
+		if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
+			die("XDL_ALLOC_GROW failed");
+		rec = xdl_cha_alloc(&xdf->rcha);
+		rec->ptr = ptr;
+		rec->size = length;
+		rec->ha = 0;
+		xdf->recs[xdf->nrec++] = rec;
+		ptr += length;
+		len -= length;
+	}
+
+}
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	long nrec, bsize;
-	unsigned long hav;
-	char const *blk, *cur, *top, *prev;
-	xrecord_t *crec;
-	xrecord_t **recs;
 	unsigned long *ha;
 	char *rchg;
 	long *rindex;
@@ -143,50 +169,37 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	ha = NULL;
 	rindex = NULL;
 	rchg = NULL;
-	recs = NULL;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
 		goto abort;
-	if (!XDL_ALLOC_ARRAY(recs, narec))
-		goto abort;
 
-	nrec = 0;
-	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
-		for (top = blk + bsize; cur < top; ) {
-			prev = cur;
-			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
-				goto abort;
-			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
-				goto abort;
-			crec->ptr = (u8 const*) prev;
-			crec->size = (long) (cur - prev);
-			crec->ha = hav;
-			recs[nrec++] = crec;
-			if (xdl_classify_record(pass, cf, crec) < 0)
-				goto abort;
-		}
+	xdl_parse_lines(mf, narec, xdf);
+
+	for (usize i = 0; i < (usize) xdf->nrec; i++) {
+		xrecord_t *rec = xdf->recs[i];
+		char const* dump = (char const*) rec->ptr;
+		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
+		xdl_classify_record(pass, cf, rec);
 	}
 
-	if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+
+	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
 		goto abort;
 
 	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
 	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
-		if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(rindex, xdf->nrec + 1))
 			goto abort;
-		if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(ha, xdf->nrec + 1))
 			goto abort;
 	}
 
-	xdf->nrec = nrec;
-	xdf->recs = recs;
 	xdf->rchg = rchg + 1;
 	xdf->rindex = rindex;
 	xdf->nreff = 0;
 	xdf->ha = ha;
 	xdf->dstart = 0;
-	xdf->dend = nrec - 1;
+	xdf->dend = xdf->nrec - 1;
 
 	return 0;
 
@@ -194,7 +207,7 @@ abort:
 	xdl_free(ha);
 	xdl_free(rindex);
 	xdl_free(rchg);
-	xdl_free(recs);
+	xdl_free(xdf->recs);
 	xdl_cha_free(&xdf->rcha);
 	return -1;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (4 preceding siblings ...)
  2025-07-17 20:32 ` [PATCH 5/7] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 23:29   ` Taylor Blau
                     ` (2 more replies)
  2025-07-17 20:32 ` [PATCH 7/7] github_workflows: install rust Ezekiel Newren via GitGitGadget
                   ` (6 subsequent siblings)
  12 siblings, 3 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

When no whitespace flags are present, use xxhash for faster
hashing; otherwise use DJB2a (which is what xdiff has been
using all along).
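
A minimal sketch of that selection logic, under stated assumptions:
DEMO_WHITESPACE_FLAGS is a hypothetical stand-in for the real flag
mask, and 64-bit FNV-1a is used purely as a placeholder for xxh3_64()
so that the example has no external dependency. The whitespace
normalization that the real DJB2a path performs is left out, so this
shows only the dispatch, not the full hashing behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the whitespace flag mask. */
#define DEMO_WHITESPACE_FLAGS 0x1eu

/* DJB2a: start at 5381, then ha = (ha * 33) ^ byte for each byte. */
static uint64_t hash_djb2a(const unsigned char *p, size_t n)
{
	uint64_t ha = 5381;

	for (size_t i = 0; i < n; i++) {
		ha += ha << 5;
		ha ^= p[i];
	}
	return ha;
}

/* 64-bit FNV-1a, used here only as a placeholder for xxh3_64(). */
static uint64_t hash_fast_placeholder(const unsigned char *p, size_t n)
{
	uint64_t ha = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < n; i++) {
		ha ^= p[i];
		ha *= 0x100000001b3ULL;
	}
	return ha;
}

/* Pick the fast hash only when no whitespace flag is set. */
static uint64_t hash_line(const unsigned char *p, size_t n, unsigned flags)
{
	if ((flags & DEMO_WHITESPACE_FLAGS) == 0)
		return hash_fast_placeholder(p, n);
	return hash_djb2a(p, n);
}
```

Because the choice depends only on the flags, it can be made once per
file rather than once per line, which is what the two separate loops
in the patch do.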

The benchmark below compares my series with version v2.49.0
(built in build_release/ and build_v2.49.0/ respectively),
running log commands on the Linux kernel repository on 3
different machines.

$ BASE=/path/to/git/root

    // laptop
    // CPU: 6-core Intel Core i7-8750H (-MT MCP-) speed/min/max: 726/800/4100 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.419 s ±  0.166 s    [User: 10.097 s, System: 0.284 s]
      Range (min … max):   10.215 s … 10.680 s    10 runs

    Benchmark 2: /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.980 s ±  0.137 s    [User: 10.633 s, System: 0.308 s]
      Range (min … max):   10.791 s … 11.178 s    10 runs

    Summary
      /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.05 ± 0.02 times faster than /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // desktop
    // CPU: 8-core Intel Core i7-9700 (-MCP-) speed/min/max: 800/800/4700 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      6.823 s ±  0.020 s    [User: 6.624 s, System: 0.180 s]
      Range (min … max):    6.801 s …  6.858 s    10 runs

    Benchmark 2: /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      8.151 s ±  0.024 s    [User: 7.928 s, System: 0.198 s]
      Range (min … max):    8.105 s …  8.184 s    10 runs

    Summary
      /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.19 ± 0.01 times faster than /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // router
    // CPU: dual core Intel Celeron 3965U (-MCP-) speed/min/max: 1300/400/2200 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     21.209 s ±  0.054 s    [User: 20.341 s, System: 0.605 s]
      Range (min … max):   21.135 s … 21.309 s    10 runs

    Benchmark 2: /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     23.683 s ±  0.060 s    [User: 22.735 s, System: 0.672 s]
      Range (min … max):   23.566 s … 23.751 s    10 runs

    Summary
      /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.12 ± 0.00 times faster than /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
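
The "times faster" figures in the summaries above appear to be the
ratio of the two mean run times. The sketch below reproduces that
arithmetic; the function name speedup is made up for this example,
and reading the summary as a plain ratio of means is an assumption
based on the numbers shown.

```c
/* How many times faster the candidate is, given mean times in seconds. */
static double speedup(double baseline_mean_s, double candidate_mean_s)
{
	return baseline_mean_s / candidate_mean_s;
}
```

For instance, speedup(10.980, 10.419) gives the laptop ratio and
speedup(8.151, 6.823) gives the desktop ratio, using the means listed
above.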

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 rust/Cargo.lock       |  7 +++++++
 rust/xdiff/Cargo.toml |  1 +
 rust/xdiff/src/lib.rs |  7 +++++++
 xdiff/xprepare.c      | 19 +++++++++++++++++--
 4 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/rust/Cargo.lock b/rust/Cargo.lock
index fb1eac690b39..5f84617b1049 100644
--- a/rust/Cargo.lock
+++ b/rust/Cargo.lock
@@ -11,4 +11,11 @@ name = "xdiff"
 version = "0.1.0"
 dependencies = [
  "interop",
+ "xxhash-rust",
 ]
+
+[[package]]
+name = "xxhash-rust"
+version = "0.8.15"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "fdd20c5420375476fbd4394763288da7eb0cc0b8c11deed431a91562af7335d3"
diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
index eb7966aada64..1516e829db18 100644
--- a/rust/xdiff/Cargo.toml
+++ b/rust/xdiff/Cargo.toml
@@ -13,3 +13,4 @@ crate-type = ["staticlib", "rlib"]
 
 [dependencies]
 interop = { path = "../interop" }
+xxhash-rust = { version = "0.8.15", features = ["xxh3"] }
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index e69de29bb2d1..96975975a1ba 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -0,0 +1,7 @@
+
+
+#[no_mangle]
+unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
+    let slice = std::slice::from_raw_parts(ptr, size);
+    xxhash_rust::xxh3::xxh3_64(slice)
+}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index c44005e9bbb8..5a2e52f102cf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -160,6 +160,9 @@ static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
 }
 
 
+extern u64 xxh3_64(u8 const* ptr, usize size);
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
 	unsigned long *ha;
@@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdl_parse_lines(mf, narec, xdf);
 
+	if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
+		for (usize i = 0; i < (usize) xdf->nrec; i++) {
+			xrecord_t *rec = xdf->recs[i];
+			rec->ha = xxh3_64(rec->ptr, rec->size);
+		}
+	} else {
+		for (usize i = 0; i < (usize) xdf->nrec; i++) {
+			xrecord_t *rec = xdf->recs[i];
+			char const* dump = (char const*) rec->ptr;
+			rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
+		}
+	}
+
 	for (usize i = 0; i < (usize) xdf->nrec; i++) {
 		xrecord_t *rec = xdf->recs[i];
-		char const* dump = (char const*) rec->ptr;
-		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
 		xdl_classify_record(pass, cf, rec);
 	}
 
 
+
 	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
 		goto abort;
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH 7/7] github_workflows: install rust
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (5 preceding siblings ...)
  2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
@ 2025-07-17 20:32 ` Ezekiel Newren via GitGitGadget
  2025-07-17 21:23   ` brian m. carlson
  2025-07-19 21:54   ` Johannes Schindelin
  2025-07-17 21:51 ` [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification brian m. carlson
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-07-17 20:32 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Since we have introduced Rust, it needs to be installed for the
continuous integration build targets. Create an install script
(ci/install-rust.sh) that needs to be run as the same user that builds
Git. Because of the limitations of meson, also create build_rust.sh,
which centralizes how Rust is built between meson and make.

There are 2 interesting decisions worth calling out in this commit:

* The 'output' field of custom_target() does not allow specifying a
  file nested inside the build directory. Thus build_rust.sh builds
  Rust with all of its parameters and then moves libxdiff.a to the
  root of the build directory.

* Install curl, to facilitate the rustup install script.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml |  1 +
 .gitignore                 |  1 +
 Makefile                   | 46 +++++++++++++++++++----------
 build_rust.sh              | 59 ++++++++++++++++++++++++++++++++++++++
 ci/install-dependencies.sh | 14 ++++-----
 ci/install-rust.sh         | 33 +++++++++++++++++++++
 ci/lib.sh                  |  8 ++++++
 ci/make-test-artifacts.sh  |  7 +++++
 ci/run-build-and-tests.sh  | 10 +++++++
 meson.build                | 40 +++++++++++---------------
 10 files changed, 173 insertions(+), 46 deletions(-)
 create mode 100755 build_rust.sh
 create mode 100644 ci/install-rust.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 7dbf9f7f123c..8aac18a6ba45 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -4,6 +4,7 @@ on: [push, pull_request]
 
 env:
   DEVELOPER: 1
+  RUST_VERSION: 1.87.0
 
 # If more than one workflow run is triggered for the very same commit hash
 # (which happens when multiple branches pointing to the same commit), only
diff --git a/.gitignore b/.gitignore
index 04c444404e4b..a1c0d212541e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -254,3 +254,4 @@ Release/
 /contrib/buildsystems/out
 /contrib/libgit-rs/target
 /contrib/libgit-sys/target
+/rust/target
diff --git a/Makefile b/Makefile
index db39e6e1c28e..e659b6eefe82 100644
--- a/Makefile
+++ b/Makefile
@@ -919,11 +919,29 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+
+EXTLIBS =
+
 ifeq ($(DEBUG), 1)
-RUST_LIB = rust/target/debug/libxdiff.a
+  RUST_BUILD_MODE = debug
 else
-RUST_LIB = rust/target/release/libxdiff.a
+  RUST_BUILD_MODE = release
+endif
+
+RUST_TARGET_DIR = rust/target/$(RUST_BUILD_MODE)
+RUST_FLAGS_FOR_C = -L$(RUST_TARGET_DIR)
+
+.PHONY: compile_rust
+compile_rust:
+	./build_rust.sh . $(RUST_BUILD_MODE) xdiff
+
+EXTLIBS += ./$(RUST_TARGET_DIR)/libxdiff.a
+
+UNAME_S := $(shell uname -s)
+ifeq ($(UNAME_S),Linux)
+  EXTLIBS += -ldl
 endif
+
 REFTABLE_LIB = reftable/libreftable.a
 
 GENERATED_H += command-list.h
@@ -1395,9 +1413,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 
 # xdiff and reftable libs may in turn depend on what is in libgit.a
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
-EXTLIBS =
 
-GITLIBS += $(RUST_LIB)
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
 
@@ -2548,7 +2564,7 @@ git.sp git.s git.o: EXTRA_CPPFLAGS = \
 	'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
 	'-DGIT_INFO_PATH="$(infodir_relative_SQ)"'
 
-git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
+git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -2898,17 +2914,17 @@ headless-git.o: compat/win32/headless.c GIT-CFLAGS
 headless-git$X: headless-git.o git.res GIT-LDFLAGS
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) $(ALL_LDFLAGS) -mwindows -o $@ $< git.res
 
-git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
+git-%$X: %.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
-git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS)
+git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(IMAP_SEND_LDFLAGS) $(LIBS)
 
-git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
+git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(LIBS)
-git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
+git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
@@ -2918,11 +2934,11 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
 	ln -s $< $@ 2>/dev/null || \
 	cp $< $@
 
-$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS)
+$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
-scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS)
+scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -3309,7 +3325,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) $(UNIT_TEST_DIR)/test-lib.o
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3929,13 +3945,13 @@ FUZZ_CXXFLAGS ?= $(ALL_CFLAGS)
 .PHONY: fuzz-all
 fuzz-all: $(FUZZ_PROGRAMS)
 
-$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS
+$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(QUIET_LINK)$(FUZZ_CXX) $(FUZZ_CXXFLAGS) -o $@ $(ALL_LDFLAGS) \
 		-Wl,--allow-multiple-definition \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS) $(LIB_FUZZING_ENGINE)
 
 $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_OBJS) \
-	$(GITLIBS) GIT-LDFLAGS
+	$(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
@@ -3954,7 +3970,7 @@ $(UNIT_TEST_DIR)/clar.suite: $(UNIT_TEST_DIR)/clar-decls.h $(UNIT_TEST_DIR)/gene
 $(UNIT_TEST_DIR)/clar/clar.o: $(UNIT_TEST_DIR)/clar.suite
 $(CLAR_TEST_OBJS): $(UNIT_TEST_DIR)/clar-decls.h
 $(CLAR_TEST_OBJS): EXTRA_CPPFLAGS = -I$(UNIT_TEST_DIR)
-$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS
+$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
diff --git a/build_rust.sh b/build_rust.sh
new file mode 100755
index 000000000000..4c12135cd205
--- /dev/null
+++ b/build_rust.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+if [ -z "$CARGO_HOME" ]; then
+  export CARGO_HOME=$HOME/.cargo
+  echo >&2 "::warning:: CARGO_HOME is not set"
+fi
+echo "CARGO_HOME=$CARGO_HOME"
+
+rustc -vV
+cargo --version
+
+dir_git_root=${0%/*}
+dir_build=$1
+rust_target=$2
+crate=$3
+
+dir_rust=$dir_git_root/rust
+
+if [ "$dir_git_root" = "" ]; then
+  echo "did not specify the directory for the root of git"
+  exit 1
+fi
+
+if [ "$dir_build" = "" ]; then
+  echo "did not specify the build directory"
+  exit 1
+fi
+
+if [ "$rust_target" = "" ]; then
+  echo "did not specify the rust_target"
+  exit 1
+fi
+
+if [ "$rust_target" = "release" ]; then
+  rust_args="--release"
+  export RUSTFLAGS='-Aunused_imports -Adead_code'
+elif [ "$rust_target" = "debug" ]; then
+  rust_args=""
+  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
+else
+  echo "illegal rust_target value $rust_target"
+  exit 1
+fi
+
+cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
+
+libfile="lib${crate}.a"
+dst=$dir_build/$libfile
+
+if [ "$dir_git_root" != "$dir_build" ]; then
+  src=$dir_rust/target/$rust_target/$libfile
+  if [ ! -f $src ]; then
+    echo >&2 "::error:: cannot find path of static library"
+    exit 5
+  fi
+
+  rm $dst 2>/dev/null
+  mv $src $dst
+fi
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d061a4729339..7801075821ba 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -24,14 +24,14 @@ fi
 
 case "$distro" in
 alpine-*)
-	apk add --update shadow sudo meson ninja-build gcc libc-dev curl-dev openssl-dev expat-dev gettext \
+	apk add --update shadow sudo meson ninja-build gcc libc-dev curl curl-dev openssl-dev expat-dev gettext \
 		zlib-ng-dev pcre2-dev python3 musl-libintl perl-utils ncurses \
 		apache2 apache2-http2 apache2-proxy apache2-ssl apache2-webdav apr-util-dbd_sqlite3 \
 		bash cvs gnupg perl-cgi perl-dbd-sqlite perl-io-tty >/dev/null
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
@@ -55,8 +55,8 @@ ubuntu-*|i386/ubuntu-*|debian-*)
 	sudo apt-get -q update
 	sudo apt-get -q -y install \
 		$LANGUAGES apache2 cvs cvsps git gnupg $SVN \
-		make libssl-dev libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
-		tcl tk gettext zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
+		make libssl-dev curl libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
+		tcl tk gettext zlib1g zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
 		libemail-valid-perl libio-pty-perl libio-socket-ssl-perl libnet-smtp-ssl-perl libdbd-sqlite3-perl libcgi-pm-perl \
 		libsecret-1-dev libpcre2-dev meson ninja-build pkg-config \
 		${CC_PACKAGE:-${CC:-gcc}} $PYTHON_PACKAGE
@@ -121,13 +121,13 @@ ClangFormat)
 	;;
 StaticAnalysis)
 	sudo apt-get -q update
-	sudo apt-get -q -y install coccinelle libcurl4-openssl-dev libssl-dev \
+	sudo apt-get -q -y install coccinelle curl libcurl4-openssl-dev libssl-dev \
 		libexpat-dev gettext make
 	;;
 sparse)
 	sudo apt-get -q update -q
-	sudo apt-get -q -y install libssl-dev libcurl4-openssl-dev \
-		libexpat-dev gettext zlib1g-dev sparse
+	sudo apt-get -q -y install libssl-dev curl libcurl4-openssl-dev \
+		libexpat-dev gettext zlib1g zlib1g-dev sparse
 	;;
 Documentation)
 	sudo apt-get -q update
diff --git a/ci/install-rust.sh b/ci/install-rust.sh
new file mode 100644
index 000000000000..141ceddb17cf
--- /dev/null
+++ b/ci/install-rust.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+if [ "$(id -u)" -eq 0 ]; then
+  echo >&2 "::warning:: installing rust as root"
+fi
+
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::warning:: CARGO_HOME is not set"
+  export CARGO_HOME=$HOME/.cargo
+fi
+
+export RUSTUP_HOME=$CARGO_HOME
+
+if [ "$RUST_VERSION" = "" ]; then
+  echo >&2 "::error:: RUST_VERSION is not set"
+  exit 2
+fi
+
+## install rustup
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
+if [ ! -f $CARGO_HOME/env ]; then
+  echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
+fi
+## install a specific version of rust
+if [ "$BITNESS" = "32" ]; then
+  $CARGO_HOME/bin/rustup set default-host i686-unknown-linux-gnu || exit $?
+  $CARGO_HOME/bin/rustup install $RUST_VERSION || exit $?
+  $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
+else
+  $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
+fi
+
+. $CARGO_HOME/env
diff --git a/ci/lib.sh b/ci/lib.sh
index f561884d4016..ad0e49a68dcb 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,13 @@
 # Library of functions shared by all CI scripts
 
+
+export BITNESS="64"
+if command -v getconf >/dev/null && [ "$(getconf LONG_BIT 2>/dev/null)" = "32" ]; then
+  export BITNESS="32"
+fi
+echo "BITNESS=$BITNESS"
+
+
 if test true = "$GITHUB_ACTIONS"
 then
 	begin_group () {
diff --git a/ci/make-test-artifacts.sh b/ci/make-test-artifacts.sh
index 74141af0cc74..56aa7efb1d53 100755
--- a/ci/make-test-artifacts.sh
+++ b/ci/make-test-artifacts.sh
@@ -7,6 +7,13 @@ mkdir -p "$1" # in case ci/lib.sh decides to quit early
 
 . ${0%/*}/lib.sh
 
+## install rust per user rather than system wide
+. ${0%/*}/install-rust.sh
+
 group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
 
+if [ -d "$CARGO_HOME" ]; then
+  rm -rf $CARGO_HOME
+fi
+
 check_unignored_build_artifacts
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 01823fd0f140..dbab1cb2f936 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,6 +5,12 @@
 
 . ${0%/*}/lib.sh
 
+## install rust per user rather than system wide
+. ${0%/*}/install-rust.sh
+
+rustc -vV
+cargo --version || exit $?
+
 run_tests=t
 
 case "$jobname" in
@@ -72,5 +78,9 @@ case "$jobname" in
 	;;
 esac
 
+if [ -d "$CARGO_HOME" ]; then
+  rm -rf $CARGO_HOME
+fi
+
 check_unignored_build_artifacts
 save_good_tree
diff --git a/meson.build b/meson.build
index 2d8da17f6515..047d7e5b6630 100644
--- a/meson.build
+++ b/meson.build
@@ -277,26 +277,17 @@ else
   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
 endif
 
-
-rust_leaf = custom_target('rust_leaf',
+rust_build_xdiff = custom_target('rust_build_xdiff',
   output: 'libxdiff.a',
   build_by_default: true,
   build_always_stale: true,
-  command: ['cargo', 'build',
-            '--manifest-path', meson.project_source_root() / 'rust/Cargo.toml'
-  ] + rust_args,
-  env: {
-    'RUSTFLAGS': rustflags,
-  },
+  command: [
+    meson.project_source_root() / 'build_rust.sh',
+    meson.current_build_dir(), rust_target, 'xdiff',
+  ],
   install: false,
 )
 
-rust_xdiff_dep = declare_dependency(
-  link_args: ['-L' + meson.project_source_root() / 'rust/target' / rust_target, '-lxdiff'],
-#  include_directories: include_directories('xdiff/include'),  # Adjust if you expose headers
-)
-
-
 compiler = meson.get_compiler('c')
 
 libgit_sources = [
@@ -1707,17 +1698,18 @@ version_def_h = custom_target(
 )
 libgit_sources += version_def_h
 
-libgit_dependencies += rust_xdiff_dep
-
 libgit = declare_dependency(
-  link_with: static_library('git',
-    sources: libgit_sources,
-    c_args: libgit_c_args + [
-      '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
-    ],
-    dependencies: libgit_dependencies,
-    include_directories: libgit_include_directories,
-  ),
+  link_with: [
+    static_library('git',
+      sources: libgit_sources,
+      c_args: libgit_c_args + [
+        '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
+      ],
+      dependencies: libgit_dependencies,
+      include_directories: libgit_include_directories,
+    ),
+    rust_build_xdiff,
+  ],
   compile_args: libgit_c_args,
   dependencies: libgit_dependencies,
   include_directories: libgit_include_directories,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 198+ messages in thread

* Re: [PATCH 7/7] github_workflows: install rust
  2025-07-17 20:32 ` [PATCH 7/7] github_workflows: install rust Ezekiel Newren via GitGitGadget
@ 2025-07-17 21:23   ` brian m. carlson
  2025-07-18 23:01     ` Ezekiel Newren
  2025-07-19 21:54   ` Johannes Schindelin
  1 sibling, 1 reply; 198+ messages in thread
From: brian m. carlson @ 2025-07-17 21:23 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 1728 bytes --]

On 2025-07-17 at 20:32:24, Ezekiel Newren via GitGitGadget wrote:
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 7dbf9f7f123c..8aac18a6ba45 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -4,6 +4,7 @@ on: [push, pull_request]
>  
>  env:
>    DEVELOPER: 1
> +  RUST_VERSION: 1.87.0

Our discussed plan is to support the version in Debian stable, plus a
year.  So we'd be supporting 1.63.0 for a year after trixie's release.

The reason for that is that people build backports and security updates
for Git for stable releases of distros and they will use the distro
toolchain for doing so.  Forcing distros to constantly build with the
latest toolchain is pretty hostile, especially since the lifespan of
a Rust release is six weeks.

If the Rust project provides LTS releases in the future, then we can
consider adopting those.

> +if [ "$rust_target" = "release" ]; then
> +  rust_args="--release"
> +  export RUSTFLAGS='-Aunused_imports -Adead_code'
> +elif [ "$rust_target" = "debug" ]; then
> +  rust_args=""
> +  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'

Can you say a little about why these options are needed and the defaults
are inadequate?  For instance, I build with the default options both in
my personal projects and at work and don't see a problem.

I don't know if you plan to do this in a future series, but we'd also
want cargo's tests to be run as part of CI and we'd want a lint job that
ran clippy with both 1.63.0 and the latest stable version of Rust to
make sure things were tidy.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
@ 2025-07-17 21:30   ` brian m. carlson
  2025-07-17 21:54     ` Junio C Hamano
                       ` (3 more replies)
  2025-07-17 22:38   ` Taylor Blau
  1 sibling, 4 replies; 198+ messages in thread
From: brian m. carlson @ 2025-07-17 21:30 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 1369 bytes --]

On 2025-07-17 at 20:32:18, Ezekiel Newren via GitGitGadget wrote:
> diff --git a/rust/Cargo.lock b/rust/Cargo.lock
> new file mode 100644
> index 000000000000..fb1eac690b39
> --- /dev/null
> +++ b/rust/Cargo.lock
> @@ -0,0 +1,14 @@
> +# This file is automatically @generated by Cargo.
> +# It is not intended for manual editing.
> +version = 4
> +
> +[[package]]
> +name = "interop"
> +version = "0.1.0"
> +
> +[[package]]
> +name = "xdiff"
> +version = "0.1.0"
> +dependencies = [
> + "interop",
> +]

I would prefer that we not check in Cargo.lock in Git.  Part of the
reason is that it changes across versions and so building with a
different version of the toolchain can update the file.

In addition, as I mentioned downthread, because our intention is to
support the Debian stable toolchain for a year after the new stable
release, unless we are exceptionally careful about dependencies, we may
end up with a case where distros need to use older dependencies patched
for security but other users may want to update the versions to newer
dependencies with security fixes but that do not work on our pinned Rust
version.  We can't possibly satisfy both sets of people if we pin
dependencies in Cargo.lock, so we probably want to avoid checking it in
and ignore it instead.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (6 preceding siblings ...)
  2025-07-17 20:32 ` [PATCH 7/7] github_workflows: install rust Ezekiel Newren via GitGitGadget
@ 2025-07-17 21:51 ` brian m. carlson
  2025-07-17 22:25   ` Taylor Blau
  2025-07-18  9:23 ` Christian Brabandt
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 198+ messages in thread
From: brian m. carlson @ 2025-07-17 21:51 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 3285 bytes --]

On 2025-07-17 at 20:32:17, Ezekiel Newren via GitGitGadget wrote:
> This series accelerates xdiff by 5-19%.

That's great.

> It also introduces Rust as a hard dependency.

I think that's fine.  We already discussed doing this at the last
Contributor Summit in Berlin and everyone was in favour.  While we did
not have every contributor represented, I think that unanimity of the
contributors present is a compelling enough reason.

> …and it doesn’t yet pass a couple of the github workflows; hints from
> Windows experts, and opinions on ambiguous primitives would be appreciated
> (see below).
> 
> This is just the beginning of many patches that I have to convert portions
> of, maybe eventually all of, xdiff to Rust. While working on that
> conversion, I found several ways to clarify the code, along with some
> optimizations.
> 
> So...
> 
> This obviously raises the question of whether we are ready to accept a hard
> dependency on Rust. Previous discussions on the mailing list and at Git
> Merge 2024 have not answered that question. If not now, will we be willing
> to accept such a hard dependency later? And what route do we want to take to
> get there?

Again, I think that's fine.

I have a proposed policy at [0] (available from the `rust` branch on
that remote).  Included in that policy is a link to [1], which I
summarize as "the U.S. government is proposing to classify development
in memory unsafe languages as a Product Security Bad Practice."  The
proposal is that a memory safety roadmap be introduced by the end of
2025.

Now, do let me be clear that I don't agree with everything that the U.S.
government says or does (far from it), but I do think this is a sensible
proposal (or I wouldn't have cited it) and it will be showing up in a
lot more security standards coming down the line, especially for those
forges and companies which will be selling to governments around the
world.  There's no time like the present to do this.

I realize that that means that we will lose support for some platforms.
I ultimately think that it's up to the porters and maintainers for a
platform to maintain appropriate toolchains on that platform and that,
while we should be cognizant of the requirements for adding new
platforms or architectures, that shouldn't prevent the inclusion of
important new tools like memory-safe languages.

I would like to see a change to our platform policy and a policy on Rust
before we merge this.  I know your series adds support for 1.87, but
because distros don't run the latest toolchain, we had discussed in the
past targeting Debian stable's version plus an additional year after the
new release.  This means we'll support a Rust version for three years,
which is a reasonable amount of time for a toolchain and allows distros
to easily backport security fixes.

If you would like, you are welcome to use my proposed policy as a basis
for this, or I can send that out as a separate document if you don't
want to write one or revise mine.

[0] https://github.com/bk2204/git/commit/fbeb1180c7473635a964daed2da642c53487782d
[1] https://www.cisa.gov/resources-tools/resources/product-security-bad-practices
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA


* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 21:30   ` brian m. carlson
@ 2025-07-17 21:54     ` Junio C Hamano
  2025-07-17 22:39     ` Taylor Blau
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-17 21:54 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> +# This file is automatically @generated by Cargo.
>> +# It is not intended for manual editing.
>> +version = 4
>> +
>> +[[package]]
>> +name = "interop"
>> +version = "0.1.0"
>> +
>> +[[package]]
>> +name = "xdiff"
>> +version = "0.1.0"
>> +dependencies = [
>> + "interop",
>> +]
>
> I would prefer that we not check in Cargo.lock in Git.  Part of the
> reason is that it changes across versions and so building with a
> different version of the toolchain can update the file.
>
> In addition, as I mentioned downthread, because our intention is to
> support the Debian stable toolchain for a year after the new stable
> release, unless we are exceptionally careful about dependencies, we may
> end up with a case where distros need to use older dependencies patched
> for security but other users may want to update the versions to newer
> dependencies with security fixes but that do not work on our pinned Rust
> version.  We can't possibly satisfy both sets of people if we pin
> dependencies in Cargo.lock, so we probably want to avoid checking it in
> and ignore it instead.

Yup.  

The comment in first few lines of the file says it very well ;-)
Thanks for flagging it.



* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 21:51 ` [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification brian m. carlson
@ 2025-07-17 22:25   ` Taylor Blau
  2025-07-18  0:29     ` brian m. carlson
  2025-07-22 16:03     ` Sam James
  0 siblings, 2 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:25 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 09:51:32PM +0000, brian m. carlson wrote:
> On 2025-07-17 at 20:32:17, Ezekiel Newren via GitGitGadget wrote:
> > This series accelerates xdiff by 5-19%.
>
> That's great.
>
> > It also introduces Rust as a hard dependency.
>
> I think that's fine.  We already discussed doing this at the last
> Contributor Summit in Berlin and everyone was in favour.  While we did
> not have every contributor represented, I think that unanimity of the
> contributors present is a compelling enough reason.

I agree. I don't think that there is ever going to be a "perfect" time
to introduce a hard dependency on Rust, and I don't think that should
hold the project back from adopting it.

I am far from a Rust expert, but I think that a more modern, memory-safe
language will attract newer contributors who may have a fresher
perspective on the project, and I think that's a good thing.

The alternative, of course, is to continue to use C and not take any
dependency on Rust. I think there is a middle-ground in there somewhere
to be able to build with (e.g.) "make" or "make RUST=1", but I would
really like to see the project take a firmer stance here.

I worry that having build support for both "with Rust" and "C only" will
create a headache not just at the build system level, but also in the
code itself. Having a patchwork of features, optimizations, or bug fixes
that either are or aren't supported depending on whether Rust support
was specified at build-time seems like a worst-of-all-worlds outcome.

> I realize that that means that we will lose support for some platforms.
> I ultimately think that it's up to the porters and maintainers for a
> platform to maintain appropriate toolchains on that platform and that,
> while we should be cognizant of the requirements for adding new
> platforms or architectures, that shouldn't prevent the inclusion of
> important new tools like memory-safe languages.

Agreed. Of course, I think we would all like Git to be able to build and
run on as many platforms as is reasonably possible. But we cannot
support all platforms for all time. It is also not the Git project's
responsibility to ensure that every platform is Rust-friendly.

Hopefully the platforms that we currently support but won't after this
patch series have niche enough workloads that they do not need the
absolute latest-and-greatest Git release at all times.

> I would like to see a change to our platform policy and a policy on Rust
> before we merge this.  I know your series adds support for 1.87, but
> because distros don't run the latest toolchain, we had discussed in the
> past targeting Debian stable's version plus an additional year after the
> new release.  This means we'll support a Rust version for three years,
> which is a reasonable amount of time for a toolchain and allows distros
> to easily backport security fixes.
>
> If you would like, you are welcome to use my proposed policy as a basis
> for this, or I can send that out as a separate document if you don't
> want to write one or revise mine.

Yeah, I think that this is the most interesting part of the discussion
here. I am not knowledgeable enough about Rust's release cadence and
platform compatibility to have an opinion here. But I trust brian's
judgement ;-).

Thanks,
Taylor


* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
  2025-07-17 21:30   ` brian m. carlson
@ 2025-07-17 22:38   ` Taylor Blau
  1 sibling, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:38 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 08:32:18PM +0000, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Upcoming patches will accelerate and simplify xdiff, while also
> porting parts of it to Rust. In preparation, add some stubs and set up
> the Rust build. For now, it is easier to let cargo build rust and
> have make or meson merely link against the static library that cargo
> builds. In line with ongoing libification efforts, use multiple
> crates to allow more modularity on the Rust side. xdiff is the crate
> that this series will focus on, but we also introduce the interop
> crate for future patch series.
>
> In order to facilitate interoperability between C and Rust, introduce
> C definitions for Rust primitive types in git-compat-util.h.

Exciting ;-).

> diff --git a/Makefile b/Makefile
> index 70d1543b6b86..db39e6e1c28e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -919,6 +919,11 @@ TEST_SHELL_PATH = $(SHELL_PATH)
>
>  LIB_FILE = libgit.a
>  XDIFF_LIB = xdiff/lib.a
> +ifeq ($(DEBUG), 1)
> +RUST_LIB = rust/target/debug/libxdiff.a
> +else
> +RUST_LIB = rust/target/release/libxdiff.a
> +endif

We do have a DEBUG variable in our Makefile introduced via dce7d29551
(msvc: support building Git using MS Visual C++, 2019-06-25), but I
don't think that it is very widely used. Perhaps that is because I don't
build Git with MSVC, but I suspect that this is generally true.

Much more common is the DEVELOPER=1 setting, which adds more compiler
warnings and similar. I am not sure whether or not it would be
appropriate to use DEVELOPER here to determine which libxdiff.a to use.

In any event, our convention would be to treat the defined-ness of DEBUG
the same way that this patch treats DEBUG=1, so I might suggest
replacing your "ifeq" with "ifdef DEBUG".

>  REFTABLE_LIB = reftable/libreftable.a
>
>  GENERATED_H += command-list.h
> @@ -1392,6 +1397,8 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
>  GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
>  EXTLIBS =
>
> +GITLIBS += $(RUST_LIB)
> +
>  GIT_USER_AGENT = git/$(GIT_VERSION)
>
>  ifeq ($(wildcard sha1collisiondetection/lib/sha1.h),sha1collisiondetection/lib/sha1.h)
> @@ -2925,6 +2932,14 @@ $(LIB_FILE): $(LIB_OBJS)
>  $(XDIFF_LIB): $(XDIFF_OBJS)
>  	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
>
> +.PHONY: $(RUST_LIB)
> +$(RUST_LIB):
> +ifeq ($(DEBUG), 1)
> +	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose

A few thoughts here:

 - Does "cargo" support a flag similar to our -C? If so, I wonder if it
   might be worth writing "cargo -C rust build ..." instead of "cd rust
   && ...".

 - This conditional on DEBUG passes the "--verbose" option in both
   cases. Should we only pass the "--verbose" option when we have "V=1"?

 - Regardless of whether or not we condition passing "--release" (or
   not) on "DEBUG", this line should also be "ifdef DEBUG" similar to
   above.

> +else
> +	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose --release
> +endif
> +
>  $(REFTABLE_LIB): $(REFTABLE_OBJS)
>  	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
>
> @@ -3756,7 +3771,10 @@ cocciclean:
>  	$(RM) -r .build/contrib/coccinelle
>  	$(RM) contrib/coccinelle/*.cocci.patch
>
> -clean: profile-clean coverage-clean cocciclean
> +rustclean:

I'm nitpicking, and we don't *really* have a convention for whether to
hyphenate the names of these clean targets, as we have both
"profile-clean" and "cocciclean". I prefer the former, and think that it
would be nice to use that convention, but this is pretty much textbook
bike-shedding and not something that I really care about ;-).

> +	cd rust && cargo clean

Same question here about whether or not this could be written as "cargo
-C rust clean".

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 4678e21c4cb8..82dc99764ac0 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -196,6 +196,23 @@ static inline int is_xplatform_dir_sep(int c)
>  #include "compat/msvc.h"
>  #endif
>
> +/* rust types */
> +typedef uint8_t   u8;
> +typedef uint16_t  u16;
> +typedef uint32_t  u32;
> +typedef uint64_t  u64;
> +
> +typedef int8_t    i8;
> +typedef int16_t   i16;
> +typedef int32_t   i32;
> +typedef int64_t   i64;
> +
> +typedef float     f32;
> +typedef double    f64;
> +
> +typedef size_t    usize;
> +typedef ptrdiff_t isize;
> +

Makes sense. Should we also have "bool" here (assuming that the series
declaring the "use bool" experiment a success lands)? I guess maybe not
either way, since <stdbool.h> defines "bool" as the type name, identically
to Rust.

Thanks,
Taylor


* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 21:30   ` brian m. carlson
  2025-07-17 21:54     ` Junio C Hamano
@ 2025-07-17 22:39     ` Taylor Blau
  2025-07-18 23:15     ` Ezekiel Newren
  2025-07-22 22:02     ` Mike Hommey
  3 siblings, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:39 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 09:30:43PM +0000, brian m. carlson wrote:
> In addition, as I mentioned downthread, because our intention is to
> support the Debian stable toolchain for a year after the new stable
> release, unless we are exceptionally careful about dependencies, we may
> end up with a case where distros need to use older dependencies patched
> for security but other users may want to update the versions to newer
> dependencies with security fixes but that do not work on our pinned Rust
> version.

...or Debian users who have an older version of the toolchain installed
and get an unfriendly "cannot parse 'version = 4'" error when trying to
build with this patch series applied locally ;-).

Thanks,
Taylor


* Re: [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations
  2025-07-17 20:32 ` [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
@ 2025-07-17 22:41   ` Taylor Blau
  0 siblings, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:41 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 08:32:19PM +0000, Ezekiel Newren via GitGitGadget wrote:
> ---
>  xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
>  1 file changed, 50 insertions(+), 66 deletions(-)

Makes sense. Reviewing with "--color-moved" makes it straightforward to
see that the contents of xdl_prepare_env() were not modified by this
patch.

Thanks,
Taylor


* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
@ 2025-07-17 22:46   ` Taylor Blau
  2025-07-17 23:13     ` brian m. carlson
  2025-07-18 13:35   ` Phillip Wood
  2025-07-20  1:39   ` Johannes Schindelin
  2 siblings, 1 reply; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:46 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 08:32:21PM +0000, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> A few commits ago, we added definitions for Rust primitive types,
> to facilitate interoperability between C and Rust. Switch a
> few variables to use these types. Which, for now, will
> require adding some casts.

Hmm, interesting. I am not super familiar with how people typically
handle interoperability between C and Rust, but having to change types
on the C side to make it work with Rust is a bit surprising to me.

I would have expected that the Rust side would have declared its types
using libc::c_int, libc::size_t, and so on. I think I have a vague
preference towards putting the burden of casting on the Rust side, but,
again, I am not super familiar with how transitions like these are
typically approached.
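
To make the two options concrete, here is a minimal sketch and not code from the series. The function names `sum_bytes` and `sum_bytes_c_style` are hypothetical. The first uses fixed-width Rust types at the boundary, which is the direction the patch takes with its `u8`/`usize` typedefs on the C side. The second uses C's types (`c_char`, `c_long`) at the boundary and does the casting inside Rust. A real export would also need a `no_mangle` attribute, left out here to keep the sketch short.

```rust
use std::convert::TryFrom;
use std::ffi::{c_char, c_long};
use std::slice;

/// Boundary with fixed-width types. The matching C declaration would be
/// roughly `u64 sum_bytes(u8 const *ptr, usize len);` (hypothetical).
pub unsafe extern "C" fn sum_bytes(ptr: *const u8, len: usize) -> u64 {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller promises `ptr` is valid for `len` bytes.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

/// Boundary with C's own types. The casts now live on the Rust side, and
/// values that do not fit are reported as -1.
pub unsafe extern "C" fn sum_bytes_c_style(ptr: *const c_char, len: c_long) -> c_long {
    let len = match usize::try_from(len) {
        Ok(n) => n,
        Err(_) => return -1, // negative length
    };
    // SAFETY: same contract as `sum_bytes`.
    let total = unsafe { sum_bytes(ptr.cast::<u8>(), len) };
    c_long::try_from(total).unwrap_or(-1)
}
```

Both variants compute the same result; the difference is only where the range checks and pointer casts end up.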

> ---
>  xdiff/xdiffi.c    |  8 ++++----
>  xdiff/xemit.c     |  2 +-
>  xdiff/xmerge.c    | 14 +++++++-------
>  xdiff/xpatience.c |  2 +-
>  xdiff/xprepare.c  |  6 +++---
>  xdiff/xtypes.h    |  6 +++---
>  xdiff/xutils.c    |  4 ++--
>  7 files changed, 21 insertions(+), 21 deletions(-)

The rest of the patch looks good to me, assuming that the burden of
casting is placed on the C side.

Thanks,
Taylor


* Re: [PATCH 5/7] xdiff: separate parsing lines from hashing them
  2025-07-17 20:32 ` [PATCH 5/7] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
@ 2025-07-17 22:59   ` Taylor Blau
  2025-07-18 13:34   ` Phillip Wood
  1 sibling, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 22:59 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 08:32:22PM +0000, Ezekiel Newren via GitGitGadget wrote:
> ---
>  xdiff/xprepare.c | 75 ++++++++++++++++++++++++++++--------------------
>  1 file changed, 44 insertions(+), 31 deletions(-)

Not being all that familiar with the xdiff code, this patch took me a
little while longer to read and understand, but the transformation looks
correct to me.

Thanks,
Taylor


* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 22:46   ` Taylor Blau
@ 2025-07-17 23:13     ` brian m. carlson
  2025-07-17 23:37       ` Elijah Newren
  2025-07-18  0:21       ` Taylor Blau
  0 siblings, 2 replies; 198+ messages in thread
From: brian m. carlson @ 2025-07-17 23:13 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren


On 2025-07-17 at 22:46:56, Taylor Blau wrote:
> On Thu, Jul 17, 2025 at 08:32:21PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > A few commits ago, we added definitions for Rust primitive types,
> > to facilitate interoperability between C and Rust. Switch a
> > few variables to use these types. Which, for now, will
> > require adding some casts.
> 
> Hmm, interesting. I am not super familiar with how people typically
> handle interoperability between C and Rust, but having to change types
> on the C side to make it work with Rust is a bit surprising to me.
> 
> I would have expected that the Rust side would have declared its types
> using libc::c_int, libc::size_t, and so on. I think I have a vague
> preference towards putting the burden of casting on the Rust side, but,
> again, I am not super familiar with how transitions like these are
> typically approached.

Rust normally handles byte strings as slices or vectors of u8 (that is,
C's uint8_t).  C handles them as char, which may or may not be unsigned,
as we all know, which leads to some "entertaining" problems from time to
time.

Also, in general, Rust doesn't offer generic system-specific types, such
as `long`, except for C FFI.  This is actually a strong benefit, since
it means we're not inclined to write `unsigned long` and then wonder why
things are broken on Windows: instead, we write either `usize` (the
equivalent of `size_t`) or `u64` (for things like file sizes).  This is
much more ingrained than it is in Go, which has a tendency to use `int`
(Rust's `isize`) a lot and much less often specific types.
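
As a rough illustration of the point about types (a sketch, with hypothetical function names that are not part of the series): byte data is `&[u8]` and lengths are `usize` on every target, while the C-compatible aliases in `std::ffi` keep whatever width the platform's C ABI gives them.

```rust
use std::ffi::c_long;
use std::mem::size_of;

/// Count the lines in a buffer. The bytes are `u8` and the count is
/// `usize` regardless of platform; no `char` signedness is involved.
pub fn count_lines(buf: &[u8]) -> usize {
    let newlines = buf.iter().filter(|&&b| b == b'\n').count();
    // A final line without a trailing newline still counts as a line.
    match buf.last() {
        Some(&b) if b != b'\n' => newlines + 1,
        _ => newlines,
    }
}

/// Sizes of the fixed-width types: the same on every target.
pub fn fixed_widths() -> (usize, usize) {
    (size_of::<u8>(), size_of::<u64>())
}

/// Size of C's `long` as seen from Rust. This is 4 on 64-bit Windows
/// and 8 on 64-bit Linux, which is why it is kept to FFI signatures.
pub fn c_long_width() -> usize {
    size_of::<c_long>()
}
```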

If we're going to move this code entirely into Rust, then it makes sense
to cast temporarily, and I'm fine doing that in C, since it's C that has
the weird system-dependent behaviour (arbitrary decisions on the
signedness of char).  That actually allows us to have more confidence in
the safety and maintainability of the Rust code since it is less system
dependent and leave the suspect pieces in C.  It may, interestingly
enough, also allow us to easily get rid of the weird 2 GB limit on diffs
due to the unpleasant dependency on `int` in the xdiff code, which I
would absolutely love to see.
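
The 2 GB figure follows from the range of a 32-bit signed `int`. A small sketch, assuming `int` is 32 bits (true on the common platforms) and using a hypothetical helper name:

```rust
use std::convert::TryFrom;

/// Try to represent a byte length as a 32-bit signed integer, the usual
/// width of C's `int`. Anything at or above 2^31 bytes does not fit.
pub fn len_as_c_int(len: u64) -> Option<i32> {
    i32::try_from(len).ok()
}
```

With `usize` or `u64` lengths on the Rust side, the same value is representable without a range check of this kind.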

However, I'm not dead set against casting in Rust if that's what
everyone else wants instead.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA


* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
@ 2025-07-17 23:29   ` Taylor Blau
  2025-07-18 19:00   ` Junio C Hamano
  2025-07-19 21:53   ` Johannes Schindelin
  2 siblings, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-17 23:29 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 08:32:23PM +0000, Ezekiel Newren via GitGitGadget wrote:
>     // desktop
>     // CPU: 8-core Intel Core i7-9700 (-MCP-) speed/min/max: 800/800/4700 MHz
>     $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
>     Benchmark 1: /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
>       Time (mean ± σ):      6.823 s ±  0.020 s    [User: 6.624 s, System: 0.180 s]
>       Range (min … max):    6.801 s …  6.858 s    10 runs
>
>     Benchmark 2: /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
>       Time (mean ± σ):      8.151 s ±  0.024 s    [User: 7.928 s, System: 0.198 s]
>       Range (min … max):    8.105 s …  8.184 s    10 runs
>
>     Summary
>       /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
>         1.19 ± 0.01 times faster than /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

Very cool!

> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
>  rust/Cargo.lock       |  7 +++++++
>  rust/xdiff/Cargo.toml |  1 +
>  rust/xdiff/src/lib.rs |  7 +++++++
>  xdiff/xprepare.c      | 19 +++++++++++++++++--
>  4 files changed, 32 insertions(+), 2 deletions(-)

This patch is delightfully simple. Thank you for carefully preparing the
previous five patches to make this one as tiny as it is.

> @@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>
>  	xdl_parse_lines(mf, narec, xdf);
>
> +	if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {

It may be worth adding a comment here to explain why we aren't using
xdl_hash_record() when xpp->flags lacks XDF_WHITESPACE_FLAGS.
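
One plausible reason for the split, sketched here as an assumption and not as the author's stated rationale: a hash over the raw bytes cannot treat lines that differ only in whitespace as equal, so the whitespace-ignoring modes need a hash that normalizes the line first. The sketch uses the standard library's `DefaultHasher` as a stand-in for xxhash, and both function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hash the raw bytes of a line. `DefaultHasher` is only a stand-in for
/// the fast hash a real implementation would use.
pub fn hash_raw(line: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(line);
    h.finish()
}

/// Hash a line while skipping ASCII whitespace, a simplified stand-in
/// for an "ignore all whitespace" diff mode.
pub fn hash_ignoring_whitespace(line: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    for &b in line.iter().filter(|b| !b.is_ascii_whitespace()) {
        h.write_u8(b);
    }
    h.finish()
}
```

Two lines such as `int x;` and `int  x;` get different raw hashes but the same whitespace-insensitive hash, which is the property the whitespace flags rely on.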

(As a meta-note for reviewing this series, there are a handful of style
nits that I haven't mentioned, e.g., if (... == 0) instead of if (!...).
But since the xdiff code doesn't match the project's style conventions,
I have avoided mentioning it in my review.)

> +		for (usize i = 0; i < (usize) xdf->nrec; i++) {
> +			xrecord_t *rec = xdf->recs[i];
> +			rec->ha = xxh3_64(rec->ptr, rec->size);
> +		}
> +	} else {
> +		for (usize i = 0; i < (usize) xdf->nrec; i++) {
> +			xrecord_t *rec = xdf->recs[i];
> +			char const* dump = (char const*) rec->ptr;
> +			rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
> +		}
> +	}
> +
>  	for (usize i = 0; i < (usize) xdf->nrec; i++) {
>  		xrecord_t *rec = xdf->recs[i];
> -		char const* dump = (char const*) rec->ptr;
> -		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
>  		xdl_classify_record(pass, cf, rec);

I am curious why you are calling xdl_classify_record() here as a
post-processing step rather than inline with the hash calculation above.

Thanks,
Taylor


* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 23:13     ` brian m. carlson
@ 2025-07-17 23:37       ` Elijah Newren
  2025-07-18  0:23         ` Taylor Blau
  2025-07-18  0:21       ` Taylor Blau
  1 sibling, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-07-17 23:37 UTC (permalink / raw)
  To: brian m. carlson, Taylor Blau, Ezekiel Newren via GitGitGadget,
	git, Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 4:13 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-07-17 at 22:46:56, Taylor Blau wrote:
> > On Thu, Jul 17, 2025 at 08:32:21PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > > From: Ezekiel Newren <ezekielnewren@gmail.com>
> > >
> > > A few commits ago, we added definitions for Rust primitive types,
> > > to facilitate interoperability between C and Rust. Switch a
> > > few variables to use these types. Which, for now, will
> > > require adding some casts.
> >
> > Hmm, interesting. I am not super familiar with how people typically
> > handle interoperability between C and Rust, but having to change types
> > on the C side to make it work with Rust is a bit surprising to me.
> >
> > I would have expected that the Rust side would have declared its types
> > using libc::c_int, libc::size_t, and so on. I think I have a vague
> > preference towards putting the burden of casting on the Rust side, but,
> > again, I am not super familiar with how transitions like these are
> > typically approached.
>
> Rust normally handles byte strings as slices or vectors of u8 (that is,
> C's uint8_t).  C handles them as char, which may or may not be unsigned,
> as we all know, which leads to some "entertaining" problems from time to
> time.
>
> Also, in general, Rust doesn't offer generic system-specific types, such
> as `long`, except for C FFI.  This is actually a strong benefit, since
> it means we're not inclined to write `unsigned long` and then wonder why
> things are broken on Windows: instead, we write either `usize` (the
> equivalent of `size_t`) or `u64` (for things like file sizes).  This is
> much more ingrained than it is in Go, which has a tendency to use `int`
> (Rust's `isize`) a lot and much less often specific types.
>
> If we're going to move this code entirely into Rust, then it makes sense
> to cast temporarily, and I'm fine doing that in C, since it's C that has
> the weird system-dependent behaviour (arbitrary decisions on the
> signedness of char).  That actually allows us to have more confidence in
> the safety and maintainability of the Rust code since it is less system
> dependent and leave the suspect pieces in C.  It may, interestingly
> enough, also allow us to easily get rid of the weird 2 GB limit on diffs
> due to the unpleasant dependency on `int` in the xdiff code, which I
> would absolutely love to see.
>
> However, I'm not dead set against casting in Rust if that's what
> everyone else wants instead.

In general, I too would prefer to do the casting on the C side; after
all, part of the reason for Rust is the language safety, which we
compromise if we force it into using ambiguously sized variables.

However, I think it might be somewhat case-dependent...

Here, we have C calling into APIs that will be defined and implemented
in Rust.  Further along the road of adopting Rust in more places, we
may have future cases where we have Rust calling into APIs defined and
implemented in C.  I'm wondering if in such a world the rule of thumb
that makes the most sense would be to have a
caller-must-cast-as-necessary guideline, rather than specifying the
casting side by language.
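
A sketch of what a caller-must-cast rule could look like when Rust is the caller. Everything here is hypothetical: `count_char` stands in for a function with a C-style signature (a real one would be implemented in C and declared in an `extern "C"` block), and `count_byte` is the Rust caller that adapts its own types at the call site.

```rust
use std::convert::TryFrom;
use std::ffi::{c_char, c_int};

/// Stand-in for a callee with a C-style signature. Counts occurrences
/// of `needle` in the first `len` bytes at `buf`.
unsafe extern "C" fn count_char(buf: *const c_char, len: c_int, needle: c_char) -> c_int {
    let mut n: c_int = 0;
    for i in 0..len as isize {
        // SAFETY: the caller promises `buf` is valid for `len` bytes.
        if unsafe { *buf.offset(i) } == needle {
            n += 1;
        }
    }
    n
}

/// Caller written against Rust-native types (`&[u8]`, `usize`). The
/// conversions to the callee's types happen here, and a length that
/// does not fit in `c_int` is rejected instead of being truncated.
pub fn count_byte(buf: &[u8], needle: u8) -> Option<usize> {
    let len = c_int::try_from(buf.len()).ok()?;
    // SAFETY: pointer and length both come from the same live slice.
    let n = unsafe { count_char(buf.as_ptr().cast::<c_char>(), len, needle as c_char) };
    usize::try_from(n).ok()
}
```

The callee's signature stays untouched, and the range check sits next to the code that knows where the value came from.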


* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 23:13     ` brian m. carlson
  2025-07-17 23:37       ` Elijah Newren
@ 2025-07-18  0:21       ` Taylor Blau
  1 sibling, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-18  0:21 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 11:13:14PM +0000, brian m. carlson wrote:
> On 2025-07-17 at 22:46:56, Taylor Blau wrote:
> > On Thu, Jul 17, 2025 at 08:32:21PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > > From: Ezekiel Newren <ezekielnewren@gmail.com>
> > >
> > > A few commits ago, we added definitions for Rust primitive types,
> > > to facilitate interoperability between C and Rust. Switch a
> > > few variables to use these types. Which, for now, will
> > > require adding some casts.
> >
> > Hmm, interesting. I am not super familiar with how people typically
> > handle interoperability between C and Rust, but having to change types
> > on the C side to make it work with Rust is a bit surprising to me.
> >
> > I would have expected that the Rust side would have declared its types
> > using libc::c_int, libc::size_t, and so on. I think I have a vague
> > preference towards putting the burden of casting on the Rust side, but,
> > again, I am not super familiar with how transitions like these are
> > typically approached.
>
> Rust normally handles byte strings as slices or vectors of u8 (that is,
> C's uint8_t).  C handles them as char, which may or may not be unsigned,
> as we all know, which leads to some "entertaining" problems from time to
> time.

;-)

> Also, in general, Rust doesn't offer generic system-specific types, such
> as `long`, except for C FFI.  This is actually a strong benefit, since
> it means we're not inclined to write `unsigned long` and then wonder why
> things are broken on Windows: instead, we write either `usize` (the
> equivalent of `size_t`) or `u64` (for things like file sizes).  This is
> much more ingrained than it is in Go, which has a tendency to use `int`
> (Rust's `isize`) a lot and much less often specific types.
>
> If we're going to move this code entirely into Rust, then it makes sense
> to cast temporarily, and I'm fine doing that in C, since it's C that has
> the weird system-dependent behaviour (arbitrary decisions on the
> signedness of char).  That actually allows us to have more confidence in
> the safety and maintainability of the Rust code since it is less system
> dependent and leave the suspect pieces in C.  It may, interestingly
> enough, also allow us to easily get rid of the weird 2 GB limit on diffs
> due to the unpleasant dependency on `int` in the xdiff code, which I
> would absolutely love to see.

Ahhh. Thanks for the patient explanation. That makes a lot of sense, and
casting on the C side seems like the right approach for this spot.

Thanks,
Taylor


* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 23:37       ` Elijah Newren
@ 2025-07-18  0:23         ` Taylor Blau
  0 siblings, 0 replies; 198+ messages in thread
From: Taylor Blau @ 2025-07-18  0:23 UTC (permalink / raw)
  To: Elijah Newren
  Cc: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Ezekiel Newren

On Thu, Jul 17, 2025 at 04:37:24PM -0700, Elijah Newren wrote:
> > However, I'm not dead set against casting in Rust if that's what
> > everyone else wants instead.
>
> In general, I too would prefer to do the casting on the C side; after
> all, part of the reason for Rust is the language safety, which we
> compromise if we force it into using ambiguously sized variables.
>
> However, I think it might be somewhat case-dependent...
>
> Here, we have C calling into APIs that will be defined and implemented
> in Rust.  Further along the road of adopting Rust in more places, we
> may have future cases where we have Rust calling into APIs defined and
> implemented in C.  I'm wondering if in such a world the rule of thumb
> that makes the most sense would be to have a
> caller-must-cast-as-necessary guideline, rather than specifying the
> casting side by language.

Yeah, I think having some sort of general guidance here spelled out in
CodingGuidelines would be helpful. I don't think that it needs to
prescribe a specific rule, but having some of what you and brian wrote
above captured more permanently would give contributors more information
about why they may want to place the casts on one side versus the other.
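
To make the "caller must cast as necessary" idea concrete, here is a
minimal C sketch. It is only an illustration: count_high_bytes() is a
hypothetical stand-in for a function that would be implemented in Rust
and exported with fixed-width types, and count_high_bytes_in_buf() is a
hypothetical C caller that still holds a `char` buffer and a `long`
size. Neither name comes from the series.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a Rust-implemented function. Its signature
 * uses only unambiguous types, so it behaves the same everywhere.
 */
static uint64_t count_high_bytes(const uint8_t *ptr, size_t len)
{
	uint64_t n = 0;

	for (size_t i = 0; i < len; i++)
		if (ptr[i] >= 0x80)
			n++;
	return n;
}

/*
 * Hypothetical C caller. The signedness of plain `char` and the width
 * of `long` are platform decisions, so the conversion happens here, at
 * the call site, and the callee never sees the ambiguous types.
 */
static uint64_t count_high_bytes_in_buf(const char *buf, long size)
{
	if (size < 0)
		return 0;
	return count_high_bytes((const uint8_t *)buf, (size_t)size);
}
```

If the direction were reversed (Rust calling an API defined in C), the
same guideline would put the conversion in the Rust caller instead.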

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 22:25   ` Taylor Blau
@ 2025-07-18  0:29     ` brian m. carlson
  2025-07-22 12:21       ` Patrick Steinhardt
  2025-07-22 16:03     ` Sam James
  1 sibling, 1 reply; 198+ messages in thread
From: brian m. carlson @ 2025-07-18  0:29 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 4575 bytes --]

On 2025-07-17 at 22:25:23, Taylor Blau wrote:
> I agree. I don't think that there is ever going to be a "perfect" time
> to introduce a hard dependency on Rust, and I don't think that should
> hold the project back from adopting it.
> 
> I am far from a Rust expert, but I think that a more modern, memory-safe
> language will attract newer contributors who may have a fresher
> perspective on the project, and I think that's a good thing.

Yes, I think that's true.  Rust is by far the most admired programming
language to work with, according to the 2024 Stack Overflow Developer
Survey.  We will likely attract new contributors who find C intimidating
or a bit of a hassle[0] but are excited about working on Rust,
especially in a project as compelling as Git[1].

> The alternative, of course, is to continue to use C and not take any
> dependency on Rust. I think there is a middle-ground in there somewhere
> to be able to build with (e.g.) "make" or "make RUST=1", but I would
> really like to see the project take a firmer stance here.
> 
> I worry that having build support for both "with Rust" and "C only" will
> create a headache not just at the build system level, but also in the
> code itself. Having a patchwork of features, optimizations, or bug fixes
> that either are or aren't supported depending on whether Rust support
> was specified at build-time seems like a worst-of-all-worlds outcome.

I definitely agree.  I already find it terribly inconvenient when I end
up with a `git grep` that doesn't support `-P`, and I imagine that having
lots of features that weren't available would be bothersome.

I also think that using a combination of C and Rust will end up with us
still writing a lot of unsafe Rust code to interoperate with C.  If we
want to reap the benefits in terms of memory and thread safety[2], we'll
be better off sticking with just Rust.

I will also say that while it may be more challenging to compile Git at
first on Windows, as we move more towards an all-Rust codebase, Git may
end up being easier to maintain there as we depend more on the standard
library.

> Agreed. Of course, I think we would all like Git to be able to build and
> run on as many platforms as is reasonably possible. But we cannot
> support all platforms for all time. It is also not the Git project's
> responsibility to ensure that every platform is Rust-friendly.
> 
> Hopefully the platforms that we currently support but won't after this
> patch series have niche enough workloads that they do not need the
> absolute latest-and-greatest Git release at all times.

I will also point out that many OS and CPU architectures are actually
supported in Rust upstream.  `rustc --print target-list` includes things
like the following:

* m68k-unknown-linux-gnu (Amiga and other 68000 processors on Linux)
* wasm32-unknown-unknown (Git in your browser?)
* armv7a-nuttx-eabi (ARM processors running the embedded NuttX OS)
* x86_64-pc-cygwin (Cygwin[3])
* sparc64-unknown-openbsd (OpenBSD on UltraSPARC)

All Debian release architectures are supported, for instance, as well as
several non-release architectures.  The only Debian architectures that I
don't believe are supported are alpha, hppa, ia64, sh4, and x32 (which
is an amd64 variant that can run amd64 code just fine).

> Yeah, I think that this is the most interesting part of the discussion
> here. I am not knowledgeable enough about Rust's release cadence and
> platform compatibility to have an opinion here. But I trust brian's
> judgement ;-).

I'll see what Ezekiel thinks about this and I can send out a patch for
review if that's desired.

[0] I've been writing C for three-quarters of my life and I still find
debugging segfaults and other memory problems to be annoying and
tiresome, so I'm very interested in getting out of that business while
working on Git.  While the limiting factor for my contributions to Git
is often time, I would feel more excited about working on Git in Rust
than in C and I'm confident I'd write better quality code and more unit
tests as well (which benefits both us and our users).
[1] I definitely think there's a cool factor to working on Git.
[2] Having more thread-safe code might allow us to more easily add
threading to other parts of Git that would benefit from it and thus
improve performance in many cases.
[3] This was missing for a long time, but it's finally here now, which
is also good news for Git for Windows.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (7 preceding siblings ...)
  2025-07-17 21:51 ` [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification brian m. carlson
@ 2025-07-18  9:23 ` Christian Brabandt
  2025-07-18 16:26   ` Junio C Hamano
  2025-07-18 13:34 ` Phillip Wood
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 198+ messages in thread
From: Christian Brabandt @ 2025-07-18  9:23 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren


On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:

> This series accelerates xdiff by 5-19%.
> 
> It also introduces Rust as a hard dependency.
> 
> …and it doesn’t yet pass a couple of the github workflows; hints from
> Windows experts, and opinions on ambiguous primitives would be appreciated
> (see below).
> 
> This is just the beginning of many patches that I have to convert portions
> of, maybe eventually all of, xdiff to Rust. While working on that
> conversion, I found several ways to clarify the code, along with some
> optimizations.

Just a quick heads-up: we (as in Vim/Neovim) have been using Git's xdiff
library in Vim and Neovim.

Is the plan to get rid of xdiff's C source completely and replace it with
a Rust implementation?

Thanks,
Chris
-- 
A good position is better than any work.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (8 preceding siblings ...)
  2025-07-18  9:23 ` Christian Brabandt
@ 2025-07-18 13:34 ` Phillip Wood
  2025-07-18 21:25   ` Eli Schwartz
  2025-07-18 14:38 ` Junio C Hamano
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-07-18 13:34 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget, git
  Cc: Elijah Newren, Ezekiel Newren, Edward Thomson, brian m. carlson,
	Taylor Blau

Hi Ezekiel

Thanks for working on this

On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> This series accelerates xdiff by 5-19%.
> 
> It also introduces Rust as a hard dependency.
> 
> …and it doesn’t yet pass a couple of the github workflows; hints from
> Windows experts, and opinions on ambiguous primitives would be appreciated
> (see below).
> 
> This is just the beginning of many patches that I have to convert portions
> of, maybe eventually all of, xdiff to Rust. While working on that
> conversion, I found several ways to clarify the code, along with some
> optimizations.
> 
> So...
> 
> This obviously raises the question of whether we are ready to accept a hard
> dependency on Rust. Previous discussions on the mailing list and at Git
> Merge 2024 have not answered that question. If not now, will we be willing
> to accept such a hard dependency later? And what route do we want to take to
> get there?

As far as git goes I think introducing a hard dependency on rust is 
fine. It is widely supported, the only issue I'm aware of is the lack of 
support on NonStop and I don't think it is reasonable for such a 
minority platform to hold the rest of the project to ransom. There is a 
question about the other users of the xdiff code though. libgit2 carries 
a copy as do other projects like neovim. I've cc'd the libgit2 
maintainer and posted a link to this thread in neovim github [1]

I've left a few comments on the patches

Thanks

Phillip

[1] https://github.com/neovim/neovim/discussions/34987

> About the optimizations in this series:
> 
> 1. xdiff currently uses DJB2a for hashing (even though it is not explicitly named as such). This is an older hashing algorithm, and modern alternatives are superior. I chose xxhash because it’s faster, more collision resistant, and designed to be a standard. Other hash algorithms like aHash, MurMurHash, SipHash, and Fnv1a were considered, but my local testing made me feel like xxhash was the best choice for usage in xdiff.
> 
> 2. In support of switching to xxhash, parsing and hashing were split into separate steps. And it turns out that memchr() is faster for parsing than character-by-character iteration.
> 
> 
> About the workflow builds/tests that aren’t working with this series:
> 
> 1. Windows fails to build. I don’t know which rust toolchain is even correct for this or if multiple are needed.  Example failed build: https://github.com/git/git/actions/runs/16353209191
> 
> 2. I386/ubuntu:focal will build, but fails the tests. The kernel reports the bitness as 64 despite the container being 32. I believe the issue is that C uses ambiguous primitives (which differ in size between platforms). The new code should use unambiguous primitives from Rust (u32, u64, etc.) rather than perpetuating ambiguous primitive types.  Since the current xdiff API hardcodes the ambiguous types, though, those places will need to be migrated to unambiguous primitives. Much of the C code needs a slight refactor to be compatible with the Rust FFI and usually requires converting ambiguous to unambiguous types. What does this community think of this approach?
> 
> 
> My brother (Elijah, cc’ed) has been guiding and reviewing my work here.
> 
> Ezekiel Newren (7):
>    xdiff: introduce rust
>    xdiff/xprepare: remove superfluous forward declarations
>    xdiff: delete unnecessary fields from xrecord_t and xdfile_t
>    xdiff: make fields of xrecord_t Rust friendly
>    xdiff: separate parsing lines from hashing them
>    xdiff: conditionally use Rust's implementation of xxhash
>    github_workflows: install rust
> 
>   .github/workflows/main.yml |   1 +
>   .gitignore                 |   1 +
>   Makefile                   |  60 +++++++---
>   build_rust.sh              |  59 ++++++++++
>   ci/install-dependencies.sh |  14 +--
>   ci/install-rust.sh         |  33 ++++++
>   ci/lib.sh                  |   8 ++
>   ci/make-test-artifacts.sh  |   7 ++
>   ci/run-build-and-tests.sh  |  10 ++
>   git-compat-util.h          |  17 +++
>   meson.build                |  40 +++++--
>   rust/Cargo.lock            |  21 ++++
>   rust/Cargo.toml            |   6 +
>   rust/interop/Cargo.toml    |  14 +++
>   rust/interop/src/lib.rs    |   0
>   rust/xdiff/Cargo.toml      |  16 +++
>   rust/xdiff/src/lib.rs      |   7 ++
>   xdiff/xdiffi.c             |   8 +-
>   xdiff/xemit.c              |   2 +-
>   xdiff/xmerge.c             |  14 +--
>   xdiff/xpatience.c          |   2 +-
>   xdiff/xprepare.c           | 226 ++++++++++++++++++-------------------
>   xdiff/xtypes.h             |   9 +-
>   xdiff/xutils.c             |   4 +-
>   24 files changed, 414 insertions(+), 165 deletions(-)
>   create mode 100755 build_rust.sh
>   create mode 100644 ci/install-rust.sh
>   create mode 100644 rust/Cargo.lock
>   create mode 100644 rust/Cargo.toml
>   create mode 100644 rust/interop/Cargo.toml
>   create mode 100644 rust/interop/src/lib.rs
>   create mode 100644 rust/xdiff/Cargo.toml
>   create mode 100644 rust/xdiff/src/lib.rs
> 
> 
> base-commit: 16bd9f20a403117f2e0d9bcda6c6e621d3763e77
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1980%2Fezekielnewren%2Fxdiff_rust_speedup-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1980/ezekielnewren/xdiff_rust_speedup-v1
> Pull-Request: https://github.com/git/git/pull/1980


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 5/7] xdiff: separate parsing lines from hashing them
  2025-07-17 20:32 ` [PATCH 5/7] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
  2025-07-17 22:59   ` Taylor Blau
@ 2025-07-18 13:34   ` Phillip Wood
  1 sibling, 0 replies; 198+ messages in thread
From: Phillip Wood @ 2025-07-18 13:34 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget, git
  Cc: Elijah Newren, Ezekiel Newren, brian m. carlson, Taylor Blau

Hi Ezekiel

On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
> 
> We want to use xxhash for faster hashing. To facilitate that
> and to simplify the code, separate the concerns of parsing
> and hashing into discrete steps. This makes swapping the hash
> function much easier. Since xdl_hash_record() both parses and
> hashes lines, this requires some slight code restructuring.

That makes sense though unfortunately we seem to have lost some error 
handling in the restructuring. How much does this extra pass over the 
input data slow down the cases that don't end up using xxhash?

> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
>   xdiff/xprepare.c | 75 ++++++++++++++++++++++++++++--------------------
>   1 file changed, 44 insertions(+), 31 deletions(-)
> 
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 747268e4fdf7..c44005e9bbb8 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -129,13 +129,39 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>   }
>   
>   
> +static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
> +	u8 const* ptr = (u8 const*) mf->ptr;
> +	usize len = (usize) mf->size;
> +
> +	xdf->recs = NULL;
> +	xdf->nrec = 0;
> +	XDL_ALLOC_ARRAY(xdf->recs, narec);

This should return an error if the allocation fails like the original code. 
Although that does not make any difference for git a number of other 
projects such as libgit2 carry a copy of our xdiff code and want to be 
able to handle allocation failures.

> +	while (len > 0) {
> +		xrecord_t *rec = NULL;
> +		usize length;
> +		u8 const* result = memchr(ptr, '\n', len);
> +		if (result) {
> +			length = result - ptr + 1;
> +		} else {
> +			length = len;
> +		}
> +		if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
> +			die("XDL_ALLOC_GROW failed");

We should return an error rather than dying here

> +		rec = xdl_cha_alloc(&xdf->rcha);

We should return an error if the call fails like the original code

> +		rec->ptr = ptr;
> +		rec->size = length;
> +		rec->ha = 0;
> +		xdf->recs[xdf->nrec++] = rec;
> +		ptr += length;
> +		len -= length;
> +	}
> +
> +}
> +
> +
>   static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
>   			   xdlclassifier_t *cf, xdfile_t *xdf) {
> -	long nrec, bsize;
> -	unsigned long hav;
> -	char const *blk, *cur, *top, *prev;
> -	xrecord_t *crec;
> -	xrecord_t **recs;
>   	unsigned long *ha;
>   	char *rchg;
>   	long *rindex;
> @@ -143,50 +169,37 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>   	ha = NULL;
>   	rindex = NULL;
>   	rchg = NULL;
> -	recs = NULL;
>   
>   	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
>   		goto abort;
> -	if (!XDL_ALLOC_ARRAY(recs, narec))
> -		goto abort;
>   
> -	nrec = 0;
> -	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
> -		for (top = blk + bsize; cur < top; ) {
> -			prev = cur;
> -			hav = xdl_hash_record(&cur, top, xpp->flags);
> -			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
> -				goto abort;
> -			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
> -				goto abort;
> -			crec->ptr = (u8 const*) prev;
> -			crec->size = (long) (cur - prev);
> -			crec->ha = hav;
> -			recs[nrec++] = crec;
> -			if (xdl_classify_record(pass, cf, crec) < 0)
> -				goto abort;
> -		}
> +	xdl_parse_lines(mf, narec, xdf);
> +
> +	for (usize i = 0; i < (usize) xdf->nrec; i++) {
> +		xrecord_t *rec = xdf->recs[i];
> +		char const* dump = (char const*) rec->ptr;
> +		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);

I think we should update xdl_hash_record() to stop updating dump and use 
the length from xdl_parse_lines(). Now that we parse the lines before 
hashing we should use that length in the hash function so we have a 
single definition of line length.

> +		xdl_classify_record(pass, cf, rec);

We should return an error if this call fails like the original code

Thanks

Phillip

>   	}
>   
> -	if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
> +
> +	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
>   		goto abort;
>   
>   	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
>   	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
> -		if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
> +		if (!XDL_ALLOC_ARRAY(rindex, xdf->nrec + 1))
>   			goto abort;
> -		if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
> +		if (!XDL_ALLOC_ARRAY(ha, xdf->nrec + 1))
>   			goto abort;
>   	}
>   
> -	xdf->nrec = nrec;
> -	xdf->recs = recs;
>   	xdf->rchg = rchg + 1;
>   	xdf->rindex = rindex;
>   	xdf->nreff = 0;
>   	xdf->ha = ha;
>   	xdf->dstart = 0;
> -	xdf->dend = nrec - 1;
> +	xdf->dend = xdf->nrec - 1;
>   
>   	return 0;
>   
> @@ -194,7 +207,7 @@ abort:
>   	xdl_free(ha);
>   	xdl_free(rindex);
>   	xdl_free(rchg);
> -	xdl_free(recs);
> +	xdl_free(xdf->recs);
>   	xdl_cha_free(&xdf->rcha);
>   	return -1;
>   }
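
Pulling the comments above together, here is a minimal standalone
sketch of a line parser that reports allocation failure to its caller
instead of dying. It is not a drop-in replacement for the patch:
struct line_rec, struct line_table and parse_lines() are hypothetical
simplifications of xrecord_t, xdfile_t and xdl_parse_lines(), and plain
realloc() stands in for XDL_ALLOC_GROW() and the chastore allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for xrecord_t and xdfile_t. */
struct line_rec {
	const unsigned char *ptr;
	size_t size;
};

struct line_table {
	struct line_rec *recs;
	size_t nrec;
	size_t alloc;
};

/*
 * Split buf into lines; each line keeps its trailing '\n' if it has one.
 * Returns 0 on success and -1 if an allocation fails. On failure the
 * table is left empty, so the caller can take its usual abort path.
 */
static int parse_lines(const unsigned char *buf, size_t len,
		       struct line_table *t)
{
	t->recs = NULL;
	t->nrec = 0;
	t->alloc = 0;

	while (len > 0) {
		const unsigned char *nl = memchr(buf, '\n', len);
		size_t length = nl ? (size_t)(nl - buf) + 1 : len;

		if (t->nrec == t->alloc) {
			size_t new_alloc = t->alloc ? t->alloc * 2 : 16;
			struct line_rec *grown =
				realloc(t->recs, new_alloc * sizeof(*grown));

			if (!grown) {
				/* Release what we have and report the error. */
				free(t->recs);
				t->recs = NULL;
				t->nrec = 0;
				t->alloc = 0;
				return -1;
			}
			t->recs = grown;
			t->alloc = new_alloc;
		}

		t->recs[t->nrec].ptr = buf;
		t->recs[t->nrec].size = length;
		t->nrec++;
		buf += length;
		len -= length;
	}
	return 0;
}
```

In the real code the caller would presumably do something like
`if (xdl_parse_lines(...) < 0) goto abort;`, which keeps the existing
cleanup path in xdl_prepare_ctx() intact for projects that need to
handle allocation failures.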


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
  2025-07-17 22:46   ` Taylor Blau
@ 2025-07-18 13:35   ` Phillip Wood
  2025-07-28 19:34     ` Ezekiel Newren
  2025-07-20  1:39   ` Johannes Schindelin
  2 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-07-18 13:35 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget, git
  Cc: Elijah Newren, Ezekiel Newren, brian m. carlson, Taylor Blau

Hi Ezekiel

On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
> 
> A few commits ago, we added definitions for Rust primitive types,
> to facilitate interoperability between C and Rust. Switch a
> few variables to use these types, which, for now, will
> require adding some casts.

How necessary is it to change 'char' to 'u8' so long as the Rust and C 
sides both use a type that is the same size? Also, what's the advantage 
of using these typedefs rather than the normal C types like uint8_t?

> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 5a96e36dfbea..3b364c61f671 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
>   	long i;
>   	int ret = 0;
>   
> -	for (i = 0; i < rec->size; i++) {
> +	for (i = 0; i < (long) rec->size; i++) {

i is a loop counter and array index so we can lose this cast by 
changing i to size_t

Thanks

Phillip

>   		char c = rec->ptr[i];
>   
>   		if (!XDL_ISSPACE(c))
> @@ -1005,11 +1005,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>   
>   		rec = &xe->xdf1.recs[xch->i1];
>   		for (i = 0; i < xch->chg1 && ignore; i++)
> -			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
> +			ignore = xdl_blankline((const char*) rec[i]->ptr, rec[i]->size, flags);
>   
>   		rec = &xe->xdf2.recs[xch->i2];
>   		for (i = 0; i < xch->chg2 && ignore; i++)
> -			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
> +			ignore = xdl_blankline((const char*)rec[i]->ptr, rec[i]->size, flags);
>   
>   		xch->ignore = ignore;
>   	}
> @@ -1020,7 +1020,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
>   	size_t i;
>   
>   	for (i = 0; i < xpp->ignore_regex_nr; i++)
> -		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
> +		if (!regexec_buf(xpp->ignore_regex[i], (const char*) rec->ptr, rec->size, 1,
>   				 &regmatch, 0))
>   			return 1;
>   
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 1d40c9cb4076..bbf7b7f8c862 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -24,7 +24,7 @@
>   
>   static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
>   
> -	*rec = xdf->recs[ri]->ptr;
> +	*rec = (char const*) xdf->recs[ri]->ptr;
>   
>   	return xdf->recs[ri]->size;
>   }
> diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
> index af40c88a5b36..6fa6ea61a208 100644
> --- a/xdiff/xmerge.c
> +++ b/xdiff/xmerge.c
> @@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>   	xrecord_t **rec2 = xe2->xdf2.recs + i2;
>   
>   	for (i = 0; i < line_count; i++) {
> -		int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
> -			rec2[i]->ptr, rec2[i]->size, flags);
> +		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
> +			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
>   		if (!result)
>   			return -1;
>   	}
> @@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
>   
>   static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
>   {
> -	return xdl_recmatch(rec1->ptr, rec1->size,
> -			    rec2->ptr, rec2->size, flags);
> +	return xdl_recmatch((char const*) rec1->ptr, rec1->size,
> +			    (char const*) rec2->ptr, rec2->size, flags);
>   }
>   
>   /*
> @@ -383,10 +383,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>   		 */
>   		t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
>   		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
> -			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
> +			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - (u8 const*) t1.ptr;
>   		t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
>   		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
> -			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
> +			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - (u8 const*) t2.ptr;
>   		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
>   			return -1;
>   		if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
> @@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
>   static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
>   {
>   	for (; chg; chg--, i++)
> -		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
> +		if (line_contains_alnum((char const*) xe->xdf2.recs[i]->ptr,
>   				xe->xdf2.recs[i]->size))
>   			return 1;
>   	return 0;
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index 77dc411d1937..986a3a3f749a 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
>   		return;
>   	map->entries[index].line1 = line;
>   	map->entries[index].hash = record->ha;
> -	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
> +	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.recs[line - 1]->ptr);
>   	if (!map->first)
>   		map->first = map->entries + index;
>   	if (map->last) {
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index ad356281f939..747268e4fdf7 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -96,12 +96,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>   	char const *line;
>   	xdlclass_t *rcrec;
>   
> -	line = rec->ptr;
> +	line = (char const*) rec->ptr;
>   	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
>   	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
>   		if (rcrec->ha == rec->ha &&
>   				xdl_recmatch(rcrec->line, rcrec->size,
> -					rec->ptr, rec->size, cf->flags))
> +					(const char*) rec->ptr, rec->size, cf->flags))
>   			break;
>   
>   	if (!rcrec) {
> @@ -159,7 +159,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>   				goto abort;
>   			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
>   				goto abort;
> -			crec->ptr = prev;
> +			crec->ptr = (u8 const*) prev;
>   			crec->size = (long) (cur - prev);
>   			crec->ha = hav;
>   			recs[nrec++] = crec;
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 8b8467360ecf..6e5f67ebf380 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -39,9 +39,9 @@ typedef struct s_chastore {
>   } chastore_t;
>   
>   typedef struct s_xrecord {
> -	char const *ptr;
> -	long size;
> -	unsigned long ha;
> +	u8 const* ptr;
> +	usize size;
> +	u64 ha;
>   } xrecord_t;
>   
>   typedef struct s_xdfile {
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 444a108f87c0..10e4f20b7c31 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -418,10 +418,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
>   
>   	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
>   	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
> -		diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
> +		diff_env->xdf1.recs[line1 + count1 - 2]->size - (u8 const*) subfile1.ptr;
>   	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
>   	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
> -		diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
> +		diff_env->xdf2.recs[line2 + count2 - 2]->size - (u8 const*) subfile2.ptr;
>   	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
>   		return -1;
>   
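
For reference, a small sketch of what the two review points could look
like together. It assumes the Rust-style names are thin aliases for the
<stdint.h> and <stddef.h> types; the series' real definitions live in
git-compat-util.h and may differ. struct rec_sketch and
leading_blanks() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed aliases; the definitions in the series may differ. */
typedef uint8_t  u8;
typedef uint64_t u64;
typedef size_t   usize;

/* Compile-time checks that the aliases have the sizes Rust expects. */
_Static_assert(sizeof(u8) == 1, "u8 must be exactly one byte");
_Static_assert(sizeof(u64) == 8, "u64 must be exactly eight bytes");
_Static_assert(sizeof(usize) == sizeof(void *),
	       "usize is assumed to be pointer-sized");

/* Hypothetical mirror of the new xrecord_t layout. */
struct rec_sketch {
	u8 const *ptr;
	usize size;
	u64 ha;
};

/*
 * Count leading spaces and tabs. The loop counter has the same type as
 * rec->size, so no cast is needed in the loop condition.
 */
static usize leading_blanks(const struct rec_sketch *rec)
{
	usize i;

	for (i = 0; i < rec->size; i++)
		if (rec->ptr[i] != ' ' && rec->ptr[i] != '\t')
			break;
	return i;
}
```

With aliases like these, `u8` and `uint8_t` are the same type to the
compiler, so the choice between them is about naming consistency with
the Rust side rather than about behaviour.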


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (9 preceding siblings ...)
  2025-07-18 13:34 ` Phillip Wood
@ 2025-07-18 14:38 ` Junio C Hamano
  2025-07-18 21:56   ` Ezekiel Newren
  2025-07-21 10:14   ` Phillip Wood
  2025-07-19 21:53 ` Johannes Schindelin
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
  12 siblings, 2 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-18 14:38 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This series accelerates xdiff by 5-19%.

;-)

Do we know how much of that can be attributed to the hash algorithm
difference, and how much to the change of language?

The earlier parts of the series that trim unused code and refactor
look to me like good changes regardless of whether we
introduce a different hash algorithm, and/or we use an
implementation of that different hash algorithm written in Rust.
IOW, even if neither of these two happens, I would think that the
earlier parts are independently good pieces.

Thanks for starting this effort.  And thanks Elijah for helping.

And in case nobody has said this yet, welcome to the Git development
community.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18  9:23 ` Christian Brabandt
@ 2025-07-18 16:26   ` Junio C Hamano
  2025-07-19  0:32     ` Elijah Newren
  0 siblings, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-07-18 16:26 UTC (permalink / raw)
  To: Christian Brabandt
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

Christian Brabandt <cb@256bit.org> writes:

> On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:
>
>> This series accelerates xdiff by 5-19%.
>> 
>> It also introduces Rust as a hard dependency.
>> 
>> …and it doesn’t yet pass a couple of the github workflows; hints from
>> Windows experts, and opinions on ambiguous primitives would be appreciated
>> (see below).
>> 
>> This is just the beginning of many patches that I have to convert portions
>> of, maybe eventually all of, xdiff to Rust. While working on that
>> conversion, I found several ways to clarify the code, along with some
>> optimizations.
>
> Just a quick heads-up: we (as in Vim/Neovim) have been using Git's xdiff
> library in Vim and Neovim.
>
> Is the plan to get rid of xdiff's C source completely and replace it with
> a Rust implementation?

As far as I know, there is no such plan that is widely agreed upon
(yet).

The discussion starter thread you are looking at only introduces a
new code path that uses a different line hash function written in
Rust when whitespace munging search is not enabled; everything
else is still written in C, and it is just a discussion starter so
far.

I would personally have liked the effort to start with xmerge code,
not xdiff machinery, for various reasons, but that may be just me
;-)

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
  2025-07-17 23:29   ` Taylor Blau
@ 2025-07-18 19:00   ` Junio C Hamano
  2025-07-31 21:13     ` Ezekiel Newren
  2025-07-19 21:53   ` Johannes Schindelin
  2 siblings, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-07-18 19:00 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +extern u64 xxh3_64(u8 const* ptr, usize size);
> +
> +
>  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
>  			   xdlclassifier_t *cf, xdfile_t *xdf) {
>  	unsigned long *ha;
> @@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>  
>  	xdl_parse_lines(mf, narec, xdf);
>  
> +	if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
> +		for (usize i = 0; i < (usize) xdf->nrec; i++) {
> +			xrecord_t *rec = xdf->recs[i];
> +			rec->ha = xxh3_64(rec->ptr, rec->size);
> +		}
> +	} else {
> +		for (usize i = 0; i < (usize) xdf->nrec; i++) {
> +			xrecord_t *rec = xdf->recs[i];
> +			char const* dump = (char const*) rec->ptr;
> +			rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
> +		}
> +	}

As a technology demonstration and proof of concept patch, this is
very nice, but to be upstreamed for real, we'd want a variant of
xxhash that can work with the contents with whitespace squashed to
be usable with various whitespace ignoring modes of operation.  When
that happens, and when the result turns out to be more performant,
we can lose the xdl_hash_record() and require only the xxhash, which
would be great.

And that variant of xxhash that understands whitespace squashing can
of course be written in Rust as a part of this effort when the
series loses its RFC status.  At the same time, those who want to
use our xdiff code in third-party software (like libgit2 and vim)
may want to reimplement it in C in their copy.
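
A minimal sketch of what such a whitespace-aware variant could look
like, assuming only the "ignore all whitespace" mode and using the
standard library's DefaultHasher as a stand-in for xxhash. The function
name is hypothetical and not part of the series:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hash a line as if every ASCII whitespace byte had been removed.
///
/// Assumption: DefaultHasher is only a placeholder here; a real variant
/// would hand the squashed bytes to xxhash instead.
pub fn hash_ignoring_all_whitespace(line: &[u8]) -> u64 {
    // Build the squashed copy first so the hash function still sees one
    // contiguous block rather than individual bytes.
    let squashed: Vec<u8> = line
        .iter()
        .copied()
        .filter(|b| !b.is_ascii_whitespace())
        .collect();

    let mut hasher = DefaultHasher::new();
    hasher.write(&squashed);
    hasher.finish()
}
```

Two lines that differ only in whitespace then hash identically, which
is what the classifier needs before it falls back to a full compare.
The extra copy is the obvious cost, and it would have to be measured
against xdl_hash_record() before drawing conclusions.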

Thanks.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 13:34 ` Phillip Wood
@ 2025-07-18 21:25   ` Eli Schwartz
  2025-07-19  0:48     ` Haelwenn (lanodan) Monnier
  2025-07-22 14:24     ` Patrick Steinhardt
  0 siblings, 2 replies; 198+ messages in thread
From: Eli Schwartz @ 2025-07-18 21:25 UTC (permalink / raw)
  To: Phillip Wood, Ezekiel Newren via GitGitGadget, git
  Cc: Elijah Newren, Ezekiel Newren, Edward Thomson, brian m. carlson,
	Taylor Blau


[-- Attachment #1.1: Type: text/plain, Size: 1768 bytes --]

On 7/18/25 9:34 AM, Phillip Wood wrote:
> Hi Ezekiel
> 
> Thanks for working on this
> 
> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
>
>> So...
>>
>> This obviously raises the question of whether we are ready to accept a
>> hard
>> dependency on Rust. Previous discussions on the mailing list and at Git
>> Merge 2024 have not answered that question. If not now, will we be
>> willing
>> to accept such a hard dependency later? And what route do we want to
>> take to
>> get there?
> 
> As far as git goes I think introducing a hard dependency on rust is
> fine. It is widely supported, the only issue I'm aware of is the lack of
> support on NonStop and I don't think it is reasonable for such a
> minority platform to hold the rest of the project to ransom. There is a
> question about the other users of the xdiff code though. libgit2 carries
> a copy as do other projects like neovim. I've cc'd the libgit2
> maintainer and posted a link to this thread in neovim github [1]


A hard dependency on rust for Gentoo amd64 would potentially require
building https://github.com/thepowersgang/mrustc followed by building 13
and counting versions of rustc in order to get to the latest version.
What is the minimum supported version in this series, by the way?

bin packages for rust do exist but not everyone wants to use non-distro
provided binaries, sometimes for auditability reasons.


For Gentoo HPPA, Alpha, m68k it will simply mean the removal (or end of
life and staying forever on 2.50, perhaps) of Git. There is no rust
compiler there.

Even s390 support for rust is limited to a precompiled version not
everyone is willing to use.

GCC-rs will probably fix this general issue.

-- 
Eli Schwartz

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 14:38 ` Junio C Hamano
@ 2025-07-18 21:56   ` Ezekiel Newren
  2025-07-21 10:14   ` Phillip Wood
  1 sibling, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-18 21:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren

On Fri, Jul 18, 2025 at 8:38 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > This series accelerates xdiff by 5-19%.
>
> ;-)
>
> Do we know how much of that can be attributed to the hash algorithm
> difference, and how much for languages?

This is difficult to answer because xdl_hash_record() hashes the
string as it determines its length. xxhash uses SIMD instructions, so
all data must be contiguous and processed as blocks rather than byte
by byte. The two approaches process data too differently for their
components to be compared directly.
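
The structural difference can be sketched as follows. This is an
illustration only, not the code from the series: the function names are
hypothetical, the fused loop mimics a DJB2a-style hash, and hash_block()
merely stands in for any hash that wants a whole slice up front:

```rust
/// Fused approach: hash byte by byte while looking for the newline.
/// Returns the hash and the number of bytes consumed (newline included).
pub fn scan_and_hash(data: &[u8]) -> (u64, usize) {
    let mut ha: u64 = 5381;
    let mut i = 0;
    while i < data.len() && data[i] != b'\n' {
        // DJB2a step: ha = ha * 33 ^ byte
        ha = ha.wrapping_mul(33) ^ u64::from(data[i]);
        i += 1;
    }
    let consumed = if i < data.len() { i + 1 } else { i };
    (ha, consumed)
}

/// Split approach, step 1: only locate the end of the line.
/// Returns the line without its newline and the bytes consumed.
pub fn find_line(data: &[u8]) -> (&[u8], usize) {
    match data.iter().position(|&b| b == b'\n') {
        Some(n) => (&data[..n], n + 1),
        None => (data, data.len()),
    }
}

/// Split approach, step 2: hash a complete, contiguous line.
/// Any block-oriented hash could be dropped in here.
pub fn hash_block(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |ha, &b| ha.wrapping_mul(33) ^ u64::from(b))
}
```

In the fused form the parsing cost and the hashing cost share one loop,
so a profiler cannot attribute time to either. Once they are split,
the parsing step and the hash can be swapped and timed independently.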

> The earlier parts of the series to trim unused code and refactor
> look to me that they are good changes regardless of whether we
> introduce a different hash algorithm, and/or we use an
> implementation of that different hash algorithm written in Rust.
> IOW, even if neither of these two happens, I would think that the
> earlier parts are independently good pieces.
>
> Thanks for starting this effort.  And thanks Elijah for helping.
>
> And in case nobody has said this yet, welcome to the Git development
> community.

Thanks to you and everyone else for your review comments. I'm going to
need time to investigate and respond.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 7/7] github_workflows: install rust
  2025-07-17 21:23   ` brian m. carlson
@ 2025-07-18 23:01     ` Ezekiel Newren
  2025-07-25 23:56       ` Ben Knoble
  0 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-18 23:01 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 3:23 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-07-17 at 20:32:24, Ezekiel Newren via GitGitGadget wrote:
> > diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> > index 7dbf9f7f123c..8aac18a6ba45 100644
> > --- a/.github/workflows/main.yml
> > +++ b/.github/workflows/main.yml
> > @@ -4,6 +4,7 @@ on: [push, pull_request]
> >
> >  env:
> >    DEVELOPER: 1
> > +  RUST_VERSION: 1.87.0
>
> Our discussed plan is to support the version in Debian stable, plus a
> year.  So we'd be supporting 1.63.0 for a year after trixie's release.
>
> The reason for that is that people build backports and security updates
> for Git for stable releases of distros and they will use the distro
> toolchain for doing so.  Forcing distros to constantly build with the
> latest toolchain is pretty hostile, especially since the lifespan of
> Rust release is six weeks.
>
> If the Rust project provides LTS releases in the future, then we can
> consider adopting those.

The RUST_VERSION variable in .github/workflows/main.yml had to have a
specific version. 1.87.0 was selected since that's what I was using
locally. Elijah made me aware that an older version of rust might be
desired, but didn't know which one. I'll switch to 1.63.0 or whatever
the community decides.

> > +if [ "$rust_target" = "release" ]; then
> > +  rust_args="--release"
> > +  export RUSTFLAGS='-Aunused_imports -Adead_code'
> > +elif [ "$rust_target" = "debug" ]; then
> > +  rust_args=""
> > +  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
>
> Can you say a little about why these options are needed and the defaults
> are inadequate?  For instance, I build with the default options both in
> my personal projects and at work and don't see a problem.

What I found is that if I have a Rust function

#[no_mangle]
pub fn call_from_c(arg: u64) {}

which is only meant to be called from C and isn’t called from
elsewhere in Rust, then cargo will misidentify this function as dead
code.  This was the reason for adding ‘-Adead_code’.

The reason for adding ‘-Aunused_imports’ is somewhat IDE related; if I
paste code somewhere, RustRover will sometimes automatically add the
necessary imports.  However, if I delete a chunk of code, it’ll
highlight the imports that are no longer used if I scroll to the top
of the file, but it won’t automatically remove them.  Since they
aren’t automatically removed, it’s easier to build with
‘-Aunused_imports’.

The remaining arguments, ‘-C debuginfo=2 -C opt-level=1 -C
force-frame-pointers=yes’, are there to make /usr/bin/perf output more
amenable to analysis.

> I don't know if you plan to do this in a future series, but we'd also
> want cargo's tests to be run as part of CI and we'd want a lint job that
> ran clippy with both 1.63.0 and the latest stable version of Rust to
> make sure things were tidy.

Yeah I'd like that too; we can add that in a future patch series.

> --
> brian m. carlson (they/them)
> Toronto, Ontario, CA

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 21:30   ` brian m. carlson
  2025-07-17 21:54     ` Junio C Hamano
  2025-07-17 22:39     ` Taylor Blau
@ 2025-07-18 23:15     ` Ezekiel Newren
  2025-07-23 21:57       ` brian m. carlson
  2025-07-22 22:02     ` Mike Hommey
  3 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-18 23:15 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 3:30 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-07-17 at 20:32:18, Ezekiel Newren via GitGitGadget wrote:
> > diff --git a/rust/Cargo.lock b/rust/Cargo.lock
> > new file mode 100644
> > index 000000000000..fb1eac690b39
> > --- /dev/null
> > +++ b/rust/Cargo.lock
> > @@ -0,0 +1,14 @@
> > +# This file is automatically @generated by Cargo.
> > +# It is not intended for manual editing.
> > +version = 4
> > +
> > +[[package]]
> > +name = "interop"
> > +version = "0.1.0"
> > +
> > +[[package]]
> > +name = "xdiff"
> > +version = "0.1.0"
> > +dependencies = [
> > + "interop",
> > +]
>
> I would prefer that we not check in Cargo.lock in Git.  Part of the
> reason is that it changes across versions and so building with a
> different version of the toolchain can update the file.

This goes against what I think are best practices.  Don’t we need
Cargo.lock to audit and debug platform specific issues, and to ensure
reproducibility?  Without Cargo.lock, we might get different results
one minute to the next if one of our dependencies releases a new
version. Checking in Cargo.lock aligns with Cargo’s documented best
practices (https://doc.rust-lang.org/cargo/faq.html#why-have-cargolock-in-version-control).


> In addition, as I mentioned downthread, because our intention is to
> support the Debian stable toolchain for a year after the new stable
> release, unless we are exceptionally careful about dependencies, we may
> end up with a case where distros need to use older dependencies patched
> for security but other users may want to update the versions to newer
> dependencies with security fixes but that do not work on our pinned Rust
> version.  We can't possibly satisfy both sets of people if we pin
> dependencies in Cargo.lock, so we probably want to avoid checking it in
> and ignore it instead.

I understand your concern and I agree that this could become a
problem. I’m totally flexible on which rust version should be used,
but without Cargo.lock checked in we lose the ability to audit why a
build failed. I think that this will be a pain point, but numbing that
pain means we can’t solve intermittent problems due to dependencies in
the future.

> --
> brian m. carlson (they/them)
> Toronto, Ontario, CA

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 16:26   ` Junio C Hamano
@ 2025-07-19  0:32     ` Elijah Newren
  0 siblings, 0 replies; 198+ messages in thread
From: Elijah Newren @ 2025-07-19  0:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Brabandt, Ezekiel Newren via GitGitGadget, git,
	Ezekiel Newren

On Fri, Jul 18, 2025 at 9:26 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Brabandt <cb@256bit.org> writes:
>
[...]
> > Just a quick heads-up: We (as in Vim/Neovim) have been using Git's xdiff
> > library for use in Vim and Neovim.
> >
> > Is the plan to get rid of xdiff's C source completely and replace it by a
> > Rust implementation?
>
> As far as I know, there is no such plan that is widely agreed upon
> (yet).

Yeah, Ezekiel just barely notified the community of his efforts
yesterday with this patch series.  :-)

> The discussion starter thread you are looking at only introduces a
> new code path that uses a different line hash function written in
> Rust when whitespace munging search is not enabled, and everything
> else is still written in C, but it is just a discussion
> starter so far.
>
> I would personally have liked the effort to start with xmerge code,
> not xdiff machinery, for various reasons, but that may be just me
> ;-)

We have both xdiff/xmerge.[ch] and xdiff/{xdiff.h,xdiffi.[ch]}.  When
you say "xdiff", I suspect that you're referring to the files within
the directory rather than to the whole directory, yes?

I actually pointed Ezekiel at xhistogram to start (and thought he
might only do that file), then he backed up to xprepare, and then he
continued from there on to other bits of xdiff/, including xmerge.
Different parts are at different stages of conversion and testing.
He's not just transliterating but also trying to both clean up the
code and look for performance improvements (and it's sometimes hard to
do both; he's hit a few maintainability vs. performance tradeoffs and
those will likely result in some questions on the list at some point).
It's been a long slog, especially given how arcane xdiff sometimes
feels.  Anyway, along the way, he recognized the DJB2a hash --
something I certainly wouldn't have recognized or even thought to
investigate.  It led him to this optimization, which I thought was a
really good find, and it seemed like it'd make for a good initial
series to send to the list to get a feel for what people thought about
possibly Rustifying xdiff/.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 21:25   ` Eli Schwartz
@ 2025-07-19  0:48     ` Haelwenn (lanodan) Monnier
  2025-07-22 12:21       ` Patrick Steinhardt
  2025-07-22 14:24     ` Patrick Steinhardt
  1 sibling, 1 reply; 198+ messages in thread
From: Haelwenn (lanodan) Monnier @ 2025-07-19  0:48 UTC (permalink / raw)
  To: Eli Schwartz
  Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, Edward Thomson, brian m. carlson, Taylor Blau

[2025-07-18 17:25:01-0400] Eli Schwartz:
>On 7/18/25 9:34 AM, Phillip Wood wrote:
>> Hi Ezekiel
>>
>> Thanks for working on this
>>
>> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
>>
>>> So...
>>>
>>> This obviously raises the question of whether we are ready to accept a
>>> hard
>>> dependency on Rust. Previous discussions on the mailing list and at Git
>>> Merge 2024 have not answered that question. If not now, will we be
>>> willing
>>> to accept such a hard dependency later? And what route do we want to
>>> take to
>>> get there?
>>
>> As far as git goes I think introducing a hard dependency on rust is
>> fine. It is widely supported, the only issue I'm aware of is the lack of
>> support on NonStop and I don't think it is reasonable for such a
>> minority platform to hold the rest of the project to ransom. There is a
>> question about the other users of the xdiff code though. libgit2 carries
>> a copy as do other projects like neovim. I've cc'd the libgit2
>> maintainer and posted a link to this thread in neovim github [1]
>
>
>A hard dependency on rust for Gentoo amd64 would potentially require
>building https://github.com/thepowersgang/mrustc followed by building 13
>and counting versions of rustc in order to get to the latest version.
>What is the minimum supported version in this series, by the way?
>
>bin packages for rust do exist but not everyone wants to use non-distro
>provided binaries, sometimes for auditability reasons.
>
>
>For Gentoo HPPA, Alpha, m68k it will simply mean the removal (or end of
>life and staying forever on 2.50, perhaps) of Git. There is no rust
>compiler there.
>
>Even s390 support for rust is limited to a precompiled version not
>everyone is willing to use.

Also in other distro concerns, if it trickles down to libgit2,
extra care should be taken to avoid creating circular dependencies
due to cargo depending on libgit2 (via git2 crate).

For example with making sure it can reasonably be built via meson's
Rust support rather than through cargo.

>
>GCC-rs will probably fix this general issue.
>
>-- 
>Eli Schwartz

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (10 preceding siblings ...)
  2025-07-18 14:38 ` Junio C Hamano
@ 2025-07-19 21:53 ` Johannes Schindelin
  2025-07-20  8:45   ` Matthias Aßhauer
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
  12 siblings, 1 reply; 198+ messages in thread
From: Johannes Schindelin @ 2025-07-19 21:53 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget; +Cc: git, Elijah Newren, Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 7271 bytes --]

Hi Ezekiel,
pleasure to make your acquaintance!

On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:

> 1. Windows fails to build. I don’t know which rust toolchain is even
>    correct for this or if multiple are needed.  Example failed build:
>    https://github.com/git/git/actions/runs/16353209191

There are a couple of problems, not just one. Here are the patches that I
would like to ask you to take custody of (for your convenience, I have
pushed them to https://github.com/dscho/git as the `xdiff_rust_speedup`
branch). Please find them below. They _just_ fix the build, but the tests
with win+Meson still fail (and as "win+Meson test" jobs keep the logs of
the failed tests a well-guarded secret, due to time constraints I have to
stop looking into this for now).

Thank you for working on this,
Johannes

-- snipsnap --
From 72c50ee3f9df5ccfe48bf6f44b2c6bba05a680bf Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 19 Jul 2025 21:24:07 +0200
Subject: [PATCH 1/3] Do support Windows again after requiring Rust

By default, Rust wants to build MS Visual C-compatible libraries on
Windows, because that is _the_ native C compiler.

Git is historically lacking in its MSVC support, and the official Git
for Windows versions are built using GCC instead. As a consequence, a
(subset of a) GCC toolchain is installed as part of the `windows-build`
job of every CI build.

Naturally, this requires adjustments in how Rust is called, most
importantly it requires installing support for a GCC-compatible build
target.

Let's make the necessary adjustment both in the CI-specific code that
installs Rust as well as in the Windows-specific configuration in
`config.mak.uname`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/install-rust.sh | 3 +++
 config.mak.uname   | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/ci/install-rust.sh b/ci/install-rust.sh
index 141ceddb17cfe..c22baa629ceb7 100644
--- a/ci/install-rust.sh
+++ b/ci/install-rust.sh
@@ -28,6 +28,9 @@ if [ "$BITNESS" = "32" ]; then
   $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
 else
   $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
+  if [ "$CI_OS_NAME" = "windows" ]; then
+    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
+  fi
 fi
 
 . $CARGO_HOME/env
diff --git a/config.mak.uname b/config.mak.uname
index 3e26bb074a4b5..fbe7cebf40edd 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -727,19 +727,28 @@ ifeq ($(uname_S),MINGW)
 		prefix = /mingw32
 		HOST_CPU = i686
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
+		CARGO_BUILD_TARGET = i686-pc-windows-gnu
         endif
         ifeq (MINGW64,$(MSYSTEM))
 		prefix = /mingw64
 		HOST_CPU = x86_64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
         else ifeq (CLANGARM64,$(MSYSTEM))
 		prefix = /clangarm64
 		HOST_CPU = aarch64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		CARGO_BUILD_TARGET = aarch64-pc-windows-gnu
         else
 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
 		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
+
+	export CARGO_BUILD_TARGET
+	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
+	# Unfortunately now needed because of Rust
+	EXTLIBS += -luserenv
+
 	CC = gcc
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong
-- 
2.50.1.windows.1


From ef6e4394ae26d8f28cb0d9e456810ce0818e623b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 19 Jul 2025 23:08:11 +0200
Subject: [PATCH 2/3] win+Meson: allow for xdiff to be compiled with MSVC

The `build_rust.sh` script is quite opinionated about the naming scheme
of the C compiler: It assumes that the xdiff library file will be named
`libxdiff.a`.

However, MS Visual C generates `xdiff.lib` files instead; this naming
scheme has been in use for a very, very long time.

Let's allow for that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 build_rust.sh |  7 ++++++-
 meson.build   | 12 +++++++++---
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/build_rust.sh b/build_rust.sh
index 4c12135cd2050..694d48d857a58 100755
--- a/build_rust.sh
+++ b/build_rust.sh
@@ -44,7 +44,12 @@ fi
 
 cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
 
-libfile="lib${crate}.a"
+if grep x86_64-pc-windows-msvc rust/target/.rustc_info.json
+then
+  libfile="${crate}.lib"
+else
+  libfile="lib${crate}.a"
+fi
 dst=$dir_build/$libfile
 
 if [ "$dir_git_root" != "$dir_build" ]; then
diff --git a/meson.build b/meson.build
index 047d7e5b66306..5e89a5dd0e00f 100644
--- a/meson.build
+++ b/meson.build
@@ -277,8 +277,16 @@ else
   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
 endif
 
+compiler = meson.get_compiler('c')
+
+if compiler.get_id() == 'msvc'
+  xdiff_lib_filename = 'xdiff.lib'
+else
+  xdiff_lib_filename = 'libxdiff.a'
+endif
+
 rust_build_xdiff = custom_target('rust_build_xdiff',
-  output: 'libxdiff.a',
+  output: xdiff_lib_filename,
   build_by_default: true,
   build_always_stale: true,
   command: [
@@ -288,8 +296,6 @@ rust_build_xdiff = custom_target('rust_build_xdiff',
   install: false,
 )
 
-compiler = meson.get_compiler('c')
-
 libgit_sources = [
   'abspath.c',
   'add-interactive.c',
-- 
2.50.1.windows.1


From 9c3b017cfa069211027fbb1f6d3b97c8e7edda81 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 19 Jul 2025 23:22:57 +0200
Subject: [PATCH 3/3] win+Meson: do allow linking with the Rust-built xdiff

When linking against the Rust-built `xdiff`, there is now a new required
dependency: Without _also_ linking to the system library `userenv`, the
compile would fail with this error message:

  xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) :
  error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW
  referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E

Therefore, just like we do in case of Makefile-based builds on Windows,
we now also link to that library when building with Meson.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/meson.build b/meson.build
index 5e89a5dd0e00f..af015f04763fd 100644
--- a/meson.build
+++ b/meson.build
@@ -1260,6 +1260,7 @@ elif host_machine.system() == 'windows'
   ]
 
   libgit_dependencies += compiler.find_library('ntdll')
+  libgit_dependencies += compiler.find_library('userenv')
   libgit_include_directories += 'compat/win32'
   if compiler.get_id() == 'msvc'
     libgit_include_directories += 'compat/vcbuild/include'
-- 
2.50.1.windows.1
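
The naming rule that patch 2/3 teaches the build scripts can be
summarized in a few lines. This is a sketch with a hypothetical helper
name, assuming the decision is made from the Rust target triple rather
than by grepping `.rustc_info.json`:

```rust
/// Name of the static library that a Rust `staticlib` crate produces.
///
/// MSVC targets emit `<crate>.lib`; GNU-style targets (including
/// `*-windows-gnu`) emit `lib<crate>.a`.
pub fn static_lib_filename(crate_name: &str, target: &str) -> String {
    if target.ends_with("-msvc") {
        format!("{}.lib", crate_name)
    } else {
        format!("lib{}.a", crate_name)
    }
}
```

For example, `static_lib_filename("xdiff", "x86_64-pc-windows-msvc")`
yields `xdiff.lib`, while the `x86_64-pc-windows-gnu` target used for
the GCC-based builds yields `libxdiff.a`.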

^ permalink raw reply related	[flat|nested] 198+ messages in thread

* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
  2025-07-17 23:29   ` Taylor Blau
  2025-07-18 19:00   ` Junio C Hamano
@ 2025-07-19 21:53   ` Johannes Schindelin
  2025-07-20 10:14     ` Phillip Wood
  2 siblings, 1 reply; 198+ messages in thread
From: Johannes Schindelin @ 2025-07-19 21:53 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, Ezekiel Newren, Ezekiel Newren

Hi Ezekiel,

On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:

> diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> index e69de29bb2d1..96975975a1ba 100644
> --- a/rust/xdiff/src/lib.rs
> +++ b/rust/xdiff/src/lib.rs
> @@ -0,0 +1,7 @@
> +
> +
> +#[no_mangle]
> +unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
> +    let slice = std::slice::from_raw_parts(ptr, size);
> +    xxhash_rust::xxh3::xxh3_64(slice)
> +}
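
A side note on the wrapper quoted above: `std::slice::from_raw_parts`
requires a non-null pointer even when the length is zero, so a C caller
that passes NULL for an empty record would be undefined behavior. A
minimal sketch of a guard, with a hypothetical helper name that is not
part of the patch:

```rust
/// View a C `(ptr, size)` pair as a byte slice.
///
/// # Safety
/// If `ptr` is non-null and `size` is non-zero, `ptr` must point to
/// `size` readable bytes that stay valid and unmodified for `'a`.
pub unsafe fn bytes_from_c<'a>(ptr: *const u8, size: usize) -> &'a [u8] {
    if ptr.is_null() || size == 0 {
        // Never hand a null pointer to from_raw_parts, even for length 0.
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(ptr, size) }
    }
}
```

Whether xdiff can actually produce a NULL `rec->ptr` is an open
question; the guard merely makes the FFI boundary robust either way.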

I know that this is a pretty small file, but I do notice that it does not
have a license header.

This reminds me of an unfortunate oversight: nobody took care to make
(and keep) libgit.a's source files compatible with libgit2's license,
which would have nurtured a fruitful exchange between those two projects.

With Rust, we still have a really good chance to learn from history and
avoid that mistake: Gitoxide is a very exciting project with clear overlap
in its mission to implement Git functionality in Rust. Gitoxide is
dual-licensed under the Apache License v2 and the MIT license (see
https://github.com/GitoxideLabs/gitoxide?tab=readme-ov-file#license).

Would you mind adding a license header to that file that explicitly allows
the contents of the file to be used in Gitoxide, to get the Rust effort
started on a good foot?

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 7/7] github_workflows: install rust
  2025-07-17 20:32 ` [PATCH 7/7] github_workflows: install rust Ezekiel Newren via GitGitGadget
  2025-07-17 21:23   ` brian m. carlson
@ 2025-07-19 21:54   ` Johannes Schindelin
  1 sibling, 0 replies; 198+ messages in thread
From: Johannes Schindelin @ 2025-07-19 21:54 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, Ezekiel Newren, Ezekiel Newren

Hi Ezekiel,

On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:

> +if [ "$dir_git_root" != "$dir_build" ]; then
> +  src=$dir_rust/target/$rust_target/$libfile
> +  if [ ! -f $src ]; then
> +    echo >&2 "::error:: cannot find path of static library"
> +    exit 5
> +  fi

As I found out the hard way, this error message could be more helpful if
it specified a couple of those variables that play into the failure (or
all of them).

Would you mind changing the error message accordingly?

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
  2025-07-17 22:46   ` Taylor Blau
  2025-07-18 13:35   ` Phillip Wood
@ 2025-07-20  1:39   ` Johannes Schindelin
  2 siblings, 0 replies; 198+ messages in thread
From: Johannes Schindelin @ 2025-07-20  1:39 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, Ezekiel Newren, Ezekiel Newren

Hi Ezekiel,

On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:

> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 8b8467360ecf..6e5f67ebf380 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -39,9 +39,9 @@ typedef struct s_chastore {
>  } chastore_t;
>  
>  typedef struct s_xrecord {
> -	char const *ptr;
> -	long size;
> -	unsigned long ha;
> +	u8 const* ptr;
> +	usize size;
> +	u64 ha;
>  } xrecord_t;
>  
>  typedef struct s_xdfile {

You cannot do this on its own, you'll also have to do the following (which
incidentally fixes the linux32 failures as well as the win test and
win+Meson test failures, see
https://github.com/dscho/git/actions/runs/16394351471):

-- snipsnap --
From 8693c83858a7c9308e54fb470cd7e82bcf67c758 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 20 Jul 2025 02:34:35 +0200
Subject: [PATCH] fixup! xdiff: make fields of xrecord_t Rust friendly

To make `xdl_classify_record()` work, the `ha` attributes of `xrecord_t`
and of `s_xdlclass` _must_ have the same range. Otherwise the function
won't be able to recognize previously-classified records correctly when
the `ha` recorded in the `xrecord_t` is so wide that it won't fit into
the `s_xdlclass`' attribute and therefore they won't match when they
need to match.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 xdiff/xprepare.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 5a2e52f102cf7..c0463bacd94b0 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,7 +32,7 @@
 
 typedef struct s_xdlclass {
 	struct s_xdlclass *next;
-	unsigned long ha;
+	u64 ha;
 	char const *line;
 	long size;
 	long idx;
-- 
2.50.1.windows.1
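
The width mismatch that the fixup addresses can be sketched as follows,
assuming a platform where `unsigned long` is 32 bits wide (as on
linux32 and on Windows). The function names are hypothetical and only
model the comparison:

```rust
/// What a 32-bit `unsigned long` field retains of a 64-bit hash.
pub fn stored_in_32bit_ulong(ha: u64) -> u32 {
    ha as u32 // the upper 32 bits are silently dropped
}

/// Models comparing the narrow class entry against the 64-bit record
/// hash: C zero-extends the narrow value before comparing.
pub fn matches_with_narrow_class(record_ha: u64, class_ha: u32) -> bool {
    u64::from(class_ha) == record_ha
}
```

A hash such as 0x123456789abcdef0 is stored as 0x9abcdef0 and never
compares equal to the record it came from, so every lookup of an
already-classified line misses. Hashes that happen to fit into 32 bits
still match, which is why a narrow hash did not expose the problem.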


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-19 21:53 ` Johannes Schindelin
@ 2025-07-20  8:45   ` Matthias Aßhauer
  0 siblings, 0 replies; 198+ messages in thread
From: Matthias Aßhauer @ 2025-07-20  8:45 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

[-- Attachment #1: Type: text/plain, Size: 8148 bytes --]



On Sat, 19 Jul 2025, Johannes Schindelin wrote:

> Hi Ezekiel,
> pleasure to make your acquaintance!
>
> On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:
>
>> 1. Windows fails to build. I don’t know which rust toolchain is even
>>    correct for this or if multiple are needed.  Example failed build:
>>    https://github.com/git/git/actions/runs/16353209191
>
> There are a couple of problems, not just one. Here are the patches that I
> would like to ask you to take custody of (for your convenience, I have
> pushed them to https://github.com/dscho/git as the `xdiff_rust_speedup`
> branch). Please find them below. They _just_ fix the build, but the tests
> with win+Meson still fail (and as "win+Meson test" jobs keep the logs of
> the failed tests a well-guarded secret, due to time constraints I have to
> stop looking into this for now).
>
> Thank you for working on this,
> Johannes
>
> -- snipsnap --
> From 72c50ee3f9df5ccfe48bf6f44b2c6bba05a680bf Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Sat, 19 Jul 2025 21:24:07 +0200
> Subject: [PATCH 1/3] Do support Windows again after requiring Rust
>
> By default, Rust wants to build MS Visual C-compatible libraries on
> Windows, because that is _the_ native C compiler.
>
> Git is historically lacking in its MSVC support, and the official Git
> for Windows versions are built using GCC instead. As a consequence, a
> (subset of a) GCC toolchain is installed as part of the `windows-build`
> job of every CI build.
>
> Naturally, this requires adjustments in how Rust is called, most
> importantly it requires installing support for a GCC-compatible build
> target.
>
> Let's make the necessary adjustment both in the CI-specific code that
> installs Rust as well as in the Windows-specific configuration in
> `config.mak.uname`.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> ci/install-rust.sh | 3 +++
> config.mak.uname   | 9 +++++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/ci/install-rust.sh b/ci/install-rust.sh
> index 141ceddb17cfe..c22baa629ceb7 100644
> --- a/ci/install-rust.sh
> +++ b/ci/install-rust.sh
> @@ -28,6 +28,9 @@ if [ "$BITNESS" = "32" ]; then
>   $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
> else
>   $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
> +  if [ "$CI_OS_NAME" = "windows" ]; then
> +    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
> +  fi
> fi
>
> . $CARGO_HOME/env
> diff --git a/config.mak.uname b/config.mak.uname
> index 3e26bb074a4b5..fbe7cebf40edd 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -727,19 +727,28 @@ ifeq ($(uname_S),MINGW)
> 		prefix = /mingw32
> 		HOST_CPU = i686
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
> +		CARGO_BUILD_TARGET = i686-pc-windows-gnu

While i686-pc-windows-gnu is fine for CI, it would mean we'd have to bump 
our supported Windows version up to Windows 10. If we want to keep 
supporting Windows 8.1, we'll need i686-win7-windows-gnu, at least on rust 
1.78 and newer.[1][2] We'd probably build Windows versions on rust 1.88 
currently.[3]

[1] https://blog.rust-lang.org/2024/02/26/Windows-7/
[2] https://doc.rust-lang.org/rustc/platform-support/win7-windows-gnu.html
[3] https://packages.msys2.org/base/mingw-w64-rust

>         endif
>         ifeq (MINGW64,$(MSYSTEM))
> 		prefix = /mingw64
> 		HOST_CPU = x86_64
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> +		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu

For x86_64 we'll probably also want x86_64-win7-windows-gnu if we want to
keep Windows 8.1 support.

>         else ifeq (CLANGARM64,$(MSYSTEM))
> 		prefix = /clangarm64
> 		HOST_CPU = aarch64
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> +		CARGO_BUILD_TARGET = aarch64-pc-windows-gnu

Are we sure this target currently exists? It's at least undocumented.[4] I 
think we might want aarch64-pc-windows-gnullvm for CLANGARM64, either 
way.[5]

[4] https://doc.rust-lang.org/rustc/platform-support/windows-gnu.html
[5] https://doc.rust-lang.org/rustc/platform-support/windows-gnullvm.html

Best regards

Matthias

>         else
> 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
> 		BASIC_LDFLAGS += -Wl,--large-address-aware
>         endif
> +
> +	export CARGO_BUILD_TARGET
> +	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
> +	# Unfortunately now needed because of Rust
> +	EXTLIBS += -luserenv
> +
> 	CC = gcc
> 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
> 		-fstack-protector-strong
> -- 
> 2.50.1.windows.1
>
>
> From ef6e4394ae26d8f28cb0d9e456810ce0818e623b Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Sat, 19 Jul 2025 23:08:11 +0200
> Subject: [PATCH 2/3] win+Meson: allow for xdiff to be compiled with MSVC
>
> The `build_rust.sh` script is quite opinionated about the naming scheme
> of the C compiler: It assumes that the xdiff library file will be named
> `libxdiff.a`.
>
> However, MS Visual C generates `xdiff.lib` files instead; this naming
> scheme has been in use for a very, very long time.
>
> Let's allow for that.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> build_rust.sh |  7 ++++++-
> meson.build   | 12 +++++++++---
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/build_rust.sh b/build_rust.sh
> index 4c12135cd2050..694d48d857a58 100755
> --- a/build_rust.sh
> +++ b/build_rust.sh
> @@ -44,7 +44,12 @@ fi
>
> cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
>
> -libfile="lib${crate}.a"
> +if grep x86_64-pc-windows-msvc rust/target/.rustc_info.json
> +then
> +  libfile="${crate}.lib"
> +else
> +  libfile="lib${crate}.a"
> +fi
> dst=$dir_build/$libfile
>
> if [ "$dir_git_root" != "$dir_build" ]; then
> diff --git a/meson.build b/meson.build
> index 047d7e5b66306..5e89a5dd0e00f 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -277,8 +277,16 @@ else
>   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
> endif
>
> +compiler = meson.get_compiler('c')
> +
> +if compiler.get_id() == 'msvc'
> +  xdiff_lib_filename = 'xdiff.lib'
> +else
> +  xdiff_lib_filename = 'libxdiff.a'
> +endif
> +
> rust_build_xdiff = custom_target('rust_build_xdiff',
> -  output: 'libxdiff.a',
> +  output: xdiff_lib_filename,
>   build_by_default: true,
>   build_always_stale: true,
>   command: [
> @@ -288,8 +296,6 @@ rust_build_xdiff = custom_target('rust_build_xdiff',
>   install: false,
> )
>
> -compiler = meson.get_compiler('c')
> -
> libgit_sources = [
>   'abspath.c',
>   'add-interactive.c',
> -- 
> 2.50.1.windows.1
>
>
> From 9c3b017cfa069211027fbb1f6d3b97c8e7edda81 Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Sat, 19 Jul 2025 23:22:57 +0200
> Subject: [PATCH 3/3] win+Meson: do allow linking with the Rust-built xdiff
>
> When linking against the Rust-built `xdiff`, there is now a new required
> dependency: Without _also_ linking to the system library `userenv`, the
> compile would fail with this error message:
>
>  xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) :
>  error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW
>  referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E
>
> Therefore, just like we do in case of Makefile-based builds on Windows,
> we now also link to that library when building with Meson.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> meson.build | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/meson.build b/meson.build
> index 5e89a5dd0e00f..af015f04763fd 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1260,6 +1260,7 @@ elif host_machine.system() == 'windows'
>   ]
>
>   libgit_dependencies += compiler.find_library('ntdll')
> +  libgit_dependencies += compiler.find_library('userenv')
>   libgit_include_directories += 'compat/win32'
>   if compiler.get_id() == 'msvc'
>     libgit_include_directories += 'compat/vcbuild/include'
> -- 
> 2.50.1.windows.1
>


* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-19 21:53   ` Johannes Schindelin
@ 2025-07-20 10:14     ` Phillip Wood
  2025-09-23  9:57       ` gitoxide-compatible licensing of Git's Rust code, was " Johannes Schindelin
  0 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-07-20 10:14 UTC (permalink / raw)
  To: Johannes Schindelin, Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, Ezekiel Newren, brian m. carlson

Hi Johannes

On 19/07/2025 22:53, Johannes Schindelin wrote:
> Hi Ezekiel,
> 
> On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:
> 
>> diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
>> index e69de29bb2d1..96975975a1ba 100644
>> --- a/rust/xdiff/src/lib.rs
>> +++ b/rust/xdiff/src/lib.rs
>> @@ -0,0 +1,7 @@
>> +
>> +
>> +#[no_mangle]
>> +unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
>> +    let slice = std::slice::from_raw_parts(ptr, size);
>> +    xxhash_rust::xxh3::xxh3_64(slice)
>> +}
> 
> I know that this is a pretty small file, but I do notice that it does not
> have a license header.
> 
> This reminds me of the unfortunate oversight to be careful about making
> (and keeping) libgit.a's source files compatible with libgit2's license to
> nurture a fruitful exchange between those two projects.

I'm not sure I follow your reasoning here. libgit2 was started after git 
and chose to use an incompatible license. I wasn't around at the time 
but isn't there a list of git contributors who are happy to re-license 
their contributions with the linking exception used by libgit2?

> With Rust, we still have a really good chance to learn from history and
> avoid that mistake: Gitoxide is a very exciting project with clear overlap
> in its mission to implement Git functionality in Rust. Gitoxide is
> dual-licensed under the Apache License v2 and the MIT license (see
> https://github.com/GitoxideLabs/gitoxide?tab=readme-ov-file#license).
> 
> Would you mind adding a license header to that file that explicitly allows
> the contents of the file to be used in Gitoxide, to get the Rust effort
> started on a good foot?

I am wary of that for two reasons. Firstly, over time it is de facto
re-licensing git as the amount of rust code grows and the amount of C
code shrinks, which deserves a wider discussion. Secondly, it makes it
harder to convert our C code, which is licensed under GPL2 (or in the
case of xdiff LGPL), to rust if the rust code uses a different license.

If someone wants to start a discussion about re-licensing git (and is 
prepared to do all of the associated admin in the event that it happens) 
then by all means do so but I don't think we want to slip such a
change into this series.

Thanks

Phillip



* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 14:38 ` Junio C Hamano
  2025-07-18 21:56   ` Ezekiel Newren
@ 2025-07-21 10:14   ` Phillip Wood
  2025-07-21 18:33     ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-07-21 10:14 UTC (permalink / raw)
  To: Junio C Hamano, Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, Ezekiel Newren

On 18/07/2025 15:38, Junio C Hamano wrote:
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> This series accelerates xdiff by 5-19%.
> 
> ;-)
> 
> Do we know how much of that can be attributed to the hash algorithm
> difference, and how much for languages?

That's an interesting question. The two patches below [1] switch
xdiff to use xxhash from libxxhash. On my computer the rust and
C implementations both speed up "git log --oneline --shortstat"
by 15%. Just over half of that seems to come from hoisting the
check for whitespace flags in xdl_hash_record() out of the loop
in xdl_prepare_ctx() and the rest comes from the change in hash
function. As I understand it the hash is implemented using SIMD
compiler intrinsics and the rust implementation is basically a
copy of the C code in libxxhash. I wonder how well xxhash performs
compared to our existing hash on platforms without an optimized
implementation.
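
To make the "hoisting" point concrete, here is a minimal, self-contained
sketch. It is not code from xdiff or from the patches: DEMO_IGNORE_WS,
hash_verbatim(), hash_skip_ws() and hash_lines() are hypothetical names,
and the whitespace handling is deliberately simplified. It only shows
the shape of the idea: the flag is tested once, before the per-line
loop, and lines are located with memchr() before being hashed with the
existing DJB2a-style hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag, standing in for XDF_WHITESPACE_FLAGS. */
#define DEMO_IGNORE_WS 1L

typedef unsigned long (*line_hash_fn)(const char *line, size_t len);

/* DJB2a over the whole line: ha = (ha * 33) ^ byte, seeded with 5381. */
static unsigned long hash_verbatim(const char *line, size_t len)
{
	unsigned long ha = 5381;
	size_t i;

	for (i = 0; i < len; i++) {
		ha += (ha << 5);
		ha ^= (unsigned char)line[i];
	}
	return ha;
}

/* Same hash, but spaces and tabs do not contribute (simplified). */
static unsigned long hash_skip_ws(const char *line, size_t len)
{
	unsigned long ha = 5381;
	size_t i;

	for (i = 0; i < len; i++) {
		if (line[i] == ' ' || line[i] == '\t')
			continue;
		ha += (ha << 5);
		ha ^= (unsigned char)line[i];
	}
	return ha;
}

/*
 * Hash every line of buf into out[] (at most max entries) and return
 * the number of lines hashed. The flag test runs once, before the loop.
 */
static size_t hash_lines(const char *buf, size_t size, long flags,
			 unsigned long *out, size_t max)
{
	line_hash_fn fn = (flags & DEMO_IGNORE_WS) ? hash_skip_ws : hash_verbatim;
	const char *cur = buf, *top = buf + size;
	size_t n = 0;

	assert(buf != NULL && out != NULL);
	while (cur < top && n < max) {
		/* Step 1: find the end of the line. */
		const char *eol = memchr(cur, '\n', (size_t)(top - cur));
		size_t len = (size_t)((eol ? eol : top) - cur);

		/* Step 2: hash it with the function chosen up front. */
		out[n++] = fn(cur, len);
		cur += len + (eol ? 1 : 0);
	}
	return n;
}
```

The sketch uses a function pointer for brevity; the first patch instead
makes xdl_hash_record() a static inline wrapper so the compiler can do
the hoisting itself.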

Thanks

Phillip

[1] These patches are available in the xdiff-hashing-experiments
     branch at https://github.com/phillipwood/git

---- 8< ----
 From 06e7abdcfb9fc3f143ef84644966d6fce128d8ae Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Sat, 19 Jul 2025 10:58:48 +0100
Subject: [PATCH 1/2] xdiff: refactor xdl_hash_record()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Inline the check for whitespace flags so that the compiler can hoist
it out of the loop in xdl_prepare_ctx(). This improves the performance
by 8%.

$ hyperfine --warmup=1 -L rev HEAD,HEAD^  --setup='git checkout {rev} -- :/ && make git' ': {rev}; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
Benchmark 1: : HEAD; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.670 s ±  0.044 s    [User: 1.473 s, System: 0.196 s]
   Range (min … max):    1.619 s …  1.754 s    10 runs

Benchmark 2: : HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.801 s ±  0.021 s    [User: 1.605 s, System: 0.192 s]
   Range (min … max):    1.766 s …  1.831 s    10 runs

Summary
   ': HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0' ran
     1.08 ± 0.03 times faster than ': HEAD^^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
  xdiff/xutils.c |  7 ++-----
  xdiff/xutils.h | 10 +++++++++-
  2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87..e070ed649f 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -249,7 +249,7 @@ int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
  	return 1;
  }
  
-static unsigned long xdl_hash_record_with_whitespace(char const **data,
+unsigned long xdl_hash_record_with_whitespace(char const **data,
  		char const *top, long flags) {
  	unsigned long ha = 5381;
  	char const *ptr = *data;
@@ -294,13 +294,10 @@ static unsigned long xdl_hash_record_with_whitespace(char const **data,
  	return ha;
  }
  
-unsigned long xdl_hash_record(char const **data, char const *top, long flags) {
+unsigned long xdl_hash_record_verbatim(char const **data, char const *top) {
  	unsigned long ha = 5381;
  	char const *ptr = *data;
  
-	if (flags & XDF_WHITESPACE_FLAGS)
-		return xdl_hash_record_with_whitespace(data, top, flags);
-
  	for (; ptr < top && *ptr != '\n'; ptr++) {
  		ha += (ha << 5);
  		ha ^= (unsigned long) *ptr;
diff --git a/xdiff/xutils.h b/xdiff/xutils.h
index fd0bba94e8..13f6831047 100644
--- a/xdiff/xutils.h
+++ b/xdiff/xutils.h
@@ -34,7 +34,15 @@ void *xdl_cha_alloc(chastore_t *cha);
  long xdl_guess_lines(mmfile_t *mf, long sample);
  int xdl_blankline(const char *line, long size, long flags);
  int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
-unsigned long xdl_hash_record(char const **data, char const *top, long flags);
+unsigned long xdl_hash_record_verbatim(char const **data, char const *top);
+unsigned long xdl_hash_record_with_whitespace(char const **data, char const *top, long flags);
+static inline unsigned long xdl_hash_record(char const **data, char const *top, long flags)
+{
+	if (flags & XDF_WHITESPACE_FLAGS)
+		return xdl_hash_record_with_whitespace(data, top, flags);
+	else
+		return xdl_hash_record_verbatim(data, top);
+}
  unsigned int xdl_hashbits(unsigned int size);
  int xdl_num_out(char *out, long val);
  int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
-- 
2.49.0.897.gfad3eb7d21


 From 16f3b26624dc17002f3e507cd1e260deadfe1de8 Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Sat, 19 Jul 2025 14:52:48 +0100
Subject: [PATCH 2/2] xdiff: use xxhash
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Using XXH3_64bits() from libxxhash to hash the input lines improves
the performance by about 6% and equals the performance of using
xxhash-rust.

$ hyperfine --warmup=1 -L rev en/xdiff-rust/v1,HEAD,HEAD^,HEAD^^  --setup='git checkout {rev} -- :/ && make git' ': {rev}; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
Benchmark 1: : en/xdiff-rust/v1; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.575 s ±  0.032 s    [User: 1.406 s, System: 0.168 s]
   Range (min … max):    1.541 s …  1.651 s    10 runs

Benchmark 2: : HEAD; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.569 s ±  0.018 s    [User: 1.382 s, System: 0.185 s]
   Range (min … max):    1.546 s …  1.596 s    10 runs

Benchmark 3: : HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.661 s ±  0.026 s    [User: 1.475 s, System: 0.186 s]
   Range (min … max):    1.630 s …  1.696 s    10 runs

Benchmark 4: : HEAD^^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
   Time (mean ± σ):      1.800 s ±  0.023 s    [User: 1.611 s, System: 0.187 s]
   Range (min … max):    1.772 s …  1.837 s    10 runs

Summary
   ': HEAD; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0' ran
     1.00 ± 0.02 times faster than ': en/xdiff-rust/v1; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
     1.06 ± 0.02 times faster than ': HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
     1.15 ± 0.02 times faster than ': HEAD^^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
  Makefile       |  1 +
  xdiff/xutils.c | 14 ++++++--------
  2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index 5f7dd79dfa..6de7ccdf3b 100644
--- a/Makefile
+++ b/Makefile
@@ -1390,6 +1390,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
  # xdiff and reftable libs may in turn depend on what is in libgit.a
  GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
  EXTLIBS =
+EXTLIBS += -lxxhash
  
  GIT_USER_AGENT = git/$(GIT_VERSION)
  
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index e070ed649f..43fce4b5b1 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -21,7 +21,7 @@
   */
  
  #include "xinclude.h"
-
+#include <xxhash.h>
  
  long xdl_bogosqrt(long n) {
  	long i;
@@ -295,14 +295,12 @@ unsigned long xdl_hash_record_with_whitespace(char const **data,
  }
  
  unsigned long xdl_hash_record_verbatim(char const **data, char const *top) {
-	unsigned long ha = 5381;
-	char const *ptr = *data;
+	long ha;
+	char const *eol = memchr(*data, '\n', top - *data);
+	size_t len = (eol ? eol : top) - *data;
  
-	for (; ptr < top && *ptr != '\n'; ptr++) {
-		ha += (ha << 5);
-		ha ^= (unsigned long) *ptr;
-	}
-	*data = ptr < top ? ptr + 1: ptr;
+	ha = XXH3_64bits(*data, len);
+	*data += len + !!eol;
  
  	return ha;
  }
-- 
2.49.0.897.gfad3eb7d21




* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-21 10:14   ` Phillip Wood
@ 2025-07-21 18:33     ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-21 18:33 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

Phillip Wood <phillip.wood123@gmail.com> writes:

> ... by 15%. Just over half of that seems to come from hoisting the
> check for whitespace flags in xdl_hash_record() out of the loop
> in xdl_prepare_ctx() and the rest comes from the change in hash
> function.

The first half of that alone is interesting enough ;-).

> As I understand it the hash is implemented using SIMD
> compiler intrinsics and the rust implementation is basically a
> copy of the C code in libxxhash. I wonder how well xxhash performs
> compared to our existing hash on platforms without an optimized
> implementation.

Yeah, that indeed is worth investigating.

Thanks.



* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18  0:29     ` brian m. carlson
@ 2025-07-22 12:21       ` Patrick Steinhardt
  2025-07-22 15:56         ` Junio C Hamano
  0 siblings, 1 reply; 198+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 12:21 UTC (permalink / raw)
  To: brian m. carlson, Taylor Blau, Ezekiel Newren via GitGitGadget,
	git, Elijah Newren, Ezekiel Newren

On Fri, Jul 18, 2025 at 12:29:16AM +0000, brian m. carlson wrote:
> On 2025-07-17 at 22:25:23, Taylor Blau wrote:
> > I agree. I don't think that there is ever going to be a "perfect" time
> > to introduce a hard dependency on Rust, and I don't think that should
> > hold the project back from adopting it.
> > 
> > I am far from a Rust expert, but I think that a more modern, memory-safe
> > language will attract newer contributors who may have a fresher
> > perspective on the project, and I think that's a good thing.
> 
> Yes, I think that's true.  Rust is by far the most admired programming
> language to work with, according to the 2024 Stack Overflow Developer
> Survey.  We will likely attract new contributors who find C intimidating
> or a bit of a hassle[0] but are excited about working on Rust,
> especially in a project as compelling as Git[1].

I am also aligned with allowing Rust into Git. I think the ecosystem has
kind of settled on Rust as the next system-level programming language,
and it does have good interop with C.

I think with the ongoing efforts to reduce our reliance on global state
we should eventually be able to encapsulate more and more of our
subsystems. And once they are neatly encapsulated we would be able to
swap out their respective implementation and plug in a Rust replacement.

Good candidates are for example the reftable library, as I've already
proposed in the past.
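
As a rough illustration of what such encapsulation could look like at
the C boundary (a sketch with hypothetical names, not code from Git or
reftable): callers see only an opaque type and a handful of functions,
so whatever sits behind those symbols can be replaced without touching
the call sites.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Public surface: an opaque type and four functions. Callers never see
 * the fields, so the code behind these symbols could be C or Rust.
 */
struct demo_store;
struct demo_store *demo_store_new(void);
void demo_store_add(struct demo_store *s, long value);
long demo_store_sum(const struct demo_store *s);
void demo_store_free(struct demo_store *s);

/* C implementation of the interface above. */
struct demo_store {
	long sum;
};

struct demo_store *demo_store_new(void)
{
	/* calloc() zero-initializes, so the sum starts at 0. */
	return calloc(1, sizeof(struct demo_store));
}

void demo_store_add(struct demo_store *s, long value)
{
	assert(s != NULL);
	s->sum += value;
}

long demo_store_sum(const struct demo_store *s)
{
	assert(s != NULL);
	return s->sum;
}

void demo_store_free(struct demo_store *s)
{
	free(s);
}
```

A Rust replacement would presumably export the same four symbols with
`#[no_mangle]` and `extern "C"`, the way this series exports xxh3_64(),
and the C callers would not need to change.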

> > The alternative, of course, is to continue to use C and not take any
> > dependency on Rust. I think there is a middle-ground in there somewhere
> > to be able to build with (e.g.) "make" or "make RUST=1", but I would
> > really like to see the project take a firmer stance here.
> > 
> > I worry that having build support for both "with Rust" and "C only" will
> > create a headache not just at the build system level, but also in the
> > code itself. Having a patchwork of features, optimizations, or bug fixes
> > that either are or aren't supported depending on whether Rust support
> > was specified at build-time seems like a worst-of-all-worlds outcome.
> 
> I definitely agree.  I already find it terribly inconvenient when
> `git grep` doesn't support `-P` and I imagine that having lots
> of features that weren't available would be bothersome.
> 
> I also think that using a combination of C and Rust will end up with us
> still writing a lot of unsafe Rust code to interoperate with C.  If we
> want to reap the benefits in terms of memory and thread safety[2], we'll
> be better off sticking with just Rust.
> 
> I will also say that while it may be more challenging to compile Git at
> first on Windows, as we move more towards an all-Rust codebase, Git may
> end up being easier to maintain there as we depend more on the standard
> library.

Fully agreed. I've said so at the last contributors summit, but I think
it would become awfully unmaintainable if we retain two implementations
of every subsystem that we convert to Rust. If we decide to use Rust I
would strongly advocate for going all-in.

Patrick


* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-19  0:48     ` Haelwenn (lanodan) Monnier
@ 2025-07-22 12:21       ` Patrick Steinhardt
  0 siblings, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 12:21 UTC (permalink / raw)
  To: Haelwenn (lanodan) Monnier
  Cc: Eli Schwartz, Phillip Wood, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren, Edward Thomson, brian m. carlson,
	Taylor Blau

On Sat, Jul 19, 2025 at 02:48:39AM +0200, Haelwenn (lanodan) Monnier wrote:
> [2025-07-18 17:25:01-0400] Eli Schwartz:
> > On 7/18/25 9:34 AM, Phillip Wood wrote:
> > > Hi Ezekiel
> > > 
> > > Thanks for working on this
> > > 
> > > On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> > > 
> > > > So...
> > > > 
> > > > This obviously raises the question of whether we are ready to accept a
> > > > hard
> > > > dependency on Rust. Previous discussions on the mailing list and at Git
> > > > Merge 2024 have not answered that question. If not now, will we be
> > > > willing
> > > > to accept such a hard dependency later? And what route do we want to
> > > > take to
> > > > get there?
> > > 
> > > As far as git goes I think introducing a hard dependency on rust is
> > > fine. It is widely supported, the only issue I'm aware of is the lack of
> > > support on NonStop and I don't think it is reasonable for such a
> > > minority platform to hold the rest of the project to ransom. There is a
> > > question about the other users of the xdiff code though. libgit2 carries
> > > a copy as do other projects like neovim. I've cc'd the libgit2
> > > maintainer and posted a link to this thread in neovim github [1]
> > 
> > 
> > A hard dependency on rust for Gentoo amd64 would potentially require
> > building https://github.com/thepowersgang/mrustc followed by building 13
> > and counting versions of rustc in order to get to the latest version.
> > What is the minimum supported version in this series, by the way?
> > 
> > bin packages for rust do exist but not everyone wants to use non-distro
> > provided binaries, sometimes for auditability reasons.
> > 
> > 
> > For Gentoo HPPA, Alpha, m68k it will simply mean the removal (or end of
> > life and staying forever on 2.50, perhaps) of Git. There is no rust
> > compiler there.
> > 
> > Even s390 support for rust is limited to a precompiled version not
> > everyone is willing to use.
> 
> Also in other distro concerns, if it trickles down to libgit2,
> extra care should be taken to avoid creating circular dependencies
> due to cargo depending on libgit2 (via git2 crate).
> 
> For example with making sure it can reasonably be built via meson's
> Rust support rather than through cargo.

I think it's unlikely that this eventually trickles down into libgit2.
The bundled versions of xdiff have already diverged for a long time, and
unfortunately libgit2 is mostly in maintenance mode nowadays. So I guess
that this change here just means that things will diverge even further
in the future, which is probably okay-ish. After all, the whole xdiff
library didn't really evolve at a fast pace over the last years.

That being said, there is an xdiff fork located at [1] that libgit2
maintains nowadays. So if the Rust dependency ever became a problem for
any of the downstream users I think we could simply redirect them to
that fork and make it the canonical upstream for C-only xdiff.

Patrick

[1]: https://github.com/libgit2/xdiff


* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-18 21:25   ` Eli Schwartz
  2025-07-19  0:48     ` Haelwenn (lanodan) Monnier
@ 2025-07-22 14:24     ` Patrick Steinhardt
  2025-07-22 15:14       ` Eli Schwartz
  2025-07-22 15:56       ` Sam James
  1 sibling, 2 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 14:24 UTC (permalink / raw)
  To: Eli Schwartz
  Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, Edward Thomson, brian m. carlson, Taylor Blau

On Fri, Jul 18, 2025 at 05:25:01PM -0400, Eli Schwartz wrote:
> On 7/18/25 9:34 AM, Phillip Wood wrote:
> > Hi Ezekiel
> > 
> > Thanks for working on this
> > 
> > On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> >
> >> So...
> >>
> >> This obviously raises the question of whether we are ready to accept a
> >> hard
> >> dependency on Rust. Previous discussions on the mailing list and at Git
> >> Merge 2024 have not answered that question. If not now, will we be
> >> willing
> >> to accept such a hard dependency later? And what route do we want to
> >> take to
> >> get there?
> > 
> > As far as git goes I think introducing a hard dependency on rust is
> > fine. It is widely supported, the only issue I'm aware of is the lack of
> > support on NonStop and I don't think it is reasonable for such a
> > minority platform to hold the rest of the project to ransom. There is a
> > question about the other users of the xdiff code though. libgit2 carries
> > a copy as do other projects like neovim. I've cc'd the libgit2
> > maintainer and posted a link to this thread in neovim github [1]
> 
> 
> A hard dependency on rust for Gentoo amd64 would potentially require
> building https://github.com/thepowersgang/mrustc followed by building 13
> and counting versions of rustc in order to get to the latest version.
> What is the minimum supported version in this series, by the way?
> 
> bin packages for rust do exist but not everyone wants to use non-distro
> provided binaries, sometimes for auditability reasons.
> 
> 
> For Gentoo HPPA, Alpha, m68k it will simply mean the removal (or end of
> life and staying forever on 2.50, perhaps) of Git. There is no rust
> compiler there.
> 
> Even s390 support for rust is limited to a precompiled version not
> everyone is willing to use.
> 
> GCC-rs will probably fix this general issue.

Hm. It would be nice to assemble a list of common or semi-common
distributions that do not have proper support for Rust for all or at
least some platforms. Should we maybe consider reaching out to other
distros (e.g. Debian, Fedora, BSDs) before we commit to any change that
has an outsized impact on the larger ecosystem?

I would really love to start adopting Rust, and if it's only going to be
architectures that are extremely niche I'm probably fine with that. But
if there are many small systems that are impacted by such a change we
might have to reconsider.

Meh :/

Patrick


* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 14:24     ` Patrick Steinhardt
@ 2025-07-22 15:14       ` Eli Schwartz
  2025-07-22 15:56       ` Sam James
  1 sibling, 0 replies; 198+ messages in thread
From: Eli Schwartz @ 2025-07-22 15:14 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, Edward Thomson, brian m. carlson, Taylor Blau



On 7/22/25 10:24 AM, Patrick Steinhardt wrote:
> On Fri, Jul 18, 2025 at 05:25:01PM -0400, Eli Schwartz wrote:

>> For Gentoo HPPA, Alpha, m68k it will simply mean the removal (or end of
>> life and staying forever on 2.50, perhaps) of Git. There is no rust
>> compiler there.
>>
>> Even s390 support for rust is limited to a precompiled version not
>> everyone is willing to use.
>>
>> GCC-rs will probably fix this general issue.
> 
> Hm. It would be nice to assemble a list of common or semi-common
> distributions that do not have proper support for Rust for all or at
> least some platforms. Should we maybe consider reaching out to other
> distros (e.g. Debian, Fedora, BSDs) before we commit to any change that
> has an outsized impact on the larger ecosystem?
> 
> I would really love to start adopting Rust, and if it's only going to be
> architectures that are extremely niche I'm probably fine with that. But
> if there are many small systems that are impacted by such a change we
> might have to reconsider.
> 
> Meh :/


To elaborate a bit w.r.t. Gentoo.

Gentoo Prefix-on-macOS and Prefix-on-Solaris don't support rust either.
I think at least macOS is reasonably popular. Obviously Rust supports
macOS, and the Prefix maintainer would like it to work but hasn't been
able to -- no idea why. Arguably you can tell these users "install a
better OS so you can use git".

musl has lots of issues with rust, and is disabled for Gentoo musl
editions on arm (not arm64), ppc, i686, m68k, mips. Arguably you can
tell these users "musl sucks, why are you using it, use glibc like a
sensible person".

i486 is entirely disabled due to mandatory sse2. Hopefully those users
are rare even compared to i686 users. ;)

s390 only works on s390x

sparc 64ul works, but 32ul does not.

riscv rv64gc works, rv32imac does not.

A general trend here is 32-bit issues.


For alpha/hppa, no references at all -- not even tier 3 support -- on
https://doc.rust-lang.org/beta/rustc/platform-support.html, and Gentoo
doesn't support LLVM there either. ;) In general, porting rustc to a new
arch means *first* porting LLVM, and then after that, *also* porting
rustc, so who's going to try the latter before the former? ;)

Hence the interest in GCC-rs, which already has a backend supporting all
this for C/C++/Fortran plus interest by users of these arches in a
portable rust compiler.

If rust is added and doesn't have a fallback C impl, all this becomes a
relevant topic for consideration. (I don't have strong opinions on
optional rust.)


...


See

$ git clone https://github.com/gentoo/gentoo && cd gentoo
$ git grep --name-only features/wd40 profiles/| grep -v 17.0


(wd40 is the inheritance tree for disabling all features in any package
that rely on a rust compiler. See README at
https://github.com/gentoo/gentoo/tree/master/profiles/features/wd40 for
details)

-- 
Eli Schwartz



* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 12:21       ` Patrick Steinhardt
@ 2025-07-22 15:56         ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-22 15:56 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Taylor Blau, Ezekiel Newren via GitGitGadget,
	git, Elijah Newren, Ezekiel Newren

Patrick Steinhardt <ps@pks.im> writes:

> Fully agreed. I've said so at the last contributors summit, but I think
> it would become awfully unmaintainable if we retain two implementations
> of every subsystem that we convert to Rust. If we decide to use Rust I
> would strongly advocate for going all-in.

True.  

We do not have subsystems with clear boundaries yet, and introducing
Rust in such a state would not allow us to pick some parts (e.g.
merge backends, etc.) and do them optionally in Rust, while keeping
and/or adding others in C.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 14:24     ` Patrick Steinhardt
  2025-07-22 15:14       ` Eli Schwartz
@ 2025-07-22 15:56       ` Sam James
  2025-07-23  4:32         ` Patrick Steinhardt
  1 sibling, 1 reply; 198+ messages in thread
From: Sam James @ 2025-07-22 15:56 UTC (permalink / raw)
  To: ps
  Cc: eschwartz, ethomson, ezekielnewren, git, gitgitgadget, me, newren,
	phillip.wood123, sandals

There are a few issues from our perspective:

* Old platforms which don't have LLVM can't yet have Rust either, as
  rustc is based on LLVM.

  These need gccrs to be unblocked. I can understand not caring too much
  about these, though it is unfortunate, because I think if git hadn't
  supported many platforms to begin with, I doubt it'd have the adoption
  it does today.

  (There is another effort which seeks to take rustc and bolt on
  libgccjit as a replacement backend, but that isn't feasible for use
  yet either.)

* New platforms which rustc or various Rust crates don't support, so
  we have to go around patching them.

  The crate model makes this much harder. Not having git available while
  doing such porting natively is going to suck. It also means even more
  software needs Rust ported first.
  
* Platforms which aren't ancient, just not "the default", which tend not
  to work well with Rust.

  For example, rustc assumes that all musl configurations will be
  statically linked, which isn't the case. Working around this is a
  hassle.

* rustc doesn't have LTS releases or the like.

  The only supported release is the latest one. Upgrading to the latest
  release often means we have to deal with new portability problems
  but we can't not upgrade because:
  a) some software will start to require bleeding-edge Rust immediately,
  and
  b) it means we're missing out on bug fixes (miscompilations are
  serious)

* Crate creep

  Rust projects tend to end up having a huge list of crates that they
  pull in, which makes us worried about something nasty creeping in, but
  there are also popular crates with serious portability problems like the
  'ring' crate for TLS.

git is a fundamental piece of system software and making it harder to
build it or use it is a real worry for us.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 22:25   ` Taylor Blau
  2025-07-18  0:29     ` brian m. carlson
@ 2025-07-22 16:03     ` Sam James
  2025-07-22 21:37       ` Elijah Newren
  1 sibling, 1 reply; 198+ messages in thread
From: Sam James @ 2025-07-22 16:03 UTC (permalink / raw)
  To: me; +Cc: ezekielnewren, git, gitgitgadget, newren, sandals, Eli Schwartz

> I am far from a Rust expert, but I think that a more modern, memory-safe
> language will attract newer contributors who may have a fresher
> perspective on the project, and I think that's a good thing.

Aren't they likely to contribute to gitoxide? There, they get a clean
slate without having to deal with the least-fun part (bindings).

> It is also not the Git project's responsibility to ensure that every
> platform is Rust-friendly.

That's true, of course. And nobody is entitled to indefinite updates, but
on the other hand, there's still some implicit contract with users. I
really don't think git would have the adoption it does today if it had
adopted a Rust-like language in the same state Rust is now from the
start.

(In exactly the same way, git doesn't gratuitously break compatibility
every release either. Can it? Yes, and git can change the platforms it
runs on, but it's something to be taken seriously.)

> Hopefully the platforms that we currently support but won't after this
> patch series have niche enough workloads that they do not need the
> absolute latest-and-greatest Git release at all times.

I mention this in my other email, but it's not just about ancient
platforms. It's also about new ones, or ones where Rust supports them
poorly despite them being relevant.

> Yeah, I think that this is the most interesting part of the discussion
> here. I am not knowledgeable enough about Rust's release cadence and
> platform compatibility to have an opinion here. But I trust brian's
> judgement ;-).

It gets a new release every 6 weeks and no other releases are supported.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 16:03     ` Sam James
@ 2025-07-22 21:37       ` Elijah Newren
  2025-07-22 21:55         ` Sam James
  0 siblings, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-07-22 21:37 UTC (permalink / raw)
  To: Sam James; +Cc: me, ezekielnewren, git, gitgitgadget, sandals, Eli Schwartz

Hi,

On Tue, Jul 22, 2025 at 9:03 AM Sam James <sam@gentoo.org> wrote:

First of all, thanks to all the Gentoo folks for chiming in and
providing specifics about platforms and their state.

> > I am far from a Rust expert, but I think that a more modern, memory-safe
> > language will attract newer contributors who may have a fresher
> > perspective on the project, and I think that's a good thing.
>
> Aren't they likely to contribute to gitoxide? There, they get a clean
> slate without having to deal with the least-fun part (bindings).

I'm sure some are.  But clearly there are others where the draw is
improving git itself because of its installed base; in fact, we need
look no further than this exact series we are commenting on to find
proof of that -- one such new contributor submitted patches to use
Rust in git, and found a significant speedup while doing so.

Further, there's considerable interest from existing git developers to
use Rust in git as well; last year at the Git contributor summit,
usage of Rust in git was not only one of the topics of discussion, it
was the top voted topic (meaning, the topic that the greatest number
of git contributors wanted to discuss).

> > It is also not the Git project's responsibility to ensure that every
> > platform is Rust-friendly.
>
> That's true, of course. And nobody is entitled to indefinite updates, but
> on the other hand, there's still some implicit contract with users. I
> really don't think git would have the adoption it does today if it had
> adopted a Rust-like language in the same state Rust is now from the
> start.
>
> (In exactly the same way, git doesn't gratuitously break compatibility
> every release either. Can it? Yes, and git can change the platforms it
> runs on, but it's something to be taken seriously.)

This feels kind of close to a false dichotomy between breaking
compatibility every release and indefinite update entitlements.  There
is certainly some middle ground: discussing reducing the breadth of
platform support in order to gain other benefits, then gathering
feedback, making a plan, and announcing the upcoming change, etc.

And we're already pretty deep into it.  Concerns about losing out on
some platforms have repeatedly slowed us down from adopting Rust years
ago.  Yet, the desire for Rust adoption keeps coming up anyway; see
the threads starting at

  * https://lore.kernel.org/git/ZZ77NQkSuiRxRDwt@nand.local/
  * https://lore.kernel.org/git/Zu2D%2Fb1ZJbTlC1ml@nand.local/
  * https://lore.kernel.org/git/20241128-pks-meson-v10-22-79a3fb0cb3a6@pks.im/
    (search for "Rust")
  * https://lore.kernel.org/git/cover.1723242556.git.steadmon@google.com/

The discussion has also been picked up and reported outside the Git
mailing list, e.g. https://lwn.net/Articles/998115/.

And so, in addition to the optional contrib/libgit-rs and
contrib/libgit-sys Rust components that have already been merged into
git, and a new build system added in part to make it easier to adopt
Rust, we now have the first patch series that proposes a hard
dependency on Rust.

Further, I'd like to comment a bit on the support of our users from
another angle.  We're also responsible for security for our users, and
feel Rust would help (see e.g.
https://litchipi.github.io/infosec/2023/01/24/git-code-audit-viewed-as-rust-programmer.html
and https://github.com/bk2204/git/commit/fbeb1180c7473635a964daed2da642c53487782d).
We're responsible for performance of Git for our users, and feel Rust
would help (see the email that started this thread,
https://lore.kernel.org/git/CABPp-BFOmwV-xBtjvtenb6RFz9wx2VWVpTeho0k=D8wsCCVwqQ@mail.gmail.com/,
and brian's notes about [CPU multi-]threading elsewhere in this email
thread we are in).  And there are other benefits from using Rust that
we believe would benefit our users.  Thus, it's not just a question of
responsibility to our users, because such a responsibility pulls us in
different directions regarding usage of Rust.  So we need to figure
out how to weigh the needs of our different users.  For many of us,
and forgive the geeky comparison, we'll probably weigh those needs
with something more akin to an L2 norm (most good for the most users)
rather than an L-infinity norm (maximal difference in usability for a
single user), which probably isn't to your liking.
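
To make the norm comparison concrete, here is a minimal Rust sketch. It
assumes, purely for illustration, that the cost a change imposes on each
user can be written as a single number; the function names and the sample
figures below are hypothetical and are not taken from git or from this
thread.

```rust
/// L2 (Euclidean) norm: square root of the sum of squared per-user costs.
/// Every user's cost contributes to the result.
pub fn l2_norm(costs: &[f64]) -> f64 {
    costs.iter().map(|c| c * c).sum::<f64>().sqrt()
}

/// L-infinity norm: the largest absolute per-user cost.
/// Only the worst-off user determines the result.
pub fn linf_norm(costs: &[f64]) -> f64 {
    costs.iter().fold(0.0_f64, |worst, c| worst.max(c.abs()))
}
```

With hypothetical costs [1.0, 1.0, 1.0, 1.0] (a small cost for everyone)
and [0.0, 0.0, 0.0, 1.5] (a larger cost for one user), the L2 norm ranks
the second option as cheaper (1.5 versus 2.0), while the L-infinity norm
ranks the first as cheaper (1.0 versus 1.5).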

Anyway, there's been lots of discussion already.  We can certainly
still discuss more about exactly how to announce, when to adopt Rust,
whether we'll support an existing C-only version of git for a longer
period of time than normal, and even whether to continue to delay
adopting Rust for a little longer.  But my personal guess is that
attempting to stop adoption of Rust is unlikely to win at this point.

> > Hopefully the platforms that we currently support but won't after this
> > patch series have niche enough workloads that they do not need the
> > absolute latest-and-greatest Git release at all times.
>
> I mention this in my other email, but it's not just about ancient
> platforms. It's also about new ones, or ones where Rust supports them
> poorly despite them being relevant.

This feels like you're trying to push the decision for a given
platform to be a dichotomy between latest-and-greatest-Git or
no-version-of-Git-at-all, despite the fact that Taylor suggested an
alternative and you even quoted him.  Can you comment on that
alternative?  Why would using the last C-only version of Git[1] until
gccrs bridges the gap be a problem for these platforms?

Thanks,
Elijah

[1] Well, C-only other than optional Rust components like
contrib/libgit-rs and contrib/libgit-sys that have already been
released.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 21:37       ` Elijah Newren
@ 2025-07-22 21:55         ` Sam James
  2025-07-22 22:08           ` Collin Funk
  0 siblings, 1 reply; 198+ messages in thread
From: Sam James @ 2025-07-22 21:55 UTC (permalink / raw)
  To: Elijah Newren; +Cc: me, ezekielnewren, git, gitgitgadget, sandals, Eli Schwartz

Elijah Newren <newren@gmail.com> writes:

> Hi,
>
> On Tue, Jul 22, 2025 at 9:03 AM Sam James <sam@gentoo.org> wrote:
>
> First of all, thanks to all the Gentoo folks for chiming in and
> providing specifics about platforms and their state.

Thanks. I've been trying to be very specific about what the issues
are. I don't deny Rust is the future of many projects, but it still has
rough parts, and I'd like to ensure they're discussed. It is not my
intention to just scream whenever someone considers adopting Rust.

>
>> > I am far from a Rust expert, but I think that a more modern, memory-safe
>> > language will attract newer contributors who may have a fresher
>> > perspective on the project, and I think that's a good thing.
>>
>> Aren't they likely to contribute to gitoxide? There, they get a clean
>> slate without having to deal with the least-fun part (bindings).
>
> I'm sure some are.  But clearly there are others where the draw is
> improving git itself because of its installed base; in fact, we need
> look no further than this exact series we are commenting on to find
> proof of that -- one such new contributor submitted patches to use
> Rust in git, and found a significant speedup while doing so.

Part of my opinion there is coloured by how generally working on a
polylang codebase often has pain when dealing with bindings and the
edges, so I figure that anyone most-keen on Rust would surely want to
avoid that ;)

>
> Further, there's considerable interest from existing git developers to
> use Rust in git as well; last year at the Git contributor summit,
> usage of Rust in git was not only one of the topics of discussion, it
> was the top voted topic (meaning, the topic that the greatest number
> of git contributors wanted to discuss).
>
>> > It is also not the Git project's responsibility to ensure that every
>> > platform is Rust-friendly.
>>
>> That's true, of course. And nobody is entitled to indefinite updates, but
>> on the other hand, there's still some implicit contract with users. I
>> really don't think git would have the adoption it does today if it had
>> adopted a Rust-like language in the same state Rust is now from the
>> start.
>>
>> (In exactly the same way, git doesn't gratuitously break compatibility
>> every release either. Can it? Yes, and git can change the platforms it
>> runs on, but it's something to be taken seriously.)
>
> This feels kind of close to a false dichotomy between breaking
> compatibility every release and indefinite update entitlements.  There
> is certainly some middle ground: discussing reducing the breadth of
> platform support in order to gain other benefits, then gathering
> feedback, making a plan, and announcing the upcoming change, etc.
>

Of course. I'm just making the point that it is indeed a compatibility
change, and perhaps that perspective is useful.

> And we're already pretty deep into it.  Concerns about losing out on
> some platforms have repeatedly slowed us down from adopting Rust years
> ago.  Yet, the desire for Rust adoption keeps coming up anyway; see
> the threads starting at
>
>   * https://lore.kernel.org/git/ZZ77NQkSuiRxRDwt@nand.local/
>   * https://lore.kernel.org/git/Zu2D%2Fb1ZJbTlC1ml@nand.local/
>   * https://lore.kernel.org/git/20241128-pks-meson-v10-22-79a3fb0cb3a6@pks.im/
>     (search for "Rust")
>   * https://lore.kernel.org/git/cover.1723242556.git.steadmon@google.com/
>
> The discussion has also been picked up and reported outside the Git
> mailing list, e.g. https://lwn.net/Articles/998115/.
>
> And so, in addition to the optional contrib/libgit-rs and
> contrib/libgit-sys Rust components that have already been merged into
> git, and a new build system added in part to make it easier to adopt
> Rust, we now have the first patch series that proposes a hard
> dependency on Rust.

Yes, that's why it's of concern. I have no issue with the optional parts.

>
> Further, I'd like to comment a bit on the support of our users from
> another angle.  We're also responsible for security for our users

Supply-chain issues become more of a problem with Rust if we end up
making heavy use of crates. A policy moderating their use is something
we should talk about.

> and
> feel Rust would help (see e.g.
> https://litchipi.github.io/infosec/2023/01/24/git-code-audit-viewed-as-rust-programmer.html
> and https://github.com/bk2204/git/commit/fbeb1180c7473635a964daed2da642c53487782d).
> We're responsible for performance of Git for our users, and feel Rust
> would help (see the email that started this thread,
> https://lore.kernel.org/git/CABPp-BFOmwV-xBtjvtenb6RFz9wx2VWVpTeho0k=D8wsCCVwqQ@mail.gmail.com/,
> and brian's notes about [CPU multi-]threading elsewhere in this email
> thread we are in).  And there are other benefits from using Rust that
> we believe would benefit our users.  Thus, it's not just a question of
> responsibility to our users, because such a responsibility pulls us in
> different directions regarding usage of Rust.  So we need to figure
> out how to weigh the needs of our different users.  For many of us,
> and forgive the geeky comparison, we'll probably weigh those needs
> with something more akin to an L2 norm (most good for the most users)
> rather than an L-infinity norm (maximal difference in usability for a
> single user), which probably isn't to your liking.
>

:)

> Anyway, there's been lots of discussion already.  We can certainly
> still discuss more about exactly how to announce, when to adopt Rust,
> whether we'll support an existing C-only version of git for a longer
> period of time than normal, and even whether to continue to delay
> adopting Rust for a little longer.  But my personal guess is that
> attempting to stop adoption of Rust is unlikely to win at this point.

I wouldn't characterise my position as attempting to flat-out stop
adoption of Rust (see beginning of this email).

>
>> > Hopefully the platforms that we currently support but won't after this
>> > patch series have niche enough workloads that they do not need the
>> > absolute latest-and-greatest Git release at all times.
>>
>> I mention this in my other email, but it's not just about ancient
>> platforms. It's also about new ones, or ones where Rust supports them
>> poorly despite them being relevant.
>
> This feels like you're trying to push the decision for a given
> platform to be a dichotomy between latest-and-greatest-Git or
> no-version-of-Git-at-all, despite the fact that Taylor suggested an
> alternative and you even quoted him.  Can you comment on that
> alternative?  Why would using the last C-only version of Git[1] until
> gccrs bridges the gap be a problem for these platforms?

What I was saying there was: it matters for platforms where they may not
have a git at all (because they're new, and we have a bit of a
bootstrapping problem), not just old ones where they're stuck on an old git.

Part of what I had in mind here is that sticking on old versions even
temporarily isn't necessarily a great option, see the recent issues w/
backports done (https://lore.kernel.org/git/xmqqldov4rpt.fsf@gitster.g/,
and https://lore.kernel.org/git/20250708210529.1214574-1-tmz@pobox.com/).

---

As a final note: I am genuinely not trying to be a member of the peanut
gallery wishing to prevent git's progress if the project wants to adopt
Rust; it's just that there are some real practical obstacles for us right now.

I hope it didn't come across that way, but "dichotomy" appearing a few
times in your reply made me fear it did.

>
> Thanks,
> Elijah

thanks,
sam

>
> [1] Well, C-only other than optional Rust components like
> contrib/libgit-rs and contrib/libgit-sys that have already been
> released.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-17 21:30   ` brian m. carlson
                       ` (2 preceding siblings ...)
  2025-07-18 23:15     ` Ezekiel Newren
@ 2025-07-22 22:02     ` Mike Hommey
  2025-07-22 23:52       ` brian m. carlson
  3 siblings, 1 reply; 198+ messages in thread
From: Mike Hommey @ 2025-07-22 22:02 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren

On Thu, Jul 17, 2025 at 09:30:43PM +0000, brian m. carlson wrote:
> On 2025-07-17 at 20:32:18, Ezekiel Newren via GitGitGadget wrote:
> > diff --git a/rust/Cargo.lock b/rust/Cargo.lock
> > new file mode 100644
> > index 000000000000..fb1eac690b39
> > --- /dev/null
> > +++ b/rust/Cargo.lock
> > @@ -0,0 +1,14 @@
> > +# This file is automatically @generated by Cargo.
> > +# It is not intended for manual editing.
> > +version = 4
> > +
> > +[[package]]
> > +name = "interop"
> > +version = "0.1.0"
> > +
> > +[[package]]
> > +name = "xdiff"
> > +version = "0.1.0"
> > +dependencies = [
> > + "interop",
> > +]
> 
> I would prefer that we not check in Cargo.lock in Git.  Part of the
> reason is that it changes across versions and so building with a
> different version of the toolchain can update the file.

That actually doesn't happen unless the file needs to be updated for
some reason, like Cargo.toml having new dependencies or `cargo update`
being run.

Mike

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 21:55         ` Sam James
@ 2025-07-22 22:08           ` Collin Funk
  0 siblings, 0 replies; 198+ messages in thread
From: Collin Funk @ 2025-07-22 22:08 UTC (permalink / raw)
  To: Sam James
  Cc: Elijah Newren, me, ezekielnewren, git, gitgitgadget, sandals,
	Eli Schwartz

Sam James <sam@gentoo.org> writes:

>> Further, I'd like to comment a bit on the support of our users from
>> another angle.  We're also responsible for security for our users
>
> Supply-chain issues become more of a problem with Rust if we end up
> making heavy use of crates. A policy moderating their use is something
> we should talk about.

+1. I find it a bit worrying when I see 500+ dependencies (mostly
transitive) being downloaded when running 'cargo build'.

Not saying we should go to the extreme of Not Invented Here syndrome
[1], since easy use of packages via 'cargo' is a major reason why people
enjoy Rust. But we should consider whether they provide enough value to
be included.

Collin

[1] https://en.wikipedia.org/wiki/Not_invented_here

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-22 22:02     ` Mike Hommey
@ 2025-07-22 23:52       ` brian m. carlson
  0 siblings, 0 replies; 198+ messages in thread
From: brian m. carlson @ 2025-07-22 23:52 UTC (permalink / raw)
  To: Mike Hommey
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren

On 2025-07-22 at 22:02:33, Mike Hommey wrote:
> On Thu, Jul 17, 2025 at 09:30:43PM +0000, brian m. carlson wrote:
> > I would prefer that we not check in Cargo.lock in Git.  Part of the
> > reason is that it changes across versions and so building with a
> > different version of the toolchain can update the file.
> 
> That actually doesn't happen unless the file needs to be updated for
> some reason, like Cargo.toml having new dependencies or `cargo update`
> being run.

I've actually seen several cases in my local Rust development where
Cargo wants to update the file despite it not being necessary, and
`--locked` simply refuses to work without good cause.  Perhaps those
cases have been fixed, but it has happened in older versions.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-22 15:56       ` Sam James
@ 2025-07-23  4:32         ` Patrick Steinhardt
  2025-07-24  9:01           ` Pierre-Emmanuel Patry
  0 siblings, 1 reply; 198+ messages in thread
From: Patrick Steinhardt @ 2025-07-23  4:32 UTC (permalink / raw)
  To: Sam James
  Cc: eschwartz, ethomson, ezekielnewren, git, gitgitgadget, me, newren,
	phillip.wood123, sandals

On Tue, Jul 22, 2025 at 04:56:12PM +0100, Sam James wrote:
> There are a few issues from our perspective:
> 
> * Old platforms which don't have LLVM can't yet have Rust either, as
>   rustc is based on LLVM.
> 
>   These need gccrs to be unblocked. I can understand not caring too much
>   about these, though it is unfortunate, because I think if git hadn't
>   supported many platforms to begin with, I doubt it'd have the adoption
>   it does today.
> 
>   (There is another effort which seeks to take rustc and bolt on
>   libgccjit as a replacement backend, but that isn't feasible for use
>   yet either.)

It would be great to know about the general timelines of these
alternative implementations. If e.g. gccrs were to achieve compatibility
with one of the editions of Rust next year it would be a good enough
reason to defer the rustification from my point of view so that we don't
break the ecosystem and have wider platform support. If the answer is
"They'll land in 10 years" then I don't know...

I sifted through their project sites and found various status reports,
and they do seem to be making steady progress. But as far as I see
critical language features are still missing as of now.

[snip]
> * rustc doesn't have LTS releases or the like.
> 
>   The only supported release is the latest one. Upgrading to the latest
>   release often means we have to deal with new portability problems
>   but we can't not upgrade because:
>   a) some software will start to require bleeding-edge Rust immediately,
>   and
>   b) it means we're missing out on bug fixes (miscompilations are
>   serious)

I'm not a big fan of this in the Rust ecosystem indeed. It feels like
every second project requires nightly features or at least a version of
the compiler that was released in the last couple months. This may work
for a language like Go, which is more targeted towards deploying server
applications. But for a system-level language like Rust I think it's
rather a sign of it being immature.

In any case, the burden would fall on us to ensure that we carefully
consider which version of Rust to target. And as it was said elsewhere
in the thread, we would need to make sure that things build on old
versions of Debian. Which may be easier said than done if we also rely
on lots of crates which may update to newer Rust versions at any point
in time.

> * Crate creep
> 
>   Rust projects tend to end up having a huge list of crates that they
>   pull in, which makes us worried about something nasty creeping in, but
>   there are also popular crates with serious portability problems like the
>   'ring' crate for TLS.

True. I think if we were to adopt Rust we ought to be as conservative as
we are now with picking up new dependencies. I don't want to have a big
open door for supply chain attacks. And neither do I want to be forced
into the situation where we cannot update a crate because they decided
to drop support for older Rust versions.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-18 23:15     ` Ezekiel Newren
@ 2025-07-23 21:57       ` brian m. carlson
  2025-07-23 22:26         ` Junio C Hamano
  2025-07-28 19:11         ` Ezekiel Newren
  0 siblings, 2 replies; 198+ messages in thread
From: brian m. carlson @ 2025-07-23 21:57 UTC (permalink / raw)
  To: Ezekiel Newren; +Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren

On 2025-07-18 at 23:15:19, Ezekiel Newren wrote:
> This goes against what I think is best practices.  Don’t we need
> Cargo.lock to audit and debug platform specific issues, and to ensure
> reproducibility?  Without Cargo.lock, we might get different results
> one minute to the next if one of our dependencies releases a new
> version. Checking in Cargo.lock aligns with Cargo’s documented best
> practices (https://doc.rust-lang.org/cargo/faq.html#why-have-cargolock-in-version-control).

I appreciate that, but best practices also don't limit software to a
six-week lifespan.  Rust the language is a great tool, but we also have
a special case here in that we need to support software that upstream
does not and that we care about OS distros, which upstream does not.

Note that when someone builds locally, a Cargo.lock will be created and
they will get reproducible builds from that point on.  It is only on
first build that they will get whatever's the latest.

> I understand your concern and I agree that this could become a
> problem. I’m totally flexible on which rust version should be used,
> but without Cargo.lock checked in we lose the ability to audit why a
> build failed. I think that this will be a pain point, but numbing that
> pain means we can’t solve intermittent problems due to dependencies in
> the future.

I was one of the maintainers for Git LFS for several years.  We
routinely had people come to us and say, "This dependency you're using
has a portion that you're not using, which has a CVE.  I demand you
update it and do a new release immediately because our security scanner
is going off and our company policy is that there be no exceptions."
This happens literally all the time and I absolutely in no case want to
see those people on this list or the security list.

So the options as I see them are (a) we don't check in Cargo.lock, (b)
we convince the Rust project and the ecosystem to provide LTS releases
with security fixes, or (c) we only accept dependencies that have our
same lifetime policy (which are very few and far between).  I know this
makes builds unreproducible (although not under the Reproducible Builds
project's definitions), but we really don't have many alternatives.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-23 21:57       ` brian m. carlson
@ 2025-07-23 22:26         ` Junio C Hamano
  2025-07-28 19:11         ` Ezekiel Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-23 22:26 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Ezekiel Newren, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I was one of the maintainers for Git LFS for several years.  We
> routinely had people come to us and say, "This dependency you're using
> has a portion that you're not using, which has a CVE.  I demand you
> update it and do a new release immediately because our security scanner
> is going off and our company policy is that there be no exceptions."
> This happens literally all the time and I absolutely in no case want to
> see those people on this list or the security list.

Ahh, the kind we love not to have.

> So the options as I see them are (a) we don't check in Cargo.lock, (b)
> we convince the Rust project and the ecosystem to provide LTS releases
> with security fixes, or (c) we only accept dependencies that have our
> same lifetime policy (which are very few and far between).  I know this
> makes builds unreproducible (although not under the Reproducible Builds
> project's definitions), but we really don't have many alternatives.

Thanks for a well reasoned argument.

Hopefully as Rust matures more, some of these issues (starting with
"6 weeks and it is too old to bother") would resolve themselves, but
until then we'd need to be careful.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-23  4:32         ` Patrick Steinhardt
@ 2025-07-24  9:01           ` Pierre-Emmanuel Patry
  2025-07-24 10:00             ` Patrick Steinhardt
  0 siblings, 1 reply; 198+ messages in thread
From: Pierre-Emmanuel Patry @ 2025-07-24  9:01 UTC (permalink / raw)
  To: ps
  Cc: eschwartz, ethomson, ezekielnewren, git, gitgitgadget, me, newren,
	phillip.wood123, sam, sandals


On Wed, Jul 23, 2025 at 06:32:06 +0200, Patrick Steinhardt wrote:
 > It would be great to know about the general timelines of these
 > alternative implementations.

We still think we'll be able to compile libcore before the end of the
summer; we've made great progress and few items are left. But keep in
mind we're targeting an older version of Rust (1.49) and libcore is
smaller than the standard library. We still have a lot of testing to do
and we expect many bugs.

The next targeted version will probably be rust 1.78 as we want to keep 
up with rust for linux. This shouldn't be too long as most of the 
features are coming from either standard library modifications or 
nightly features we already had to support for 1.49.

We expect to be able to compile some 1.49 code correctly next year at 
best. I would like to bring to your attention rustc_codegen_gcc which 
adds a gcc backend to the rustc frontend, although not a full gcc 
compiler it could help supporting some architectures that are currently 
not supported by llvm.

Pierre-Emmanuel

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-24  9:01           ` Pierre-Emmanuel Patry
@ 2025-07-24 10:00             ` Patrick Steinhardt
  2025-07-28  9:06               ` Pierre-Emmanuel Patry
  0 siblings, 1 reply; 198+ messages in thread
From: Patrick Steinhardt @ 2025-07-24 10:00 UTC (permalink / raw)
  To: Pierre-Emmanuel Patry
  Cc: eschwartz, ethomson, ezekielnewren, git, gitgitgadget, me, newren,
	phillip.wood123, sam, sandals

On Thu, Jul 24, 2025 at 11:01:22AM +0200, Pierre-Emmanuel Patry wrote:
> 
> On Tue, Jul 23, 2025 at 06:32:06 +0200, Patrick Steinhardt wrote:
> > It would be great to know about the general timelines of these
> > alternative implementations.
> 
> We still think we'll be able to compile libcore before the end of the
> summer, we've made great progress and few items are left. But keep in mind
> we're targeting an older version of rust (1.49) and libcore is smaller than
> the standard library. We still have a lot of testing to do and we expect
> many bugs.

Understood. Given that we don't plan to roll with the latest version of
Rust anyway I think it could be a viable tradeoff for us to also
consider gccrs when we determine the minimum required Rust version.

> The next targeted version will probably be rust 1.78 as we want to keep up
> with rust for linux. This shouldn't be too long as most of the features are
> coming from either standard library modifications or nightly features we
> already had to support for 1.49.
> 
> We expect to be able to compile some 1.49 code correctly next year at best.

And I expect that 1.78 will be another significant effort that won't
land before the year after?

> I would like to bring to your attention rustc_codegen_gcc which adds a gcc
> backend to the rustc frontend, although not a full gcc compiler it could
> help supporting some architectures that are currently not supported by llvm.

For my own understanding: is this something that the Git project would
have to support or something that the distributor needs to set up?

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 7/7] github_workflows: install rust
  2025-07-18 23:01     ` Ezekiel Newren
@ 2025-07-25 23:56       ` Ben Knoble
  0 siblings, 0 replies; 198+ messages in thread
From: Ben Knoble @ 2025-07-25 23:56 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: brian m. carlson, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren


> On Jul 18, 2025, at 19:04, Ezekiel Newren <ezekielnewren@gmail.com> wrote:
> 
> On Thu, Jul 17, 2025 at 3:23 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
>> 
>>> On 2025-07-17 at 20:32:24, Ezekiel Newren via GitGitGadget wrote:
>>> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
>>> index 7dbf9f7f123c..8aac18a6ba45 100644
>>> --- a/.github/workflows/main.yml
>>> +++ b/.github/workflows/main.yml
>>> @@ -4,6 +4,7 @@ on: [push, pull_request]
>>> 
>>> env:
>>>   DEVELOPER: 1
>>> +  RUST_VERSION: 1.87.0
>> 
>> Our discussed plan is to support the version in Debian stable, plus a
>> year.  So we'd be supporting 1.63.0 for a year after trixie's release.
>> 
>> The reason for that is that people build backports and security updates
>> for Git for stable releases of distros and they will use the distro
>> toolchain for doing so.  Forcing distros to constantly build with the
>> latest toolchain is pretty hostile, especially since the lifespan of
>> Rust release is six weeks.
>> 
>> If the Rust project provides LTS releases in the future, then we can
>> consider adopting those.
> 
> The RUST_VERSION variable in .github/workflows/main.yml had to have a
> specific version. 1.87.0 was selected since that's what I was using
> locally. Elijah made me aware that an older version of rust might be
> desired, but didn't know which one. I'll switch to 1.63.0 or whatever
> the community decides.
> 
>>> +if [ "$rust_target" = "release" ]; then
>>> +  rust_args="--release"
>>> +  export RUSTFLAGS='-Aunused_imports -Adead_code'
>>> +elif [ "$rust_target" = "debug" ]; then
>>> +  rust_args=""
>>> +  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
>> 
>> Can you say a little about why these options are needed and the defaults
>> are inadequate?  For instance, I build with the default options both in
>> my personal projects and at work and don't see a problem.
> 
> What I found is that if I have a Rust function
> 
> #[no_mangle]
> pub fn call_from_c(arg: u64) {}
> 
> which is only meant to be called from C and isn’t called from
> elsewhere in Rust, then cargo will misidentify this function as dead
> code.  This was the reason for adding ‘-Adead_code’.

Are functions that exist for C FFI callers supposed to be marked unsafe, and if so, does that prevent the dead code analyzer from flagging them without the allow flag?

Or, alternatively, can we annotate them with #[allow(dead_code)]? It might be noisy, but it would still let the checkers flag code that is actually dead.

> 
> The reason for adding ‘-Aunused_imports’ is somewhat IDE related; if I
> paste code somewhere, RustRover will sometimes automatically add the
> necessary imports.  However, if I delete a chunk of code, it’ll
> highlight the imports that are no longer used if I scroll to the top
> of the file, but it won’t automatically remove them.  Since they
> aren’t automatically removed, it’s easier to build with
> ‘-Aunused_imports’.

Similarly, I would think this is a problem with the IDE rather than something we want to end up with in the implementation.

I wonder if one of the cargo fix type things can remove them for you, too, so that it’s easier to just drop this allow. 
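
For illustration, a minimal sketch of the item-level alternative asked about above: instead of passing -Adead_code to the whole build through RUSTFLAGS, the allowance sits on the one function that exists for C callers. The function names are hypothetical, and a real export would also carry #[no_mangle], which is left out here so the fragment stays self-contained. A function that only takes plain integers does not need to be marked unsafe.

```rust
/// Hypothetical entry point that is only ever called from C.
/// The allow attribute is scoped to this item, so the dead_code lint
/// keeps reporting genuinely unused code everywhere else in the crate.
#[allow(dead_code)]
pub extern "C" fn call_from_c(arg: u64) -> u64 {
    add_one(arg)
}

/// Ordinary Rust helper; it is reachable through call_from_c, so it
/// needs no annotation of its own.
fn add_one(value: u64) -> u64 {
    value.wrapping_add(1)
}
```

With this layout the RUSTFLAGS setting would only be needed for options that really are build-wide, such as the debuginfo and frame-pointer flags.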

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification
  2025-07-24 10:00             ` Patrick Steinhardt
@ 2025-07-28  9:06               ` Pierre-Emmanuel Patry
  0 siblings, 0 replies; 198+ messages in thread
From: Pierre-Emmanuel Patry @ 2025-07-28  9:06 UTC (permalink / raw)
  To: ps
  Cc: eschwartz, ethomson, ezekielnewren, git, gitgitgadget, me, newren,
	phillip.wood123, pierre-emmanuel.patry, sam, sandals


> And I expect that 1.78 will be another significant effort that won't
> land before the year after?

Yes, even though it will be easier once the foundations are laid, I 
wouldn't expect 1.78 before at least a year after that.

> For my own understanding: is this something that the Git project would
> have to support or something that the distributor needs to set up?

I would say it is something the distributor needs to set up. From what I 
remember rustc requires a flag with the path to the rustc_codegen_gcc 
backend. This means that as long as the distributor has the alternative 
backend and a way to inject a flag to rustc through an environment 
variable it should be mostly fine for them.

Pierre-Emmanuel

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-23 21:57       ` brian m. carlson
  2025-07-23 22:26         ` Junio C Hamano
@ 2025-07-28 19:11         ` Ezekiel Newren
  2025-07-31 22:37           ` brian m. carlson
  1 sibling, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-28 19:11 UTC (permalink / raw)
  To: brian m. carlson, Ezekiel Newren, Ezekiel Newren via GitGitGadget,
	git, Elijah Newren

On Wed, Jul 23, 2025 at 3:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-07-18 at 23:15:19, Ezekiel Newren wrote:
> > This goes against what I think is best practices.  Don’t we need
> > Cargo.lock to audit and debug platform specific issues, and to ensure
> > reproducibility?  Without Cargo.lock, we might get different results
> > one minute to the next if one of our dependencies releases a new
> > version. Checking in Cargo.lock aligns with Cargo’s documented best
> > practices (https://doc.rust-lang.org/cargo/faq.html#why-have-cargolock-in-version-control).
>
> I appreciate that, but best practices also don't limit software to a
> six-week lifespan.  Rust the language is a great tool, but we also have
> a special case here in that we need to support software that upstream
> does not and that we care about OS distros, which upstream does not.
>
> Note that when someone builds locally, a Cargo.lock will be created and
> they will get reproducible builds from that point on.  It is only on
> first build that they will get whatever's the latest.
>
> > I understand your concern and I agree that this could become a
> > problem. I’m totally flexible on which rust version should be used,
> > but without Cargo.lock checked in we lose the ability to audit why a
> > build failed. I think that this will be a pain point, but numbing that
> > pain means we can’t solve intermittent problems due to dependencies in
> > the future.
>
> I was one of the maintainers for Git LFS for several years.  We
> routinely had people come to us and say, "This dependency you're using
> has a portion that you're not using, which has a CVE.  I demand you
> update it and do a new release immediately because our security scanner
> is going off and our company policy is that there be no exceptions."
> This happens literally all the time and I absolutely in no case want to
> see those people on this list or the security list.
>
> So the options as I see them are (a) we don't check in Cargo.lock, (b)
> we convince the Rust project and the ecosystem to provide LTS releases
> with security fixes, or (c) we only accept dependencies that have our
> same lifetime policy (which are very few and far between).  I know this
> makes builds unreproducible (although not under the Reproducible Builds
> project's definitions), but we really don't have many alternatives.
> --
> brian m. carlson (they/them)
> Toronto, Ontario, CA

I like having the Cargo.lock file to figure out why a build worked on
one system, but not another. After talking with Elijah I've decided
that a good solution would be to add Cargo.lock to .gitignore and
change the github workflows to ensure that Cargo.lock is preserved for
all builds. We should also add a comment to Cargo.toml stating that
any build or test issues should include the Cargo.lock that was
generated when asking for help. What does the community think of this
solution?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-18 13:35   ` Phillip Wood
@ 2025-07-28 19:34     ` Ezekiel Newren
  2025-07-28 19:52       ` Phillip Wood
  2025-07-28 20:00       ` Collin Funk
  0 siblings, 2 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-28 19:34 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

On Fri, Jul 18, 2025 at 7:35 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Ezekiel
>
> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > A few commits ago, we added definitions for Rust primitive types,
> > to facilitate interoperability between C and Rust. Switch a
> > few variables to use these types. Which, for now, will
> > require adding some casts.
>
> How necessary is it to change 'char' to 'u8' so long as the rust and C
> sides both use a type that is the same size? Also what's the advantage
> of using these typedefs rather than the normal C types like uint8_t ?

Rust defines char as 32 bits. C treats char as signed 8 bits. What git
really means by char* is treat everything like a byte string, and u8
is how raw bytes are handled in Rust.

> > diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> > index 5a96e36dfbea..3b364c61f671 100644
> > --- a/xdiff/xdiffi.c
> > +++ b/xdiff/xdiffi.c
> > @@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
> >       long i;
> >       int ret = 0;
> >
> > -     for (i = 0; i < rec->size; i++) {
> > +     for (i = 0; i < (long) rec->size; i++) {
>
> i is a loop counter and array index so we can lose this cast by
> changing i to size_t

Ok, but I'm going to change the type of i to usize and stuff it inside
the loop i.e. for (usize i = 0; ...

> Thanks
>
> Phillip

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-28 19:34     ` Ezekiel Newren
@ 2025-07-28 19:52       ` Phillip Wood
  2025-07-28 20:14         ` Ezekiel Newren
  2025-07-28 20:53         ` Junio C Hamano
  2025-07-28 20:00       ` Collin Funk
  1 sibling, 2 replies; 198+ messages in thread
From: Phillip Wood @ 2025-07-28 19:52 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

On 28/07/2025 20:34, Ezekiel Newren wrote:
> On Fri, Jul 18, 2025 at 7:35 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
>>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>>
>>> A few commits ago, we added definitions for Rust primitive types,
>>> to facilitate interoperability between C and Rust. Switch a
>>> few variables to use these types. Which, for now, will
>>> require adding some casts.
>>
>> How necessary is it to change 'char' to 'u8' so long as the rust and C
>> sides both use a type that is the same size? Also what's the advantage
>> of using these typedefs rather than the normal C types like uint8_t ?
> 
> Rust defines char as 32 bits. C treats char as signed 8 bits. What git
> really means by char* is treat everything like a byte string, and u8
> is how raw bytes are handled in Rust.

Right - we need to use u8 on the rust side but I'm trying to understand 
why we need to change the type on the C side and why do we need typedefs 
like usize and u32 on the C side when we already have size_t and uint32_t?

Thanks

Phillip

>>> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
>>> index 5a96e36dfbea..3b364c61f671 100644
>>> --- a/xdiff/xdiffi.c
>>> +++ b/xdiff/xdiffi.c
>>> @@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
>>>        long i;
>>>        int ret = 0;
>>>
>>> -     for (i = 0; i < rec->size; i++) {
>>> +     for (i = 0; i < (long) rec->size; i++) {
>>
>> i is a loop counter and array index so we can lose this cast by
>> changing i to size_t
> 
> Ok, but I'm going to change the type of i to usize and stuff it inside
> the loop i.e. for (usize i = 0; ...
> 
>> Thanks
>>
>> Phillip


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-28 19:34     ` Ezekiel Newren
  2025-07-28 19:52       ` Phillip Wood
@ 2025-07-28 20:00       ` Collin Funk
  1 sibling, 0 replies; 198+ messages in thread
From: Collin Funk @ 2025-07-28 20:00 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

Ezekiel Newren <ezekielnewren@gmail.com> writes:

> Rust defines char as 32 bits. C treats char as signed 8 bits. What git
> really means by char* is treat everything like a byte string, and u8
> is how raw bytes are handled in Rust.

Minor correction, but the C standard leaves the signedness of 'char' up
to the implementation. Portable code must be written to assume a plain
'char' can be signed or unsigned.

Using the test program below:

    #include <stdio.h>
    #define TYPE_SIGNED(t) (! ((t) 0 < (t) -1))
    int
    main (void)
    {
      printf ("%d\n", TYPE_SIGNED (char));
      return 0;
    }

On GNU/Linux x86_64:

    $ ./a.out 
    1

On GNU/Linux aarch64:

    $ ./a.out 
    0

Collin
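
For what it's worth, the same platform dependence is visible from Rust, where std::os::raw::c_char is an alias for i8 on some targets and u8 on others. Below is a small sketch, with hypothetical helper names, of how the Rust side can check the signedness and still treat such a buffer as raw bytes without copying, which is presumably what the u8 typedef is meant to express.

```rust
use std::os::raw::c_char;

/// Reports whether the target's plain C `char` is signed, mirroring the
/// TYPE_SIGNED test above. c_char is i8 or u8 depending on the target.
pub fn c_char_is_signed() -> bool {
    (c_char::MIN as i16) < 0
}

/// Views a buffer of C chars as raw bytes without copying anything.
pub fn as_bytes(chars: &[c_char]) -> &[u8] {
    // SAFETY: c_char is either i8 or u8; both have the same size and
    // alignment as u8, and every bit pattern is a valid u8.
    unsafe { std::slice::from_raw_parts(chars.as_ptr() as *const u8, chars.len()) }
}
```

Either way the bytes come out the same, so code written against u8 does not have to care which choice the C compiler made.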

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-28 19:52       ` Phillip Wood
@ 2025-07-28 20:14         ` Ezekiel Newren
  2025-07-31 14:20           ` Phillip Wood
  2025-07-28 20:53         ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-28 20:14 UTC (permalink / raw)
  To: phillip.wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

On Mon, Jul 28, 2025 at 1:52 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 28/07/2025 20:34, Ezekiel Newren wrote:
> > On Fri, Jul 18, 2025 at 7:35 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
> >>> From: Ezekiel Newren <ezekielnewren@gmail.com>
> >>>
> >>> A few commits ago, we added definitions for Rust primitive types,
> >>> to facilitate interoperability between C and Rust. Switch a
> >>> few variables to use these types. Which, for now, will
> >>> require adding some casts.
> >>
> >> How necessary is it to change 'char' to 'u8' so long as the rust and C
> >> sides both use a type that is the same size? Also what's the advantage
> >> of using these typedefs rather than the normal C types like uint8_t ?
> >
> > Rust defines char as 32 bits. C treats char as signed 8 bits. What git
> > really means by char* is treat everything like a byte string, and u8
> > is how raw bytes are handled in Rust.
>
> Right - we need to use u8 on the rust side but I'm trying to understand
> why we need to change the type on the C side and why do we need typedefs
> like usize and u32 on the C side when we already have size_t and uint32_t?

Ah, I misunderstood the scope of your question. I could not fit an
example of why this design pattern made sense into this patch series,
so I'll explain with an example here:

If C defines a struct like below then it's obvious how to translate
that into rust for ffi purposes. It also makes it clear that this C
struct is expressly for the purpose of C <-> Rust interoperability.
struct some_struct {
    u8* ptr;
    usize length;
    u64 counter;
};

This is how that C struct needs to be defined in Rust so that it can
interoperate with C, and making C use the Rust types reduces the
chance of copy paste, and primitive type definition mismatch errors.
#[repr(C)]
pub struct some_struct {
    ptr: *mut u8,
    length: usize,
    counter: u64,
}

The Rust function would look like:
#[no_mangle]
unsafe extern "C" fn do_something(data: *mut some_struct) {...}

And C would have a forward declaration like:
extern void do_something(struct some_struct *data);

void some_c_function() {
    struct some_struct x;
    do_something(&x);
}
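
To make the Rust half of that example concrete, here is a rough sketch of what a body for do_something() could look like. The struct is renamed SomeStruct to follow Rust naming conventions, the newline-counting behaviour is invented purely for illustration, and the #[no_mangle] attribute from above is omitted so the fragment compiles on its own.

```rust
/// Rust-side mirror of the C struct; #[repr(C)] makes the field order
/// and padding follow the C layout rules.
#[repr(C)]
pub struct SomeStruct {
    pub ptr: *mut u8,
    pub length: usize,
    pub counter: u64,
}

/// Hypothetical body: counts the newline bytes in the buffer and stores
/// the result in `counter`.
///
/// # Safety
/// `data` must be null or point to a valid SomeStruct whose `ptr` is
/// null or valid for reads of `length` bytes.
pub unsafe extern "C" fn do_something(data: *mut SomeStruct) {
    // Reject a null struct pointer instead of dereferencing it.
    let s = match unsafe { data.as_mut() } {
        Some(s) => s,
        None => return,
    };
    // from_raw_parts requires a non-null pointer even for empty slices.
    let bytes: &[u8] = if s.ptr.is_null() || s.length == 0 {
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(s.ptr, s.length) }
    };
    s.counter = bytes.iter().filter(|&&b| b == b'\n').count() as u64;
}
```

Because both sides spell the fields as u8, usize and u64, the two definitions can be compared line by line when checking that they agree.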

> Thanks
>
> Phillip

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-28 19:52       ` Phillip Wood
  2025-07-28 20:14         ` Ezekiel Newren
@ 2025-07-28 20:53         ` Junio C Hamano
  1 sibling, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-07-28 20:53 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Ezekiel Newren, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, brian m. carlson, Taylor Blau

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 28/07/2025 20:34, Ezekiel Newren wrote:
>> On Fri, Jul 18, 2025 at 7:35 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>> On 17/07/2025 21:32, Ezekiel Newren via GitGitGadget wrote:
>>>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>>>
>>>> A few commits ago, we added definitions for Rust primitive types,
>>>> to facilitate interoperability between C and Rust. Switch a
>>>> few variables to use these types. Which, for now, will
>>>> require adding some casts.
>>>
>>> How necessary is it to change 'char' to 'u8' so long as the rust and C
>>> sides both use a type that is the same size? Also what's the advantage
>>> of using these typedefs rather than the normal C types like uint8_t ?
>> Rust defines char as 32 bits. C treats char as signed 8 bits. What
>> git
>> really means by char* is treat everything like a byte string, and u8
>> is how raw bytes are handled in Rust.
>
> Right - we need to use u8 on the rust side but I'm trying to
> understand why we need to change the type on the C side and why do we
> need typedefs like usize and u32 on the C side when we already have
> size_t and uint32_t?

Or uint8_t?  Ah, eh, that is "unsigned char" so it would be
redundant, I guess?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-28 20:14         ` Ezekiel Newren
@ 2025-07-31 14:20           ` Phillip Wood
  2025-07-31 20:58             ` Ezekiel Newren
  0 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-07-31 14:20 UTC (permalink / raw)
  To: Ezekiel Newren, phillip.wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

On 28/07/2025 21:14, Ezekiel Newren wrote:
> On Mon, Jul 28, 2025 at 1:52 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> 
> Ah, I misunderstood the scope of your question. I could not fit an
> example of why this design pattern made sense into this patch series,
> so I'll explain with an example here:
> 
> If C defines a struct like below then it's obvious how to translate
> that into rust for ffi purposes. It also makes it clear that this C
> struct is expressly for the purpose of C <-> Rust interoperability.
> struct some_struct {
>      u8* ptr;
>      usize length;
>      u64 counter;
> };
> 
> This is how that C struct needs to be defined in Rust so that it can
> interoperate with C, and making C use the Rust types reduces the
> chance of copy paste, and primitive type definition mismatch errors.
> #[repr(C)]
> pub struct some_struct {
>      ptr: *mut u8,
>      length: usize,
>      counter: u64,
> };

How is the pointer, length pair used in rust? Normally one would use a 
slice so do we have to construct a slice every time we want to use the 
data in this struct, or do we copy the data in this struct into an 
idiomatic struct with a slice member? If we end up copying there doesn't 
seem much point in changing all the types in the C struct as we can 
define a rust struct using *c_char, c_long etc. to interface with the C 
code and convert them to an appropriate rust type when we copy the data 
to the idiomatic version that is then used by the rest of the rust code. 
I can see the value of the typedefs for documenting C<->rust interop if 
the same struct is used by both but if we end up copying data on the 
rust side I'm not so sure.

Thanks

Phillip


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-31 14:20           ` Phillip Wood
@ 2025-07-31 20:58             ` Ezekiel Newren
  2025-08-01  9:14               ` Phillip Wood
  0 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-31 20:58 UTC (permalink / raw)
  To: phillip.wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

On Thu, Jul 31, 2025 at 8:20 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 28/07/2025 21:14, Ezekiel Newren wrote:
> > On Mon, Jul 28, 2025 at 1:52 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >
> > Ah, I misunderstood the scope of your question. I could not fit an
> > example of why this design pattern made sense into this patch series,
> > so I'll explain with an example here:
> >
> > If C defines a struct like below then it's obvious how to translate
> > that into rust for ffi purposes. It also makes it clear that this C
> > struct is expressly for the purpose of C <-> Rust interoperability.
> > struct some_struct {
> >      u8* ptr;
> >      usize length;
> >      u64 counter;
> > };
> >
> > This is how that C struct needs to be defined in Rust so that it can
> > interoperate with C, and making C use the Rust types reduces the
> > chance of copy paste, and primitive type definition mismatch errors.
> > #[repr(C)]
> > pub struct some_struct {
> >      ptr: *mut u8,
> >      length: usize,
> >      counter: u64,
> > };
>
> How is the pointer, length pair used in rust? Normally one would use a
> slice so do we have to construct a slice every time we want to use the
> data in this struct, or do we copy the data in this struct into an
> idiomatic struct with a slice member? If we end up copying there doesn't
> seem much point in changing all the types in the C struct as we can
> define a rust struct using *c_char, c_long etc. to interface with the C
> code and convert them to an appropriate rust type when we copy the data
> to the idiomatic version that is then used by the rest of the rust code.
> I can see the value of the typedefs for documenting C<->rust interop if
> the same struct is used by both but if we end up copying data on the
> rust side I'm not so sure.
>
> Thanks
>
> Phillip

Passing pointer + length from C to Rust does not incur a memory copy
overhead. Take a look at rust/xdiff/src/lib.rs which has the following
Rust function defined:

#[no_mangle]
unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
    let slice = std::slice::from_raw_parts(ptr, size);
    xxhash_rust::xxh3::xxh3_64(slice)
}

Creating a slice tells the compiler what assumptions it can make about
that memory. On the C side in xdiff/xprepare.c:

extern u64 xxh3_64(u8 const* ptr, usize size);

and then it's called like this in that same file:

rec->ha = xxh3_64(rec->ptr, rec->size);

I really wanted to show my ivec type that made passing an
interoperable vector type between C and Rust easy and fast, but this
patch series is already getting very long.
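
As a rough sketch of how the slice could be built once at the boundary and then used by safe code, assuming a hypothetical RecordView type that is not part of this series (and is not the ivec type mentioned above): the view borrows the bytes C allocated, so only a pointer and a length are stored and nothing is copied.

```rust
/// Borrowed, idiomatic view of one record. The bytes stay where C put
/// them; the struct only holds a slice pointing at that memory.
pub struct RecordView<'a> {
    pub data: &'a [u8],
}

impl<'a> RecordView<'a> {
    /// Builds the view from a C pointer and length.
    ///
    /// # Safety
    /// `ptr` must be null or valid for reads of `size` bytes for the
    /// whole of `'a`, and the memory must not be written during `'a`.
    pub unsafe fn from_raw(ptr: *const u8, size: usize) -> Self {
        // from_raw_parts requires a non-null pointer even when size is 0.
        let data: &'a [u8] = if ptr.is_null() || size == 0 {
            &[]
        } else {
            unsafe { std::slice::from_raw_parts(ptr, size) }
        };
        RecordView { data }
    }

    /// Safe code from here on: counts leading spaces and tabs.
    pub fn leading_blanks(&self) -> usize {
        self.data
            .iter()
            .take_while(|&&b| b == b' ' || b == b'\t')
            .count()
    }
}
```

The unsafe part is confined to from_raw(); everything that works with the view afterwards is ordinary bounds-checked slice code.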

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-18 19:00   ` Junio C Hamano
@ 2025-07-31 21:13     ` Ezekiel Newren
  2025-08-02  7:53       ` Matthias Aßhauer
  0 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-07-31 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren

On Fri, Jul 18, 2025 at 1:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +extern u64 xxh3_64(u8 const* ptr, usize size);
> > +
> > +
> >  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
> >                          xdlclassifier_t *cf, xdfile_t *xdf) {
> >       unsigned long *ha;
> > @@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> >
> >       xdl_parse_lines(mf, narec, xdf);
> >
> > +     if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
> > +             for (usize i = 0; i < (usize) xdf->nrec; i++) {
> > +                     xrecord_t *rec = xdf->recs[i];
> > +                     rec->ha = xxh3_64(rec->ptr, rec->size);
> > +             }
> > +     } else {
> > +             for (usize i = 0; i < (usize) xdf->nrec; i++) {
> > +                     xrecord_t *rec = xdf->recs[i];
> > +                     char const* dump = (char const*) rec->ptr;
> > +                     rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
> > +             }
> > +     }
>
> As a technology demonstration and proof of concept patch, this is
> very nice, but to be upstreamed for real, we'd want a variant of
> xxhash that can work with the contents with whitespace squashed to
> be usable with various whitespace ignoring modes of operation.  When
> that happens, and when the result turns out to be more performant,
> we can lose the xdl_hash_record() and require only the xxhash, which
> would be great.
>
> And that variant of xxhash that understands whitespace squashing can
> of course be written in Rust as a part of this effort when the
> series loses its RFC status.  At the same time, those who want to
> use our xdiff code in third-party software (like libgit2 and vim)
> may want to reimplement it in C in their copy.
>
> Thanks.

What is the git precedent for replacement code that is easier to read
and maintain while also being more secure, but is slower? I think
hashing with whitespace handling in Rust might fall in that category.

As far as I can tell the Rust code for dealing with whitespace is
going to be slower than the C code because xdiff used a hash algorithm
(DJB2a) that can operate 1 byte at a time and combined hashing with
determining the length. Xxhash requires that the length be known
beforehand and the memory to be contiguous or to hash it in chunks.
Hashing 1 byte at a time with Xxhash is VERY slow since it's just
copying to an internal buffer until a full block is ready.
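
To illustrate the difference, here is a simplified sketch of DJB2a written in Rust. It is not a port of xdl_hash_record(): the function names are made up, and only one whitespace mode (ignore all ASCII whitespace) is modelled. The point is that a byte-at-a-time hash can skip whitespace in the same pass, whereas a whole-buffer hash such as xxh3_64() would first need the normalized bytes written out contiguously or fed in chunks.

```rust
/// Plain DJB2a: h = (h * 33) ^ byte, starting from 5381.
pub fn djb2a(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Same hash, but whitespace bytes are dropped while hashing, so no
/// normalized copy of the line is ever materialized.
pub fn djb2a_ignore_whitespace(line: &[u8]) -> u64 {
    line.iter()
        .filter(|b| !b.is_ascii_whitespace())
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}
```

Two lines that differ only in whitespace then hash to the same value under djb2a_ignore_whitespace(), which is the property the whitespace-ignoring diff modes rely on.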

On a broader note. How do I show the mailing list the changes that
I've made to this branch/patch series? I'm not sure what the proper
procedure is or even how to do it. What commands would I run, or web
browser steps would I take to show my newest commits?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 1/7] xdiff: introduce rust
  2025-07-28 19:11         ` Ezekiel Newren
@ 2025-07-31 22:37           ` brian m. carlson
  0 siblings, 0 replies; 198+ messages in thread
From: brian m. carlson @ 2025-07-31 22:37 UTC (permalink / raw)
  To: Ezekiel Newren; +Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren

On 2025-07-28 at 19:11:34, Ezekiel Newren wrote:
> I like having the Cargo.lock file to figure out why a build worked on
> one system, but not another. After talking with Elijah I've decided
> that a good solution would be to add Cargo.lock to .gitignore and
> change the github workflows to ensure that Cargo.lock is preserved for
> all builds. We should also add a comment to Cargo.toml stating that
> any build or test issues should include the Cargo.lock that was
> generated when asking for help. What does the community think of this
> solution?

That sounds like a good solution.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly
  2025-07-31 20:58             ` Ezekiel Newren
@ 2025-08-01  9:14               ` Phillip Wood
  0 siblings, 0 replies; 198+ messages in thread
From: Phillip Wood @ 2025-08-01  9:14 UTC (permalink / raw)
  To: Ezekiel Newren, phillip.wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau

Hi Ezekiel

On 31/07/2025 21:58, Ezekiel Newren wrote:
> On Thu, Jul 31, 2025 at 8:20 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> On 28/07/2025 21:14, Ezekiel Newren wrote:
>>> On Mon, Jul 28, 2025 at 1:52 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>>
>>> Ah, I misunderstood the scope of your question. I could not fit an
>>> example of why this design pattern made sense into this patch series,
>>> so I'll explain with an example here:
>>>
>>> If C defines a struct like below then it's obvious how to translate
>>> that into rust for ffi purposes. It also makes it clear that this C
>>> struct is expressly for the purpose of C <-> Rust interoperability.
>>> struct some_struct {
>>>       u8* ptr;
>>>       usize length;
>>>       u64 counter;
>>> };
>>>
>>> This is how that C struct needs to be defined in Rust so that it can
>>> interoperate with C, and making C use the Rust types reduces the
>>> chance of copy paste, and primitive type definition mismatch errors.
>>> #[repr(C)]
>>> pub struct some_struct {
>>>       ptr: *mut u8,
>>>       length: usize,
>>>       counter: u64,
>>> };
>>
>> How is the pointer, length pair used in rust? Normally one would use a
>> slice so do we have to construct a slice every time we want to use the
>> data in this struct, or do we copy the data in this struct into an
>> idiomatic struct with a slice member? If we end up copying there doesn't
>> seem much point in changing all the types in the C struct as we can
>> define a rust struct using *c_char, c_long etc. to interface with the C
>> code and convert them to an appropriate rust type when we copy the data
>> to the idiomatic version that is then used by the rest of the rust code.
>> I can see the value of the typedefs for documenting C<->rust interop if
>> the same struct is used by both but if we end up copying data on the
>> rust side I'm not so sure.
>>
>> Thanks
>>
>> Phillip
> 
> Passing pointer + length from c to Rust does not incur a memory copy
> overhead. Take a look at rust/xdiff/src/lib.rs which has the following
> rust function defined:
> 
> #[no_mangle]
> unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
>      let slice = std::slice::from_raw_parts(ptr, size);
>      xxhash_rust::xxh3::xxh3_64(slice)
> }
I'm afraid I don't find this simple unsafe function example very 
illuminating. I'm trying to understand how we are going to use a struct 
containing a pointer, length pair in code that is more complex than 
this. For example if we implement an entire diff algorithm in rust are 
we going to call std::slice::from_raw_parts() every time we want to 
access a string passed from C? If we're doing that I assume we'd impl a 
safe method on the struct that wraps std::slice::from_raw_parts(). If 
that's the case the method can easily access a field that has type 
*c_char and we don't have to sprinkle casts throughout our C code.

For example (ignoring lifetimes)
#[repr(C)]
pub struct SomeStruct {
     ptr: *const std::ffi::c_char,
     len: usize,
     // more members
}

impl SomeStruct {
     fn get_line(&self) -> &[u8] {
         // The only place that touches the raw pointer
         unsafe {
             std::slice::from_raw_parts(self.ptr as *const u8, self.len)
         }
     }
}

On the other hand if at the interface between rust and C, we create a 
slice that we can pass to the rest of the rust code then we also don't 
need to change the C type as there is a single place in the rust code 
where we convert from c_char when we create the slice.
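
A minimal sketch of that second option, under stated assumptions: the
names CLine, as_bytes() and count_leading_spaces() are hypothetical and
do not come from the series. The raw pair is turned into a slice in one
place, and everything downstream is ordinary safe Rust that never sees
c_char.

```rust
use std::ffi::c_char;

/// Hypothetical FFI struct; the field names are illustrative only.
#[repr(C)]
pub struct CLine {
    ptr: *const c_char,
    len: usize,
}

impl CLine {
    /// The single place where the raw pair becomes a slice.
    ///
    /// # Safety
    /// `ptr` must point to `len` readable bytes that stay valid and
    /// unmodified for as long as the returned slice is in use.
    unsafe fn as_bytes(&self) -> &[u8] {
        if self.len == 0 {
            // Avoid handing a possibly-null pointer to from_raw_parts.
            return &[];
        }
        unsafe { std::slice::from_raw_parts(self.ptr as *const u8, self.len) }
    }
}

/// Ordinary safe Rust: it only ever sees a byte slice.
fn count_leading_spaces(line: &[u8]) -> usize {
    line.iter().take_while(|&&b| b == b' ').count()
}

fn main() {
    let text = b"  indented\n";
    // Stand-in for a struct that C would normally fill in.
    let raw = CLine { ptr: text.as_ptr().cast(), len: text.len() };
    let bytes = unsafe { raw.as_bytes() };
    println!("{}", count_leading_spaces(bytes));
}
```

With this shape the cast from c_char happens once, inside as_bytes(),
so the C struct could keep its existing field types.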

The casts on the C side are pretty invasive. At least casting from char 
to u8 is not going to break anything. The long -> usize and long -> u64 
changes and their associated casts are going to need some careful review 
but in the long run I think the C code also benefits from using those types.

> Creating a slice tells the compiler what assumptions it can make about
> that memory. On the C side in xdiff/xprepare.c:
> 
> extern u64 xxh3_64(u8 const* ptr, usize size);
> 
> and then it's called like this in that same file:
> 
> rec->ha = xxh3_64(rec->ptr, rec->size);
> 
> I really wanted to show my ivec type that made passing an
> interoperable vector type between C and Rust easy and fast, but this
> patch series is already getting very long.
That sounds interesting

Thanks

Phillip


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-31 21:13     ` Ezekiel Newren
@ 2025-08-02  7:53       ` Matthias Aßhauer
  0 siblings, 0 replies; 198+ messages in thread
From: Matthias Aßhauer @ 2025-08-02  7:53 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: Junio C Hamano, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren

On Thu, 31 Jul 2025, Ezekiel Newren wrote:

> On Fri, Jul 18, 2025 at 1:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> +extern u64 xxh3_64(u8 const* ptr, usize size);
>>> +
>>> +
>>>  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
>>>                          xdlclassifier_t *cf, xdfile_t *xdf) {
>>>       unsigned long *ha;
>>> @@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>>>
>>>       xdl_parse_lines(mf, narec, xdf);
>>>
>>> +     if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
>>> +             for (usize i = 0; i < (usize) xdf->nrec; i++) {
>>> +                     xrecord_t *rec = xdf->recs[i];
>>> +                     rec->ha = xxh3_64(rec->ptr, rec->size);
>>> +             }
>>> +     } else {
>>> +             for (usize i = 0; i < (usize) xdf->nrec; i++) {
>>> +                     xrecord_t *rec = xdf->recs[i];
>>> +                     char const* dump = (char const*) rec->ptr;
>>> +                     rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
>>> +             }
>>> +     }
>>
>> As a technology demonstration and proof of concept patch, this is
>> very nice, but to be upstreamed for real, we'd want a variant of
>> xxhash that can work with the contents with whitespace squashed to
>> be usable with various whitespace ignoring modes of operation.  When
>> that happens, and when the result turns out to be more performant,
>> we can lose the xdl_hash_record() and require only the xxhash, which
>> would be great.
>>
>> And that variant of xxhash that understands whitespace squashing can
>> of course be written in Rust as a part of this effort when the
>> series loses its RFC status.  At the same time, those who want to
>> use our xdiff code in third-party software (like libgit2 and vim)
>> may want to reimplement it in C in their copy.
>>
>> Thanks.
>
> What is the git precedent for replacement code that is easier to read
> and maintain while also being more secure, but is slower? I think
> hashing with whitespace handling in Rust might fall in that category.
>
> As far as I can tell the Rust code for dealing with whitespace is
> going to be slower than the C code because xdiff used a hash algorithm
> (DJB2a) that can operate 1 byte at a time and combined hashing with
> determining the length. Xxhash requires that the length be known
> beforehand and the memory to be contiguous or to hash it in chunks.
> Hashing 1 byte at a time with Xxhash is VERY slow since it's just
> copying to an internal buffer until a full block is ready.
>
> On a broader note. How do I show the mailing list the changes that
> I've made to this branch/patch series? I'm not sure what the proper
> procedure is or even how to do it. What commands would I run, or web
> browser steps would I take to show my newest commits?
>

Since you've used GitGitGadget for the original submission of this patch 
series, the easiest way is to force push your updated commits to your PR 
branch (xdiff_rust_speedup) and comment "/submit" on the PR again.

Alternatively you can send a version 2 of your patch series using git 
format-patch and git send-email, but that is a few more manual steps. 
MyFirstContribution.adoc has a detailed section about this process, 
including a part about a v2. [1]

[1] 
https://github.com/git/git/blob/master/Documentation/MyFirstContribution.adoc#sending-patches-with-git-send-email

Best regards

Matthias

^ permalink raw reply	[flat|nested] 198+ messages in thread

* [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
                   ` (11 preceding siblings ...)
  2025-07-19 21:53 ` Johannes Schindelin
@ 2025-08-15  1:22 ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 01/17] doc: add a policy for using Rust brian m. carlson via GitGitGadget
                     ` (19 more replies)
  12 siblings, 20 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren

Changes in this second round of this RFC:

 * Now builds and passes tests on all platforms (example run:
   https://github.com/ezekielnewren/git/actions/runs/16974821401). Special
   thanks to Johannes Schindelin for patches to things for Windows and
   linux32.
 * Includes brian’s rust-support documentation as the new 1st patch
 * Removed the Cargo.lock file from version control, but now CI will upload
   these files as build artifacts so we can audit dependencies and notice
   if/when new dependencies cause issues.
 * Added handling of whitespace flags. These are slower; see below

Particular points I’m interested in feedback on:

 * Code style: Should we adopt a Rust code style of some sort? Perhaps have
   the code always be formatted by rustfmt in its default configuration?
 * Rust version: We are not using the same Rust version on all platforms in
   CI; 32-bit builds and Windows builds require a newer Rust version to
   successfully build.
 * Performance with whitespace flags: I originally intended to leave out the
   whitespace handling because I knew it was slower, and I think it’d be
   difficult to fix that until more of xdiff is converted to Rust, but since
   Junio requested it, I have an implementation here. I made sure that both
   --ignore-cr-at-eol (the default on Windows) and no whitespace flags remain
   fast (or are faster), but other whitespace flag combinations are
   currently significantly slower. Are folks okay with merging this, since
   it’ll only affect those who specify some special flag? Or should we
   perhaps only convert the code path with no whitespace flags for now, or
   do something else?
 * Types/Aliases/Data passing: The discussion between Phillip and me on
   types/translation; this longer series has examples with e.g.
   xdl_line_hash() and line_hash() which might give us more to talk about,
   though I think we can’t fully address that discussion until we have a
   fuller example, which I’m planning for a later series with an IVec type.
 * There was lots of feedback on v1, and I might have missed some; let me
   know if there’s something I need to still look at.
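
To illustrate the whitespace point above, here is a rough sketch of one
plausible reason the two cases differ in cost. It is not code from the
series: strip_cr_at_eol() and squash_whitespace() are hypothetical
helpers, and lines are assumed to be passed without their trailing
newline. Dropping a trailing CR only shortens a slice that is already
contiguous, so a block-oriented hash can consume it directly, whereas
removing whitespace from the middle of a line needs a new buffer (or
chunked hashing).

```rust
/// Drop a single trailing CR, if present. No copy: the result is a
/// sub-slice of the input. `line` is assumed to exclude the newline.
fn strip_cr_at_eol(line: &[u8]) -> &[u8] {
    match line {
        [rest @ .., b'\r'] => rest,
        _ => line,
    }
}

/// Remove every ASCII whitespace byte. The surviving bytes are no longer
/// contiguous in the input, so this sketch copies them into a new buffer.
fn squash_whitespace(line: &[u8]) -> Vec<u8> {
    line.iter()
        .copied()
        .filter(|b| !b.is_ascii_whitespace())
        .collect()
}

fn main() {
    let line = b"int  x = 1;\r";
    println!("{:?}", strip_cr_at_eol(line));
    println!("{:?}", squash_whitespace(line));
}
```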

==Original cover letter==

This series accelerates xdiff by 5-19%.

It also introduces Rust as a hard dependency.

…and it doesn’t yet pass a couple of the github workflows; hints from
Windows experts, and opinions on ambiguous primitives would be appreciated
(see below).

This is just the beginning of many patches that I have to convert portions
of, maybe eventually all of, xdiff to Rust. While working on that
conversion, I found several ways to clarify the code, along with some
optimizations.

So...

This obviously raises the question of whether we are ready to accept a hard
dependency on Rust. Previous discussions on the mailing list and at Git
Merge 2024 have not answered that question. If not now, will we be willing
to accept such a hard dependency later? And what route do we want to take to
get there?

About the optimizations in this series:

1. xdiff currently uses DJB2a for hashing (even though it is not explicitly named as such). This is an older hashing algorithm, and modern alternatives are superior. I chose xxhash because it’s faster, more collision resistant, and designed to be a standard. Other hash algorithms like aHash, MurMurHash, SipHash, and Fnv1a were considered, but my local testing made me feel like xxhash was the best choice for usage in xdiff.

2. In support of switching to xxhash, parsing and hashing were split into separate steps. And it turns out that memchr() is faster for parsing than character-by-character iteration.
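
The split described in item 2 can be sketched roughly as follows. This
is an illustration only, not the code from the series: split_lines()
and hash_lines() are hypothetical names, slice::position() stands in
for memchr(), and a DJB2a-style hash stands in for xxhash so that the
example needs no external crate.

```rust
/// DJB2a-style hash of one line: multiply by 33, then XOR in each byte.
/// Used here only as a stand-in for the real hash function.
fn djb2a(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Step 1: find line boundaries only. Each returned slice excludes its
/// trailing newline. `position` plays the role memchr() plays in C.
fn split_lines(mut buf: &[u8]) -> Vec<&[u8]> {
    let mut lines = Vec::new();
    while !buf.is_empty() {
        match buf.iter().position(|&b| b == b'\n') {
            Some(i) => {
                lines.push(&buf[..i]);
                buf = &buf[i + 1..];
            }
            None => {
                // Final line without a terminating newline.
                lines.push(buf);
                break;
            }
        }
    }
    lines
}

/// Step 2: hash each line now that its length is known up front.
fn hash_lines(lines: &[&[u8]]) -> Vec<u64> {
    lines.iter().map(|l| djb2a(l)).collect()
}

fn main() {
    let lines = split_lines(b"a\nbc\n");
    for (line, h) in lines.iter().zip(hash_lines(&lines)) {
        println!("{:?} -> {}", line, h);
    }
}
```

Because step 2 receives whole slices, the hash in djb2a() could be
swapped for one that requires the length in advance without touching
the parsing step.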


About the workflow builds/tests that aren’t working with this series:

1. Windows fails to build. I don’t know which rust toolchain is even correct for this or if multiple are needed.  Example failed build: https://github.com/git/git/actions/runs/16353209191

2. I386/ubuntu:focal will build, but fails the tests. The kernel reports the bitness as 64 despite the container being 32. I believe the issue is that C uses ambiguous primitives (which differ in size between platforms). The new code should use unambiguous primitives from Rust (u32, u64, etc.) rather than perpetuating ambiguous primitive types.  Since the current xdiff API hardcodes the ambiguous types, though, those places will need to be migrated to unambiguous primitives. Much of the C code needs a slight refactor to be compatible with the Rust FFI and usually requires converting ambiguous to unambiguous types. What does this community think of this approach?
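
As a small illustration of the ambiguous-primitive problem in item 2:
the width of C's long depends on the target, while u64 and usize are
defined by Rust. The sketch below is an assumption-laden example, not
code from the series; len_from_c() is a hypothetical helper showing a
checked conversion at the boundary.

```rust
use std::ffi::c_long;
use std::mem::size_of;

/// Convert a C `long` length to `usize`, rejecting negative values and
/// values that do not fit on the current target.
fn len_from_c(len: c_long) -> Option<usize> {
    usize::try_from(len).ok()
}

fn main() {
    // c_long is 4 bytes on some targets and 8 bytes on others;
    // u64 is 8 bytes everywhere.
    println!("c_long: {} bytes", size_of::<c_long>());
    println!("u64:    {} bytes", size_of::<u64>());
    println!("usize:  {} bytes", size_of::<usize>());
    println!("{:?} {:?}", len_from_c(42), len_from_c(-1));
}
```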


My brother (Elijah, cc’ed) has been guiding and reviewing my work here.

Ezekiel Newren (13):
  xdiff: introduce rust
  xdiff/xprepare: remove superfluous forward declarations
  xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  xdiff: make fields of xrecord_t Rust friendly
  xdiff: separate parsing lines from hashing them
  xdiff: conditionally use Rust's implementation of xxhash
  github workflows: install rust
  github workflows: define rust versions and targets in the same place
  github workflows: upload Cargo.lock
  xdiff: implement a white space iterator in Rust
  xdiff: create line_hash() and line_equal()
  xdiff: optimize case where --ignore-cr-at-eol is the only whitespace
    flag
  xdiff: use rust's version of whitespace processing

Johannes Schindelin (3):
  Do support Windows again after requiring Rust
  win+Meson: allow for xdiff to be compiled with MSVC
  win+Meson: do allow linking with the Rust-built xdiff

brian m. carlson (1):
  doc: add a policy for using Rust

 .github/workflows/main.yml                    |  61 +++
 .gitignore                                    |   3 +
 Documentation/Makefile                        |   1 +
 Documentation/technical/platform-support.adoc |   2 +
 Documentation/technical/rust-support.adoc     | 119 ++++++
 Makefile                                      |  60 ++-
 build_rust.sh                                 |  59 +++
 ci/install-dependencies.sh                    |  14 +-
 ci/install-rust.sh                            |  37 ++
 ci/lib.sh                                     |   1 +
 ci/make-test-artifacts.sh                     |   7 +
 ci/run-build-and-tests.sh                     |  12 +
 config.mak.uname                              |   9 +
 git-compat-util.h                             |  17 +
 meson.build                                   |  48 ++-
 rust/Cargo.toml                               |   6 +
 rust/interop/Cargo.toml                       |  14 +
 rust/interop/src/lib.rs                       |   0
 rust/xdiff/Cargo.toml                         |  16 +
 rust/xdiff/src/lib.rs                         |  30 ++
 rust/xdiff/src/xutils.rs                      | 354 ++++++++++++++++++
 xdiff-interface.c                             |   4 +-
 xdiff/xdiffi.c                                |   8 +-
 xdiff/xemit.c                                 |   2 +-
 xdiff/xmerge.c                                |  10 +-
 xdiff/xpatience.c                             |   2 +-
 xdiff/xprepare.c                              | 219 +++++------
 xdiff/xtypes.h                                |   9 +-
 xdiff/xutils.c                                | 162 +-------
 xdiff/xutils.h                                |   4 +-
 30 files changed, 961 insertions(+), 329 deletions(-)
 create mode 100644 Documentation/technical/rust-support.adoc
 create mode 100755 build_rust.sh
 create mode 100755 ci/install-rust.sh
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs
 create mode 100644 rust/xdiff/src/xutils.rs


base-commit: 16bd9f20a403117f2e0d9bcda6c6e621d3763e77
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1980%2Fezekielnewren%2Fxdiff_rust_speedup-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1980/ezekielnewren/xdiff_rust_speedup-v2
Pull-Request: https://github.com/git/git/pull/1980

Range-diff vs v1:

  -:  ----------- >  1:  75dfb40ead3 doc: add a policy for using Rust
  1:  2a1f4be13df !  2:  7709e5eddba xdiff: introduce rust
     @@ Commit message
      
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
     + ## .gitignore ##
     +@@ .gitignore: Release/
     + /contrib/buildsystems/out
     + /contrib/libgit-rs/target
     + /contrib/libgit-sys/target
     ++/.idea/
     ++/rust/target/
     ++/rust/Cargo.lock
     +
       ## Makefile ##
      @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
       
     @@ meson.build: version_def_h = custom_target(
         link_with: static_library('git',
           sources: libgit_sources,
      
     - ## rust/Cargo.lock (new) ##
     -@@
     -+# This file is automatically @generated by Cargo.
     -+# It is not intended for manual editing.
     -+version = 4
     -+
     -+[[package]]
     -+name = "interop"
     -+version = "0.1.0"
     -+
     -+[[package]]
     -+name = "xdiff"
     -+version = "0.1.0"
     -+dependencies = [
     -+ "interop",
     -+]
     -
       ## rust/Cargo.toml (new) ##
      @@
      +[workspace]
  2:  b0b744b9acf =  3:  56c96d35554 xdiff/xprepare: remove superfluous forward declarations
  3:  cc05150d6e1 =  4:  ebec3689dce xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  4:  6df9f50a8f4 !  5:  769d1a5b9d2 xdiff: make fields of xrecord_t Rust friendly
     @@ Commit message
          few variables to use these types. Which, for now, will
          require adding some casts.
      
     +    Also change xdlclass_t::ha to be u64 to match xrecord_t::ha, as
     +    pointed out by Johannes.
     +
     +    Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## xdiff/xdiffi.c ##
     @@ xdiff/xpatience.c: static void insert_record(xpparam_t const *xpp, int line, str
       	if (map->last) {
      
       ## xdiff/xprepare.c ##
     +@@
     + 
     + typedef struct s_xdlclass {
     + 	struct s_xdlclass *next;
     +-	unsigned long ha;
     ++	u64 ha;
     + 	char const *line;
     + 	long size;
     + 	long idx;
      @@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
       	char const *line;
       	xdlclass_t *rcrec;
  5:  2db30cc739e =  6:  87623495994 xdiff: separate parsing lines from hashing them
  6:  5a959c9bdad !  7:  d74fd4ef67a xdiff: conditionally use Rust's implementation of xxhash
     @@ Commit message
      
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
     - ## rust/Cargo.lock ##
     -@@ rust/Cargo.lock: name = "xdiff"
     - version = "0.1.0"
     - dependencies = [
     -  "interop",
     -+ "xxhash-rust",
     - ]
     -+
     -+[[package]]
     -+name = "xxhash-rust"
     -+version = "0.8.15"
     -+source = "registry+https://github.com/rust-lang/crates.io-index"
     -+checksum = "fdd20c5420375476fbd4394763288da7eb0cc0b8c11deed431a91562af7335d3"
     -
       ## rust/xdiff/Cargo.toml ##
      @@ rust/xdiff/Cargo.toml: crate-type = ["staticlib", "rlib"]
       
  7:  0de0867ab44 !  8:  7dc241e6682 github_workflows: install rust
     @@ Metadata
      Author: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Commit message ##
     -    github_workflows: install rust
     +    github workflows: install rust
      
          Since we have introduced rust, it needs to be installed for the
          continuous integration build targets. Create an install script
     @@ .github/workflows/main.yml: on: [push, pull_request]
       # If more than one workflow run is triggered for the very same commit hash
       # (which happens when multiple branches pointing to the same commit), only
      
     - ## .gitignore ##
     -@@ .gitignore: Release/
     - /contrib/buildsystems/out
     - /contrib/libgit-rs/target
     - /contrib/libgit-sys/target
     -+/rust/target
     -
       ## Makefile ##
      @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
       
  -:  ----------- >  9:  96041a10d54 Do support Windows again after requiring Rust
  -:  ----------- > 10:  1194de3f39c win+Meson: allow for xdiff to be compiled with MSVC
  -:  ----------- > 11:  382067a09e3 win+Meson: do allow linking with the Rust-built xdiff
  -:  ----------- > 12:  fffdb326710 github workflows: define rust versions and targets in the same place
  -:  ----------- > 13:  44784f0d672 github workflows: upload Cargo.lock
  -:  ----------- > 14:  f20efdff7aa xdiff: implement a white space iterator in Rust
  -:  ----------- > 15:  c8d41173274 xdiff: create line_hash() and line_equal()
  -:  ----------- > 16:  f7829c55871 xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag
  -:  ----------- > 17:  395609aff4b xdiff: use rust's version of whitespace processing

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 198+ messages in thread

* [PATCH v2 01/17] doc: add a policy for using Rust
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` brian m. carlson via GitGitGadget
  2025-08-15 17:03     ` Matthias Aßhauer
  2025-08-15  1:22   ` [PATCH v2 02/17] xdiff: introduce rust Ezekiel Newren via GitGitGadget
                     ` (18 subsequent siblings)
  19 siblings, 1 reply; 198+ messages in thread
From: brian m. carlson via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, brian m. carlson

From: "brian m. carlson" <sandals@crustytoothpaste.net>

Git has historically been written primarily in C, with some shell and
Perl.  However, C is not memory safe, which makes it more likely that
security vulnerabilities or other bugs will be introduced, and it is
also more verbose and less ergonomic than other, more modern languages.

One of the most common modern compiled languages which is easily
interoperable with C is Rust.  It is popular (the most admired language
on the 2024 Stack Overflow Developer Survey), efficient, portable, and
robust.

Introduce a document laying out the incremental introduction of Rust to
Git and provide a detailed rationale for doing so, including the points
above.  Propose a design for this approach that addresses the needs of
downstreams and distributors, as well as contributors.

Since we don't want to carry both a C and Rust version of code and want
to be able to add new features only in Rust, mention that Rust is a
required part of our platform support policy.

It should be noted that a recent discussion at the Berlin Git Merge
Contributor Summit found widespread support for the addition of Rust to
Git.  While of course not all contributors were represented, the
proposal appeared to have the support of a majority of active
contributors.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 Documentation/Makefile                        |   1 +
 Documentation/technical/platform-support.adoc |   2 +
 Documentation/technical/rust-support.adoc     | 119 ++++++++++++++++++
 3 files changed, 122 insertions(+)
 create mode 100644 Documentation/technical/rust-support.adoc

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b109d25e9c80..066b761c01b9 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -127,6 +127,7 @@ TECH_DOCS += technical/parallel-checkout
 TECH_DOCS += technical/partial-clone
 TECH_DOCS += technical/platform-support
 TECH_DOCS += technical/racy-git
+TECH_DOCS += technical/rust-support
 TECH_DOCS += technical/reftable
 TECH_DOCS += technical/scalar
 TECH_DOCS += technical/send-pack-pipeline
diff --git a/Documentation/technical/platform-support.adoc b/Documentation/technical/platform-support.adoc
index 0a2fb28d6277..42b04b186105 100644
--- a/Documentation/technical/platform-support.adoc
+++ b/Documentation/technical/platform-support.adoc
@@ -33,6 +33,8 @@ meet the following minimum requirements:
 
 * Has active security support (taking security releases of dependencies, etc)
 
+* Supports Rust and the toolchain version specified in link:rust-support.txt[].
+
 These requirements are a starting point, and not sufficient on their own for the
 Git community to be enthusiastic about supporting your platform. Maintainers of
 platforms which do meet these requirements can follow the steps below to make it
diff --git a/Documentation/technical/rust-support.adoc b/Documentation/technical/rust-support.adoc
new file mode 100644
index 000000000000..a63327ebc575
--- /dev/null
+++ b/Documentation/technical/rust-support.adoc
@@ -0,0 +1,119 @@
+Usage of Rust in Git
+====================
+
+Objective
+---------
+Introduce Rust into Git incrementally to improve security and maintainability.
+
+Background
+----------
+Git has historically been written primarily in C, with some portions in shell,
+Perl, or other languages.  At the time it was originally written, this was
+important for portability and was a logical choice for software development.
+
+:0: link:https://security.googleblog.com/2024/09/eliminating-memory-safety-vulnerabilities-Android.html
+:1: link:https://www.cisa.gov/resources-tools/resources/product-security-bad-practices
+
+However, as time has progressed, we've seen an increased concern with memory
+safety vulnerabilities and the development of newer languages, such as Rust,
+that substantially limit or eliminate this class of vulnerabilities.
+Development in a variety of projects has found that memory safety
+vulnerabilities constitute about 70% of vulnerabilities of software in
+languages that are not memory safe.  For instance, {0}[one survey of Android]
+found that memory safety vulnerabilities decreased from 76% to 24% over six
+years due to an increase in memory safe code.  Similarly, the U.S. government
+is {1}[proposing to classify development in memory unsafe languages as a
+"Product Security Bad Practice"].
+
+These risks are even more substantial when we consider the fact that Git is a
+network-facing service.  Many organizations run Git servers internally or use a
+cloud-based forge, and the risk of accidental exposure or compromise of user
+data is substantial.  It's important to ensure that Git, whether it's used
+locally or remotely, is robustly secure.
+
+In addition, C is a difficult language to write well and concisely.  While it
+is of course possible to do anything with C, it lacks built-in support for
+niceties found in modern languages, such as hash tables, generics, typed
+errors, and automatic destruction, and most modern languages offer shorter, more
+ergonomic syntax for expressing code.  This is valuable functionality that can
+allow Git to be developed more rapidly, more easily, by more developers of a
+variety of levels, and with more confidence in the correctness of the code.
+
+For these reasons, adding Rust to Git is a sensible and prudent move that will
+allow us to improve the quality of the code and potentially attract new developers.
+
+Goals
+-----
+1. Git continues to build, run, and pass tests on a wide variety of operating
+   systems and architectures.
+2. Transition from C to Rust is incremental; that is, code can be ported as it
+   is convenient and Git does not need to transition all at once.
+3. Git continues to support older operating systems in conformance with the
+   platform support policy.
+
+Non-Goals
+---------
+1. Support for every possible operating system and architecture.  Git already
+   has a platform support policy which defines what is supported and we already
+   exclude some operating systems for various reasons (e.g., lacking enough POSIX
+   tools to pass the test suite).
+2. Implementing C-only versions of Rust code or compiling a C-only Git.  This
+   would be difficult to maintain and would not offer the ergonomic benefits we
+   desire.
+
+Design
+------
+Git will adopt Rust incrementally.  This transition will start with the
+creation of a static library that can be linked into the existing Git binaries.
+At some point, we may wish to expose a dynamic library and compile the Git
+binaries themselves using Rust.  Using an incremental approach allows us to
+determine as we go along how to structure our code in the best way for the
+project and avoids the need to make hard, potentially disruptive, transitions
+caused by porting a binary wholesale from one language to another that might
+introduce bugs.
+
+We will use the `bindgen` and `cbindgen` crates for handling C-compatible
+bindings and the `rustix` crate for POSIX-compatible interfaces.  The `libc`
+crate, which is used by `rustix`, does not expose safe interfaces and does not
+handle differences between platforms, such as differing 64-bit `stat` call
+names, and so is less desirable as a target than `rustix`.  We may still choose
+to use it in some cases if `rustix` does not offer suitable interfaces.
+
+Rust upstream releases every six weeks and only supports the latest stable
+release.  While it is nice that upstream is active, we would like our software
+releases to have a lifespan exceeding six weeks.  To allow compiling our code
+on a variety of systems, we will support the version of Rust in Debian stable,
+plus, for a year after a new Debian stable is released, the version in Debian
+oldstable.
+
+This provides an approximately three-year lifespan of support for a Rust
+release and allows us to support a variety of operating systems and
+architectures, including those for which Rust upstream does not build binaries.
+Debian stable is the benchmark distribution used by many Rust projects when
+determining supported Rust versions, and it is an extremely portable and
+popular free software operating system that is available to the public at no
+charge, which makes it a sensible choice for us as well.
+
+We may change this policy if the Rust project issues long-term support releases
+or the Rust community and distributors agree on releases to target as if they
+were long-term support releases.
+
+This version support policy necessitates that we be very careful about the
+dependencies we include, since many Rust projects support only the latest
+stable version.  However, we typically have been careful about dependencies in
+the first place, so this should not be a major departure from existing policy,
+although it may be a change for some existing Rust developers.
+
+We will avoid including the `Cargo.lock` file in the repository and instead
+specify minimum dependency versions in the `Cargo.toml` file.  We want to allow
+people to use newer versions of dependencies if necessary to support newer
+platforms without needing to force upgrades of dependencies on all users, and
+it provides additional flexibility for distribution maintainers.
+
+We do not plan to support beta or nightly versions of the Rust compiler.  These
+versions may change rapidly, and parts of the toolchain in particular, such as
+Clippy, the lint tool, can produce false positives or add new warnings too
+frequently to be supportable by the project.  However, we do plan to support
+alternate compilers, such as the rustc_codegen_gcc backend and gccrs, when they
+are stable and support our desired release versions.  This will provide greater
+support for more operating systems and architectures.
-- 
gitgitgadget



* [PATCH v2 02/17] xdiff: introduce rust
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 01/17] doc: add a policy for using Rust brian m. carlson via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 03/17] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
                     ` (17 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Upcoming patches will accelerate and simplify xdiff, while also
porting parts of it to Rust. In preparation, add some stubs and set up
the Rust build. For now, it is easier to let cargo build the Rust code
and have make or meson merely link against the static library that
cargo builds. In line with ongoing libification efforts, use multiple
crates to allow more modularity on the Rust side. xdiff is the crate
that this series will focus on, but we also introduce the interop
crate for future patch series.

In order to facilitate interoperability between C and Rust, introduce
C definitions for Rust primitive types in git-compat-util.h.
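
To illustrate why such aliases help, here is a minimal sketch, not part
of the patch. It assumes a C11 compiler; struct example_record is a
hypothetical type, and the claim that usize is pointer-sized is an
assumption that holds on mainstream platforms but is not guaranteed by
the C standard. A Rust struct declared #[repr(C)] with fields of type
*const u8, usize and u64 in the same order would be expected to share
this layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Aliases mirroring the ones added to git-compat-util.h. */
typedef uint8_t  u8;
typedef uint64_t u64;
typedef size_t   usize;

/* Hypothetical record passed across the C/Rust boundary. */
struct example_record {
	u8 const *ptr;
	usize size;
	u64 ha;
};

/* Compile-time checks for the assumptions the Rust side would rely on. */
static_assert(sizeof(u8) == 1, "u8 must be one byte");
static_assert(sizeof(u64) == 8, "u64 must be eight bytes");
static_assert(sizeof(usize) == sizeof(void *),
	      "usize is assumed to be pointer-sized");
static_assert(offsetof(struct example_record, ptr) == 0,
	      "ptr must be the first field");
```

With the field types spelled identically on both sides, a mismatch shows
up as a failed static assertion at build time rather than as silent
memory corruption at run time.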

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .gitignore              |  3 +++
 Makefile                | 20 +++++++++++++++++++-
 git-compat-util.h       | 17 +++++++++++++++++
 meson.build             | 32 ++++++++++++++++++++++++++++++++
 rust/Cargo.toml         |  6 ++++++
 rust/interop/Cargo.toml | 14 ++++++++++++++
 rust/interop/src/lib.rs |  0
 rust/xdiff/Cargo.toml   | 15 +++++++++++++++
 rust/xdiff/src/lib.rs   |  0
 9 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs

diff --git a/.gitignore b/.gitignore
index 04c444404e4b..ff81e3580c4e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -254,3 +254,6 @@ Release/
 /contrib/buildsystems/out
 /contrib/libgit-rs/target
 /contrib/libgit-sys/target
+/.idea/
+/rust/target/
+/rust/Cargo.lock
diff --git a/Makefile b/Makefile
index 70d1543b6b86..db39e6e1c28e 100644
--- a/Makefile
+++ b/Makefile
@@ -919,6 +919,11 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+ifeq ($(DEBUG), 1)
+RUST_LIB = rust/target/debug/libxdiff.a
+else
+RUST_LIB = rust/target/release/libxdiff.a
+endif
 REFTABLE_LIB = reftable/libreftable.a
 
 GENERATED_H += command-list.h
@@ -1392,6 +1397,8 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 EXTLIBS =
 
+GITLIBS += $(RUST_LIB)
+
 GIT_USER_AGENT = git/$(GIT_VERSION)
 
 ifeq ($(wildcard sha1collisiondetection/lib/sha1.h),sha1collisiondetection/lib/sha1.h)
@@ -2925,6 +2932,14 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+.PHONY: $(RUST_LIB)
+$(RUST_LIB):
+ifeq ($(DEBUG), 1)
+	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose
+else
+	cd rust && RUSTFLAGS="-Aunused_imports -Adead_code" cargo build --verbose --release
+endif
+
 $(REFTABLE_LIB): $(REFTABLE_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
@@ -3756,7 +3771,10 @@ cocciclean:
 	$(RM) -r .build/contrib/coccinelle
 	$(RM) contrib/coccinelle/*.cocci.patch
 
-clean: profile-clean coverage-clean cocciclean
+rustclean:
+	cd rust && cargo clean
+
+clean: profile-clean coverage-clean cocciclean rustclean
 	$(RM) -r .build $(UNIT_TEST_BIN)
 	$(RM) GIT-TEST-SUITES
 	$(RM) po/git.pot po/git-core.pot
diff --git a/git-compat-util.h b/git-compat-util.h
index 4678e21c4cb8..82dc99764ac0 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -196,6 +196,23 @@ static inline int is_xplatform_dir_sep(int c)
 #include "compat/msvc.h"
 #endif
 
+/* rust types */
+typedef uint8_t   u8;
+typedef uint16_t  u16;
+typedef uint32_t  u32;
+typedef uint64_t  u64;
+
+typedef int8_t    i8;
+typedef int16_t   i16;
+typedef int32_t   i32;
+typedef int64_t   i64;
+
+typedef float     f32;
+typedef double    f64;
+
+typedef size_t    usize;
+typedef ptrdiff_t isize;
+
 /* used on Mac OS X */
 #ifdef PRECOMPOSE_UNICODE
 #include "compat/precompose_utf8.h"
diff --git a/meson.build b/meson.build
index 596f5ac7110e..2d8da17f6515 100644
--- a/meson.build
+++ b/meson.build
@@ -267,6 +267,36 @@ version_gen_environment.set('GIT_DATE', get_option('build_date'))
 version_gen_environment.set('GIT_USER_AGENT', get_option('user_agent'))
 version_gen_environment.set('GIT_VERSION', get_option('version'))
 
+if get_option('optimization') in ['2', '3', 's', 'z']
+  rust_target = 'release'
+  rust_args = ['--release']
+  rustflags = '-Aunused_imports -Adead_code'
+else
+  rust_target = 'debug'
+  rust_args = []
+  rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
+endif
+
+
+rust_leaf = custom_target('rust_leaf',
+  output: 'libxdiff.a',
+  build_by_default: true,
+  build_always_stale: true,
+  command: ['cargo', 'build',
+            '--manifest-path', meson.project_source_root() / 'rust/Cargo.toml'
+  ] + rust_args,
+  env: {
+    'RUSTFLAGS': rustflags,
+  },
+  install: false,
+)
+
+rust_xdiff_dep = declare_dependency(
+  link_args: ['-L' + meson.project_source_root() / 'rust/target' / rust_target, '-lxdiff'],
+#  include_directories: include_directories('xdiff/include'),  # Adjust if you expose headers
+)
+
+
 compiler = meson.get_compiler('c')
 
 libgit_sources = [
@@ -1677,6 +1707,8 @@ version_def_h = custom_target(
 )
 libgit_sources += version_def_h
 
+libgit_dependencies += rust_xdiff_dep
+
 libgit = declare_dependency(
   link_with: static_library('git',
     sources: libgit_sources,
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
new file mode 100644
index 000000000000..ed3d79d7f827
--- /dev/null
+++ b/rust/Cargo.toml
@@ -0,0 +1,6 @@
+[workspace]
+members = [
+    "xdiff",
+    "interop",
+]
+resolver = "2"
diff --git a/rust/interop/Cargo.toml b/rust/interop/Cargo.toml
new file mode 100644
index 000000000000..045e3b01cfad
--- /dev/null
+++ b/rust/interop/Cargo.toml
@@ -0,0 +1,14 @@
+[package]
+name = "interop"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "interop"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
diff --git a/rust/interop/src/lib.rs b/rust/interop/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
new file mode 100644
index 000000000000..eb7966aada64
--- /dev/null
+++ b/rust/xdiff/Cargo.toml
@@ -0,0 +1,15 @@
+[package]
+name = "xdiff"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "xdiff"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
+interop = { path = "../interop" }
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
-- 
gitgitgadget



* [PATCH v2 03/17] xdiff/xprepare: remove superfluous forward declarations
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 01/17] doc: add a policy for using Rust brian m. carlson via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 02/17] xdiff: introduce rust Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 04/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
                     ` (16 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Move xdl_prepare_env() later in the file to avoid the need
for forward declarations.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
 1 file changed, 50 insertions(+), 66 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2dde..a45c5ee208c8 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
 
 
 
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
-			   xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
 static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
 	cf->flags = flags;
 
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
 }
 
 
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
-		    xdfenv_t *xe) {
-	long enl1, enl2, sample;
-	xdlclassifier_t cf;
-
-	memset(&cf, 0, sizeof(cf));
-
-	/*
-	 * For histogram diff, we can afford a smaller sample size and
-	 * thus a poorer estimate of the number of lines, as the hash
-	 * table (rhash) won't be filled up/grown. The number of lines
-	 * (nrecs) will be updated correctly anyway by
-	 * xdl_prepare_ctx().
-	 */
-	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
-		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
-	enl1 = xdl_guess_lines(mf1, sample) + 1;
-	enl2 = xdl_guess_lines(mf2, sample) + 1;
-
-	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
-		return -1;
-
-	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
-	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
-	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf2);
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	xdl_free_classifier(&cf);
-
-	return 0;
-}
-
-
 void xdl_free_env(xdfenv_t *xe) {
 
 	xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
 
 	return 0;
 }
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		    xdfenv_t *xe) {
+	long enl1, enl2, sample;
+	xdlclassifier_t cf;
+
+	memset(&cf, 0, sizeof(cf));
+
+	/*
+	 * For histogram diff, we can afford a smaller sample size and
+	 * thus a poorer estimate of the number of lines, as the hash
+	 * table (rhash) won't be filled up/grown. The number of lines
+	 * (nrecs) will be updated correctly anyway by
+	 * xdl_prepare_ctx().
+	 */
+	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+	enl1 = xdl_guess_lines(mf1, sample) + 1;
+	enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+		return -1;
+
+	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+
+	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf2);
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+
+	xdl_free_classifier(&cf);
+
+	return 0;
+}
-- 
gitgitgadget



* [PATCH v2 04/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 03/17] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 05/17] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
                     ` (15 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 24 +++---------------------
 xdiff/xtypes.h   |  3 ---
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208c8..ad356281f939 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
 }
 
 
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
 	long hi;
 	char const *line;
 	xdlclass_t *rcrec;
@@ -126,23 +125,17 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 	rec->ha = (unsigned long) rcrec->idx;
 
-	hi = (long) XDL_HASHLONG(rec->ha, hbits);
-	rec->next = rhash[hi];
-	rhash[hi] = rec;
-
 	return 0;
 }
 
 
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	unsigned int hbits;
-	long nrec, hsize, bsize;
+	long nrec, bsize;
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
 	xrecord_t **recs;
-	xrecord_t **rhash;
 	unsigned long *ha;
 	char *rchg;
 	long *rindex;
@@ -150,7 +143,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	ha = NULL;
 	rindex = NULL;
 	rchg = NULL;
-	rhash = NULL;
 	recs = NULL;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -158,11 +150,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	if (!XDL_ALLOC_ARRAY(recs, narec))
 		goto abort;
 
-	hbits = xdl_hashbits((unsigned int) narec);
-	hsize = 1 << hbits;
-	if (!XDL_CALLOC_ARRAY(rhash, hsize))
-		goto abort;
-
 	nrec = 0;
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
 		for (top = blk + bsize; cur < top; ) {
@@ -176,7 +163,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
-			if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
 		}
 	}
@@ -194,8 +181,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdf->nrec = nrec;
 	xdf->recs = recs;
-	xdf->hbits = hbits;
-	xdf->rhash = rhash;
 	xdf->rchg = rchg + 1;
 	xdf->rindex = rindex;
 	xdf->nreff = 0;
@@ -209,7 +194,6 @@ abort:
 	xdl_free(ha);
 	xdl_free(rindex);
 	xdl_free(rchg);
-	xdl_free(rhash);
 	xdl_free(recs);
 	xdl_cha_free(&xdf->rcha);
 	return -1;
@@ -217,8 +201,6 @@ abort:
 
 
 static void xdl_free_ctx(xdfile_t *xdf) {
-
-	xdl_free(xdf->rhash);
 	xdl_free(xdf->rindex);
 	xdl_free(xdf->rchg - 1);
 	xdl_free(xdf->ha);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436efe..8b8467360ecf 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	struct s_xrecord *next;
 	char const *ptr;
 	long size;
 	unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
 typedef struct s_xdfile {
 	chastore_t rcha;
 	long nrec;
-	unsigned int hbits;
-	xrecord_t **rhash;
 	long dstart, dend;
 	xrecord_t **recs;
 	char *rchg;
-- 
gitgitgadget



* [PATCH v2 05/17] xdiff: make fields of xrecord_t Rust friendly
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (3 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 04/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 06/17] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
                     ` (14 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

A few commits ago, we added definitions for Rust primitive types
to facilitate interoperability between C and Rust. Switch a few
variables to use these types, which, for now, requires adding
some casts.

Also change xdlclass_t::ha to be u64 to match xrecord_t::ha, as
pointed out by Johannes.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xdiffi.c    |  8 ++++----
 xdiff/xemit.c     |  2 +-
 xdiff/xmerge.c    | 14 +++++++-------
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  |  8 ++++----
 xdiff/xtypes.h    |  6 +++---
 xdiff/xutils.c    |  4 ++--
 7 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfbea..3b364c61f671 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
 	long i;
 	int ret = 0;
 
-	for (i = 0; i < rec->size; i++) {
+	for (i = 0; i < (long) rec->size; i++) {
 		char c = rec->ptr[i];
 
 		if (!XDL_ISSPACE(c))
@@ -1005,11 +1005,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 		rec = &xe->xdf1.recs[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*) rec[i]->ptr, rec[i]->size, flags);
 
 		rec = &xe->xdf2.recs[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*)rec[i]->ptr, rec[i]->size, flags);
 
 		xch->ignore = ignore;
 	}
@@ -1020,7 +1020,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
 	size_t i;
 
 	for (i = 0; i < xpp->ignore_regex_nr; i++)
-		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+		if (!regexec_buf(xpp->ignore_regex[i], (const char*) rec->ptr, rec->size, 1,
 				 &regmatch, 0))
 			return 1;
 
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb4076..bbf7b7f8c862 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,7 +24,7 @@
 
 static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
 
-	*rec = xdf->recs[ri]->ptr;
+	*rec = (char const*) xdf->recs[ri]->ptr;
 
 	return xdf->recs[ri]->size;
 }
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b36..6fa6ea61a208 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t **rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
-			rec2[i]->ptr, rec2[i]->size, flags);
+		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
+			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
 		if (!result)
 			return -1;
 	}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch(rec1->ptr, rec1->size,
-			    rec2->ptr, rec2->size, flags);
+	return xdl_recmatch((char const*) rec1->ptr, rec1->size,
+			    (char const*) rec2->ptr, rec2->size, flags);
 }
 
 /*
@@ -383,10 +383,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		 */
 		t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
 		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
-			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
+			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - (u8 const*) t1.ptr;
 		t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
 		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
-			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - (u8 const*) t2.ptr;
 		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
 			return -1;
 		if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
 static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
-		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
+		if (line_contains_alnum((char const*) xe->xdf2.recs[i]->ptr,
 				xe->xdf2.recs[i]->size))
 			return 1;
 	return 0;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d1937..986a3a3f749a 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 		return;
 	map->entries[index].line1 = line;
 	map->entries[index].hash = record->ha;
-	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.recs[line - 1]->ptr);
 	if (!map->first)
 		map->first = map->entries + index;
 	if (map->last) {
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index ad356281f939..00cdf7d8a038 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,7 +32,7 @@
 
 typedef struct s_xdlclass {
 	struct s_xdlclass *next;
-	unsigned long ha;
+	u64 ha;
 	char const *line;
 	long size;
 	long idx;
@@ -96,12 +96,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	char const *line;
 	xdlclass_t *rcrec;
 
-	line = rec->ptr;
+	line = (char const*) rec->ptr;
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->ha == rec->ha &&
 				xdl_recmatch(rcrec->line, rcrec->size,
-					rec->ptr, rec->size, cf->flags))
+					(const char*) rec->ptr, rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -159,7 +159,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 				goto abort;
 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
 				goto abort;
-			crec->ptr = prev;
+			crec->ptr = (u8 const*) prev;
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360ecf..6e5f67ebf380 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,9 +39,9 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	char const *ptr;
-	long size;
-	unsigned long ha;
+	u8 const* ptr;
+	usize size;
+	u64 ha;
 } xrecord_t;
 
 typedef struct s_xdfile {
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87c0..10e4f20b7c31 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -418,10 +418,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 
 	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
 	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
-		diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
+		diff_env->xdf1.recs[line1 + count1 - 2]->size - (u8 const*) subfile1.ptr;
 	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
 	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
-		diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+		diff_env->xdf2.recs[line2 + count2 - 2]->size - (u8 const*) subfile2.ptr;
 	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
 		return -1;
 
-- 
gitgitgadget



* [PATCH v2 06/17] xdiff: separate parsing lines from hashing them
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (4 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 05/17] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 07/17] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
                     ` (13 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

We want to use xxhash for faster hashing. To facilitate that and to
simplify the code, separate the concerns of parsing and hashing into
discrete steps. This makes swapping the hash function much easier.
Since xdl_hash_record() both parses and hashes lines, this requires
some slight code restructuring.
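
The two-step shape can be sketched in isolation as follows. This is an
illustration under stated assumptions, not the code in this patch:
struct line_span, split_lines(), hash_lines() and sum_bytes() are
hypothetical names, and sum_bytes() is only a placeholder for a real
hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one line, including its trailing '\n' if present. */
struct line_span {
	const unsigned char *ptr;
	size_t size;
};

typedef uint64_t (*line_hash_fn)(const unsigned char *data, size_t len);

/*
 * Step 1: find line boundaries with memchr() and store up to cap spans.
 * Returns the number of lines found, which may exceed cap.
 */
static size_t split_lines(const unsigned char *buf, size_t len,
			  struct line_span *out, size_t cap)
{
	size_t n = 0;

	assert(buf || !len);
	while (len > 0) {
		const unsigned char *nl = memchr(buf, '\n', len);
		size_t size = nl ? (size_t)(nl - buf) + 1 : len;

		if (n < cap) {
			out[n].ptr = buf;
			out[n].size = size;
		}
		n++;
		buf += size;
		len -= size;
	}
	return n;
}

/* Step 2: hash already-parsed lines with whichever function is passed in. */
static void hash_lines(const struct line_span *lines, size_t n,
		       uint64_t *hashes, line_hash_fn fn)
{
	for (size_t i = 0; i < n; i++)
		hashes[i] = fn(lines[i].ptr, lines[i].size);
}

/* Placeholder hash for the example only: the sum of the bytes. */
static uint64_t sum_bytes(const unsigned char *data, size_t len)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += data[i];
	return sum;
}
```

Because hash_lines() only sees finished spans, replacing sum_bytes()
with another function changes one argument and leaves the parsing loop
untouched.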

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 75 ++++++++++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 00cdf7d8a038..031c1752cc1a 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -129,13 +129,39 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 }
 
 
+static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
+	u8 const* ptr = (u8 const*) mf->ptr;
+	usize len = (usize) mf->size;
+
+	xdf->recs = NULL;
+	xdf->nrec = 0;
+	XDL_ALLOC_ARRAY(xdf->recs, narec);
+
+	while (len > 0) {
+		xrecord_t *rec = NULL;
+		usize length;
+		u8 const* result = memchr(ptr, '\n', len);
+		if (result) {
+			length = result - ptr + 1;
+		} else {
+			length = len;
+		}
+		if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
+			die("XDL_ALLOC_GROW failed");
+		rec = xdl_cha_alloc(&xdf->rcha);
+		rec->ptr = ptr;
+		rec->size = length;
+		rec->ha = 0;
+		xdf->recs[xdf->nrec++] = rec;
+		ptr += length;
+		len -= length;
+	}
+
+}
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	long nrec, bsize;
-	unsigned long hav;
-	char const *blk, *cur, *top, *prev;
-	xrecord_t *crec;
-	xrecord_t **recs;
 	unsigned long *ha;
 	char *rchg;
 	long *rindex;
@@ -143,50 +169,37 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	ha = NULL;
 	rindex = NULL;
 	rchg = NULL;
-	recs = NULL;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
 		goto abort;
-	if (!XDL_ALLOC_ARRAY(recs, narec))
-		goto abort;
 
-	nrec = 0;
-	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
-		for (top = blk + bsize; cur < top; ) {
-			prev = cur;
-			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
-				goto abort;
-			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
-				goto abort;
-			crec->ptr = (u8 const*) prev;
-			crec->size = (long) (cur - prev);
-			crec->ha = hav;
-			recs[nrec++] = crec;
-			if (xdl_classify_record(pass, cf, crec) < 0)
-				goto abort;
-		}
+	xdl_parse_lines(mf, narec, xdf);
+
+	for (usize i = 0; i < (usize) xdf->nrec; i++) {
+		xrecord_t *rec = xdf->recs[i];
+		char const* dump = (char const*) rec->ptr;
+		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
+		xdl_classify_record(pass, cf, rec);
 	}
 
-	if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+
+	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
 		goto abort;
 
 	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
 	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
-		if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(rindex, xdf->nrec + 1))
 			goto abort;
-		if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(ha, xdf->nrec + 1))
 			goto abort;
 	}
 
-	xdf->nrec = nrec;
-	xdf->recs = recs;
 	xdf->rchg = rchg + 1;
 	xdf->rindex = rindex;
 	xdf->nreff = 0;
 	xdf->ha = ha;
 	xdf->dstart = 0;
-	xdf->dend = nrec - 1;
+	xdf->dend = xdf->nrec - 1;
 
 	return 0;
 
@@ -194,7 +207,7 @@ abort:
 	xdl_free(ha);
 	xdl_free(rindex);
 	xdl_free(rchg);
-	xdl_free(recs);
+	xdl_free(xdf->recs);
 	xdl_cha_free(&xdf->rcha);
 	return -1;
 }
-- 
gitgitgadget



* [PATCH v2 07/17] xdiff: conditionally use Rust's implementation of xxhash
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (5 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 06/17] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 08/17] github workflows: install rust Ezekiel Newren via GitGitGadget
                     ` (12 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

When no whitespace flags are present, use xxhash for faster
hashing; otherwise use DJB2a (which is what xdiff has been
using all along).
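
The selection rule can be sketched in Rust roughly as follows. This is
only an illustration under stated assumptions: `hash_lines`, `djb2a` and
the `fast` parameter are hypothetical names, the mask value is a
placeholder rather than xdiff's real XDF_WHITESPACE_FLAGS, and the fast
hash is passed in as a plain function so the sketch needs no external
crate (the patch itself calls xxhash_rust::xxh3::xxh3_64). Unlike
xdl_hash_record(), the `djb2a` here does no whitespace folding.

```rust
/// Placeholder mask standing in for xdiff's XDF_WHITESPACE_FLAGS (value assumed).
const WHITESPACE_FLAGS: u64 = 0b1110;

/// DJB2a over raw bytes: start at 5381, then h = (h * 33) ^ byte.
/// Unlike xdl_hash_record(), this sketch does no whitespace folding.
fn djb2a(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Hash every line, taking the fast path only when no whitespace flag is set.
fn hash_lines(lines: &[&[u8]], flags: u64, fast: fn(&[u8]) -> u64) -> Vec<u64> {
    let hash: fn(&[u8]) -> u64 = if flags & WHITESPACE_FLAGS == 0 {
        fast
    } else {
        djb2a
    };
    lines.iter().map(|&l| hash(l)).collect()
}
```

Choosing the hash function once, outside the loop, mirrors the patch,
which tests the flags once and then runs one of two loops.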

The benchmark below compares this series with v2.49.0
(built in build_release/ and build_v2.49.0/ respectively),
running log commands on the Linux kernel repository on 3 different
machines.

$ BASE=/path/to/git/root

    // laptop
    // CPU: 6-core Intel Core i7-8750H (-MT MCP-) speed/min/max: 726/800/4100 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.419 s ±  0.166 s    [User: 10.097 s, System: 0.284 s]
      Range (min … max):   10.215 s … 10.680 s    10 runs

    Benchmark 2: /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.980 s ±  0.137 s    [User: 10.633 s, System: 0.308 s]
      Range (min … max):   10.791 s … 11.178 s    10 runs

    Summary
      /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.05 ± 0.02 times faster than /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // desktop
    // CPU: 8-core Intel Core i7-9700 (-MCP-) speed/min/max: 800/800/4700 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      6.823 s ±  0.020 s    [User: 6.624 s, System: 0.180 s]
      Range (min … max):    6.801 s …  6.858 s    10 runs

    Benchmark 2: /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      8.151 s ±  0.024 s    [User: 7.928 s, System: 0.198 s]
      Range (min … max):    8.105 s …  8.184 s    10 runs

    Summary
      /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.19 ± 0.01 times faster than /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // router
    // CPU: dual core Intel Celeron 3965U (-MCP-) speed/min/max: 1300/400/2200 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     21.209 s ±  0.054 s    [User: 20.341 s, System: 0.605 s]
      Range (min … max):   21.135 s … 21.309 s    10 runs

    Benchmark 2: /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     23.683 s ±  0.060 s    [User: 22.735 s, System: 0.672 s]
      Range (min … max):   23.566 s … 23.751 s    10 runs

    Summary
      /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.12 ± 0.00 times faster than /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
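
For reference, the "times faster" figures in the summaries are the ratio
of the two reported means. A minimal sketch of that arithmetic, where
`speedup` is a hypothetical helper and not part of hyperfine or of this
series:

```rust
/// Ratio of baseline mean to candidate mean; a value above 1.0 means the
/// candidate (here: build_release) is faster than the baseline (v2.49.0).
fn speedup(baseline_secs: f64, candidate_secs: f64) -> f64 {
    baseline_secs / candidate_secs
}
```

With the laptop means above, speedup(10.980, 10.419) is roughly 1.054,
which hyperfine rounds to 1.05.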

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 rust/xdiff/Cargo.toml |  1 +
 rust/xdiff/src/lib.rs |  7 +++++++
 xdiff/xprepare.c      | 19 +++++++++++++++++--
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
index eb7966aada64..1516e829db18 100644
--- a/rust/xdiff/Cargo.toml
+++ b/rust/xdiff/Cargo.toml
@@ -13,3 +13,4 @@ crate-type = ["staticlib", "rlib"]
 
 [dependencies]
 interop = { path = "../interop" }
+xxhash-rust = { version = "0.8.15", features = ["xxh3"] }
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index e69de29bb2d1..96975975a1ba 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -0,0 +1,7 @@
+
+
+#[no_mangle]
+unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
+    let slice = std::slice::from_raw_parts(ptr, size);
+    xxhash_rust::xxh3::xxh3_64(slice)
+}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 031c1752cc1a..c0463bacd94b 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -160,6 +160,9 @@ static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
 }
 
 
+extern u64 xxh3_64(u8 const* ptr, usize size);
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
 	unsigned long *ha;
@@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdl_parse_lines(mf, narec, xdf);
 
+	if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
+		for (usize i = 0; i < (usize) xdf->nrec; i++) {
+			xrecord_t *rec = xdf->recs[i];
+			rec->ha = xxh3_64(rec->ptr, rec->size);
+		}
+	} else {
+		for (usize i = 0; i < (usize) xdf->nrec; i++) {
+			xrecord_t *rec = xdf->recs[i];
+			char const* dump = (char const*) rec->ptr;
+			rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
+		}
+	}
+
 	for (usize i = 0; i < (usize) xdf->nrec; i++) {
 		xrecord_t *rec = xdf->recs[i];
-		char const* dump = (char const*) rec->ptr;
-		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
 		xdl_classify_record(pass, cf, rec);
 	}
 
 
+
 	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
 		goto abort;
 
-- 
gitgitgadget



* [PATCH v2 08/17] github workflows: install rust
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (6 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 07/17] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 09/17] Do support Windows again after requiring Rust Johannes Schindelin via GitGitGadget
                     ` (11 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Since we have introduced Rust, it needs to be installed for the
continuous integration build targets. Create an install script
(ci/install-rust.sh) that needs to be run as the same user that builds
git. Because of the limitations of meson, also create build_rust.sh,
which makes it easy to centralize how Rust is built between meson and
make.

There are 2 interesting decisions worth calling out in this commit:

* The 'output' field of custom_target() does not allow specifying a
  file nested inside the build directory. Thus create build_rust.sh to
  build Rust with all of its parameters and then move libxdiff.a to
  the root of the build directory.

* Install curl, to facilitate the rustup install script.
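
To make the first point concrete, here is a rough sketch of the source
and destination paths that build_rust.sh works with. `artifact_paths`
and its parameter names are hypothetical, chosen only for illustration;
the layout (rust/target/<mode>/lib<crate>.a moved to the root of the
build directory) follows the script in this patch and assumes no Cargo
target triple is set.

```rust
use std::path::{Path, PathBuf};

/// Return (where Cargo leaves the static library, where the build expects it).
fn artifact_paths(git_root: &Path, build_dir: &Path, mode: &str, krate: &str) -> (PathBuf, PathBuf) {
    let lib = format!("lib{}.a", krate);
    // Cargo output: <git_root>/rust/target/<mode>/lib<crate>.a
    let src = git_root.join("rust").join("target").join(mode).join(&lib);
    // Destination: directly in the build directory, never nested below it.
    let dst = build_dir.join(&lib);
    (src, dst)
}
```

Keeping the destination flat is what lets custom_target() name the file
in its 'output' field.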

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml |  1 +
 Makefile                   | 46 +++++++++++++++++++----------
 build_rust.sh              | 59 ++++++++++++++++++++++++++++++++++++++
 ci/install-dependencies.sh | 14 ++++-----
 ci/install-rust.sh         | 33 +++++++++++++++++++++
 ci/lib.sh                  |  8 ++++++
 ci/make-test-artifacts.sh  |  7 +++++
 ci/run-build-and-tests.sh  | 10 +++++++
 meson.build                | 40 +++++++++++---------------
 9 files changed, 172 insertions(+), 46 deletions(-)
 create mode 100755 build_rust.sh
 create mode 100644 ci/install-rust.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 7dbf9f7f123c..8aac18a6ba45 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -4,6 +4,7 @@ on: [push, pull_request]
 
 env:
   DEVELOPER: 1
+  RUST_VERSION: 1.87.0
 
 # If more than one workflow run is triggered for the very same commit hash
 # (which happens when multiple branches pointing to the same commit), only
diff --git a/Makefile b/Makefile
index db39e6e1c28e..e659b6eefe82 100644
--- a/Makefile
+++ b/Makefile
@@ -919,11 +919,29 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+
+EXTLIBS =
+
 ifeq ($(DEBUG), 1)
-RUST_LIB = rust/target/debug/libxdiff.a
+  RUST_BUILD_MODE = debug
 else
-RUST_LIB = rust/target/release/libxdiff.a
+  RUST_BUILD_MODE = release
+endif
+
+RUST_TARGET_DIR = rust/target/$(RUST_BUILD_MODE)
+RUST_FLAGS_FOR_C = -L$(RUST_TARGET_DIR)
+
+.PHONY: compile_rust
+compile_rust:
+	./build_rust.sh . $(RUST_BUILD_MODE) xdiff
+
+EXTLIBS += ./$(RUST_TARGET_DIR)/libxdiff.a
+
+UNAME_S := $(shell uname -s)
+ifeq ($(UNAME_S),Linux)
+  EXTLIBS += -ldl
 endif
+
 REFTABLE_LIB = reftable/libreftable.a
 
 GENERATED_H += command-list.h
@@ -1395,9 +1413,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 
 # xdiff and reftable libs may in turn depend on what is in libgit.a
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
-EXTLIBS =
 
-GITLIBS += $(RUST_LIB)
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
 
@@ -2548,7 +2564,7 @@ git.sp git.s git.o: EXTRA_CPPFLAGS = \
 	'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
 	'-DGIT_INFO_PATH="$(infodir_relative_SQ)"'
 
-git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
+git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -2898,17 +2914,17 @@ headless-git.o: compat/win32/headless.c GIT-CFLAGS
 headless-git$X: headless-git.o git.res GIT-LDFLAGS
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) $(ALL_LDFLAGS) -mwindows -o $@ $< git.res
 
-git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
+git-%$X: %.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
-git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS)
+git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(IMAP_SEND_LDFLAGS) $(LIBS)
 
-git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
+git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(LIBS)
-git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
+git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
@@ -2918,11 +2934,11 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
 	ln -s $< $@ 2>/dev/null || \
 	cp $< $@
 
-$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS)
+$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
-scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS)
+scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -3309,7 +3325,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) $(UNIT_TEST_DIR)/test-lib.o
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3929,13 +3945,13 @@ FUZZ_CXXFLAGS ?= $(ALL_CFLAGS)
 .PHONY: fuzz-all
 fuzz-all: $(FUZZ_PROGRAMS)
 
-$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS
+$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(QUIET_LINK)$(FUZZ_CXX) $(FUZZ_CXXFLAGS) -o $@ $(ALL_LDFLAGS) \
 		-Wl,--allow-multiple-definition \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS) $(LIB_FUZZING_ENGINE)
 
 $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_OBJS) \
-	$(GITLIBS) GIT-LDFLAGS
+	$(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
@@ -3954,7 +3970,7 @@ $(UNIT_TEST_DIR)/clar.suite: $(UNIT_TEST_DIR)/clar-decls.h $(UNIT_TEST_DIR)/gene
 $(UNIT_TEST_DIR)/clar/clar.o: $(UNIT_TEST_DIR)/clar.suite
 $(CLAR_TEST_OBJS): $(UNIT_TEST_DIR)/clar-decls.h
 $(CLAR_TEST_OBJS): EXTRA_CPPFLAGS = -I$(UNIT_TEST_DIR)
-$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS
+$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
diff --git a/build_rust.sh b/build_rust.sh
new file mode 100755
index 000000000000..4c12135cd205
--- /dev/null
+++ b/build_rust.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+if [ -z "$CARGO_HOME" ]; then
+  export CARGO_HOME=$HOME/.cargo
+  echo >&2 "::warning:: CARGO_HOME is not set"
+fi
+echo "CARGO_HOME=$CARGO_HOME"
+
+rustc -vV
+cargo --version
+
+dir_git_root=${0%/*}
+dir_build=$1
+rust_target=$2
+crate=$3
+
+dir_rust=$dir_git_root/rust
+
+if [ "$dir_git_root" = "" ]; then
+  echo "did not specify the directory for the root of git"
+  exit 1
+fi
+
+if [ "$dir_build" = "" ]; then
+  echo "did not specify the build directory"
+  exit 1
+fi
+
+if [ "$rust_target" = "" ]; then
+  echo "did not specify the rust_target"
+  exit 1
+fi
+
+if [ "$rust_target" = "release" ]; then
+  rust_args="--release"
+  export RUSTFLAGS='-Aunused_imports -Adead_code'
+elif [ "$rust_target" = "debug" ]; then
+  rust_args=""
+  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
+else
+  echo "illegal rust_target value $rust_target"
+  exit 1
+fi
+
+cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
+
+libfile="lib${crate}.a"
+dst=$dir_build/$libfile
+
+if [ "$dir_git_root" != "$dir_build" ]; then
+  src=$dir_rust/target/$rust_target/$libfile
+  if [ ! -f $src ]; then
+    echo >&2 "::error:: cannot find path of static library"
+    exit 5
+  fi
+
+  rm $dst 2>/dev/null
+  mv $src $dst
+fi
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d061a4729339..7801075821ba 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -24,14 +24,14 @@ fi
 
 case "$distro" in
 alpine-*)
-	apk add --update shadow sudo meson ninja-build gcc libc-dev curl-dev openssl-dev expat-dev gettext \
+	apk add --update shadow sudo meson ninja-build gcc libc-dev curl curl-dev openssl-dev expat-dev gettext \
 		zlib-ng-dev pcre2-dev python3 musl-libintl perl-utils ncurses \
 		apache2 apache2-http2 apache2-proxy apache2-ssl apache2-webdav apr-util-dbd_sqlite3 \
 		bash cvs gnupg perl-cgi perl-dbd-sqlite perl-io-tty >/dev/null
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
@@ -55,8 +55,8 @@ ubuntu-*|i386/ubuntu-*|debian-*)
 	sudo apt-get -q update
 	sudo apt-get -q -y install \
 		$LANGUAGES apache2 cvs cvsps git gnupg $SVN \
-		make libssl-dev libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
-		tcl tk gettext zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
+		make libssl-dev curl libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
+		tcl tk gettext zlib1g zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
 		libemail-valid-perl libio-pty-perl libio-socket-ssl-perl libnet-smtp-ssl-perl libdbd-sqlite3-perl libcgi-pm-perl \
 		libsecret-1-dev libpcre2-dev meson ninja-build pkg-config \
 		${CC_PACKAGE:-${CC:-gcc}} $PYTHON_PACKAGE
@@ -121,13 +121,13 @@ ClangFormat)
 	;;
 StaticAnalysis)
 	sudo apt-get -q update
-	sudo apt-get -q -y install coccinelle libcurl4-openssl-dev libssl-dev \
+	sudo apt-get -q -y install coccinelle curl libcurl4-openssl-dev libssl-dev \
 		libexpat-dev gettext make
 	;;
 sparse)
 	sudo apt-get -q update -q
-	sudo apt-get -q -y install libssl-dev libcurl4-openssl-dev \
-		libexpat-dev gettext zlib1g-dev sparse
+	sudo apt-get -q -y install libssl-dev curl libcurl4-openssl-dev \
+		libexpat-dev gettext zlib1g zlib1g-dev sparse
 	;;
 Documentation)
 	sudo apt-get -q update
diff --git a/ci/install-rust.sh b/ci/install-rust.sh
new file mode 100644
index 000000000000..141ceddb17cf
--- /dev/null
+++ b/ci/install-rust.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+if [ "$(id -u)" -eq 0 ]; then
+  echo >&2 "::warning:: installing rust as root"
+fi
+
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::warning:: CARGO_HOME is not set"
+  export CARGO_HOME=$HOME/.cargo
+fi
+
+export RUSTUP_HOME=$CARGO_HOME
+
+if [ "$RUST_VERSION" = "" ]; then
+  echo >&2 "::error:: RUST_VERSION is not set"
+  exit 2
+fi
+
+## install rustup
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
+if [ ! -f $CARGO_HOME/env ]; then
+  echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
+fi
+## install a specific version of rust
+if [ "$BITNESS" = "32" ]; then
+  $CARGO_HOME/bin/rustup set default-host i686-unknown-linux-gnu || exit $?
+  $CARGO_HOME/bin/rustup install $RUST_VERSION || exit $?
+  $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
+else
+  $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
+fi
+
+. $CARGO_HOME/env
diff --git a/ci/lib.sh b/ci/lib.sh
index f561884d4016..ad0e49a68dcb 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,13 @@
 # Library of functions shared by all CI scripts
 
+
+export BITNESS="64"
+if command -v getconf >/dev/null && [ "$(getconf LONG_BIT 2>/dev/null)" = "32" ]; then
+  export BITNESS="32"
+fi
+echo "BITNESS=$BITNESS"
+
+
 if test true = "$GITHUB_ACTIONS"
 then
 	begin_group () {
diff --git a/ci/make-test-artifacts.sh b/ci/make-test-artifacts.sh
index 74141af0cc74..56aa7efb1d53 100755
--- a/ci/make-test-artifacts.sh
+++ b/ci/make-test-artifacts.sh
@@ -7,6 +7,13 @@ mkdir -p "$1" # in case ci/lib.sh decides to quit early
 
 . ${0%/*}/lib.sh
 
+## install rust per user rather than system wide
+. ${0%/*}/install-rust.sh
+
 group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
 
+if [ -d "$CARGO_HOME" ]; then
+  rm -rf $CARGO_HOME
+fi
+
 check_unignored_build_artifacts
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 01823fd0f140..dbab1cb2f936 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,6 +5,12 @@
 
 . ${0%/*}/lib.sh
 
+## install rust per user rather than system wide
+. ${0%/*}/install-rust.sh
+
+rustc -vV
+cargo --version || exit $?
+
 run_tests=t
 
 case "$jobname" in
@@ -72,5 +78,9 @@ case "$jobname" in
 	;;
 esac
 
+if [ -d "$CARGO_HOME" ]; then
+  rm -rf $CARGO_HOME
+fi
+
 check_unignored_build_artifacts
 save_good_tree
diff --git a/meson.build b/meson.build
index 2d8da17f6515..047d7e5b6630 100644
--- a/meson.build
+++ b/meson.build
@@ -277,26 +277,17 @@ else
   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
 endif
 
-
-rust_leaf = custom_target('rust_leaf',
+rust_build_xdiff = custom_target('rust_build_xdiff',
   output: 'libxdiff.a',
   build_by_default: true,
   build_always_stale: true,
-  command: ['cargo', 'build',
-            '--manifest-path', meson.project_source_root() / 'rust/Cargo.toml'
-  ] + rust_args,
-  env: {
-    'RUSTFLAGS': rustflags,
-  },
+  command: [
+    meson.project_source_root() / 'build_rust.sh',
+    meson.current_build_dir(), rust_target, 'xdiff',
+  ],
   install: false,
 )
 
-rust_xdiff_dep = declare_dependency(
-  link_args: ['-L' + meson.project_source_root() / 'rust/target' / rust_target, '-lxdiff'],
-#  include_directories: include_directories('xdiff/include'),  # Adjust if you expose headers
-)
-
-
 compiler = meson.get_compiler('c')
 
 libgit_sources = [
@@ -1707,17 +1698,18 @@ version_def_h = custom_target(
 )
 libgit_sources += version_def_h
 
-libgit_dependencies += rust_xdiff_dep
-
 libgit = declare_dependency(
-  link_with: static_library('git',
-    sources: libgit_sources,
-    c_args: libgit_c_args + [
-      '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
-    ],
-    dependencies: libgit_dependencies,
-    include_directories: libgit_include_directories,
-  ),
+  link_with: [
+    static_library('git',
+      sources: libgit_sources,
+      c_args: libgit_c_args + [
+        '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
+      ],
+      dependencies: libgit_dependencies,
+      include_directories: libgit_include_directories,
+    ),
+    rust_build_xdiff,
+  ],
   compile_args: libgit_c_args,
   dependencies: libgit_dependencies,
   include_directories: libgit_include_directories,
-- 
gitgitgadget



* [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (7 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 08/17] github workflows: install rust Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Johannes Schindelin via GitGitGadget
  2025-08-15 17:12     ` Matthias Aßhauer
  2025-08-15  1:22   ` [PATCH v2 10/17] win+Meson: allow for xdiff to be compiled with MSVC Johannes Schindelin via GitGitGadget
                     ` (10 subsequent siblings)
  19 siblings, 1 reply; 198+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

By default, Rust wants to build MS Visual C-compatible libraries on
Windows, because that is _the_ native C compiler.

Git is historically lacking in its MSVC support, and the official Git
for Windows versions are built using GCC instead. As a consequence, a
(subset of a) GCC toolchain is installed as part of the `windows-build`
job of every CI build.

Naturally, this requires adjustments in how Rust is called; most
importantly, it requires installing support for a GCC-compatible build
target.

Let's make the necessary adjustment both in the CI-specific code that
installs Rust as well as in the Windows-specific configuration in
`config.mak.uname`.
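
As a rough illustration of the `config.mak.uname` part, the mapping from
the MSYS2 environment to a Cargo target triple, and the artifact
directory that follows from it, could be written like this. The function
names are hypothetical; the triples are simply the ones this patch
assigns, and the directory layout is Cargo's usual
target/<triple>/<mode> when a build target is set.

```rust
/// Map an MSYS2 `MSYSTEM` value to the Cargo target triple this patch assigns.
/// Returns None for environments the patch does not cover.
fn cargo_build_target(msystem: &str) -> Option<&'static str> {
    match msystem {
        "MINGW32" => Some("i686-pc-windows-gnu"),
        "MINGW64" => Some("x86_64-pc-windows-gnu"),
        "CLANGARM64" => Some("aarch64-pc-windows-gnu"),
        _ => None,
    }
}

/// Directory (relative to the git root) where Cargo places artifacts
/// once a target triple is set.
fn rust_target_dir(triple: &str, mode: &str) -> String {
    format!("rust/target/{}/{}", triple, mode)
}
```

For MINGW64 and a release build this yields
rust/target/x86_64-pc-windows-gnu/release, which matches the
RUST_TARGET_DIR override in the diff below.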

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: Moved lib userenv handling to a later patch]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 ci/install-rust.sh | 3 +++
 config.mak.uname   | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/ci/install-rust.sh b/ci/install-rust.sh
index 141ceddb17cf..c22baa629ceb 100644
--- a/ci/install-rust.sh
+++ b/ci/install-rust.sh
@@ -28,6 +28,9 @@ if [ "$BITNESS" = "32" ]; then
   $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
 else
   $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
+  if [ "$CI_OS_NAME" = "windows" ]; then
+    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
+  fi
 fi
 
 . $CARGO_HOME/env
diff --git a/config.mak.uname b/config.mak.uname
index 3e26bb074a4b..a22703284b56 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -727,19 +727,26 @@ ifeq ($(uname_S),MINGW)
 		prefix = /mingw32
 		HOST_CPU = i686
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
+		CARGO_BUILD_TARGET = i686-pc-windows-gnu
         endif
         ifeq (MINGW64,$(MSYSTEM))
 		prefix = /mingw64
 		HOST_CPU = x86_64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
         else ifeq (CLANGARM64,$(MSYSTEM))
 		prefix = /clangarm64
 		HOST_CPU = aarch64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		CARGO_BUILD_TARGET = aarch64-pc-windows-gnu
         else
 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
 		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
+
+	export CARGO_BUILD_TARGET
+	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
+
 	CC = gcc
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong
-- 
gitgitgadget



* [PATCH v2 10/17] win+Meson: allow for xdiff to be compiled with MSVC
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (8 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 09/17] Do support Windows again after requiring Rust Johannes Schindelin via GitGitGadget
@ 2025-08-15  1:22   ` Johannes Schindelin via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 11/17] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
                     ` (9 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `build_rust.sh` script is quite opinionated about the naming scheme
of the C compiler: It assumes that the xdiff library file will be named
`libxdiff.a`.

However, MS Visual C generates `xdiff.lib` files instead; this naming
scheme has been in use for a very, very long time.

Let's allow for that.
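
The naming rule itself is small. A minimal sketch, where
`static_lib_name` is a hypothetical helper and the `msvc` flag stands in
for however the toolchain is detected (the patch greps Cargo's
.rustc_info.json in build_rust.sh and checks the compiler id in
meson.build):

```rust
/// File name of the static library for a crate, by toolchain convention:
/// `<crate>.lib` for MSVC, `lib<crate>.a` otherwise.
fn static_lib_name(krate: &str, msvc: bool) -> String {
    if msvc {
        format!("{}.lib", krate)
    } else {
        format!("lib{}.a", krate)
    }
}
```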

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 build_rust.sh |  7 ++++++-
 meson.build   | 12 +++++++++---
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/build_rust.sh b/build_rust.sh
index 4c12135cd205..694d48d857a5 100755
--- a/build_rust.sh
+++ b/build_rust.sh
@@ -44,7 +44,12 @@ fi
 
 cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
 
-libfile="lib${crate}.a"
+if grep x86_64-pc-windows-msvc rust/target/.rustc_info.json
+then
+  libfile="${crate}.lib"
+else
+  libfile="lib${crate}.a"
+fi
 dst=$dir_build/$libfile
 
 if [ "$dir_git_root" != "$dir_build" ]; then
diff --git a/meson.build b/meson.build
index 047d7e5b6630..5e89a5dd0e00 100644
--- a/meson.build
+++ b/meson.build
@@ -277,8 +277,16 @@ else
   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
 endif
 
+compiler = meson.get_compiler('c')
+
+if compiler.get_id() == 'msvc'
+  xdiff_lib_filename = 'xdiff.lib'
+else
+  xdiff_lib_filename = 'libxdiff.a'
+endif
+
 rust_build_xdiff = custom_target('rust_build_xdiff',
-  output: 'libxdiff.a',
+  output: xdiff_lib_filename,
   build_by_default: true,
   build_always_stale: true,
   command: [
@@ -288,8 +296,6 @@ rust_build_xdiff = custom_target('rust_build_xdiff',
   install: false,
 )
 
-compiler = meson.get_compiler('c')
-
 libgit_sources = [
   'abspath.c',
   'add-interactive.c',
-- 
gitgitgadget



* [PATCH v2 11/17] win+Meson: do allow linking with the Rust-built xdiff
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (9 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 10/17] win+Meson: allow for xdiff to be compiled with MSVC Johannes Schindelin via GitGitGadget
@ 2025-08-15  1:22   ` Johannes Schindelin via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 12/17] github workflows: define rust versions and targets in the same place Ezekiel Newren via GitGitGadget
                     ` (8 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When linking against the Rust-built `xdiff`, there is now a new required
dependency: Without _also_ linking to the system library `userenv`, the
link would fail with this error message:

  xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) :
  error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW
  referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E

Therefore, just like we do in case of Makefile-based builds on Windows,
we now also link to that library when building with Meson.

Note that if we only have Rust depend upon libuserenv then at link time
GCC would complain about:

  undefined reference to `GetUserProfileDirectoryW'

Apparently there is _some_ closure that gets compiled in that requires
this function, and that in turn forces Git to link to libuserenv.

This is a new requirement, and therefore has not been made part of the
"minimal Git for Windows SDK".

In the near future, I intend to include it, but for now let's just
ensure that the file is added manually if it is missing.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: Squashed a few of Johannes's patches, and moved lib userenv
 handling from an earlier patch]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml | 8 ++++++++
 config.mak.uname           | 2 ++
 meson.build                | 1 +
 3 files changed, 11 insertions(+)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8aac18a6ba45..aa18742f08c4 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -115,6 +115,14 @@ jobs:
     steps:
     - uses: actions/checkout@v4
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
+    - name: ensure that libuserenv.a is present
+      shell: bash
+      run: |
+        cd /mingw64/lib && {
+          test -f libuserenv.a ||
+          /c/Program\ Files/Git/mingw64/bin/curl -Lo libuserenv.a \
+            https://github.com/git-for-windows/git-sdk-64/raw/HEAD/mingw64/lib/libuserenv.a
+        }
     - name: build
       shell: bash
       env:
diff --git a/config.mak.uname b/config.mak.uname
index a22703284b56..fbe7cebf40ed 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -746,6 +746,8 @@ ifeq ($(uname_S),MINGW)
 
 	export CARGO_BUILD_TARGET
 	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
+	# Unfortunately now needed because of Rust
+	EXTLIBS += -luserenv
 
 	CC = gcc
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
diff --git a/meson.build b/meson.build
index 5e89a5dd0e00..af015f04763f 100644
--- a/meson.build
+++ b/meson.build
@@ -1260,6 +1260,7 @@ elif host_machine.system() == 'windows'
   ]
 
   libgit_dependencies += compiler.find_library('ntdll')
+  libgit_dependencies += compiler.find_library('userenv')
   libgit_include_directories += 'compat/win32'
   if compiler.get_id() == 'msvc'
     libgit_include_directories += 'compat/vcbuild/include'
-- 
gitgitgadget



* [PATCH v2 12/17] github workflows: define rust versions and targets in the same place
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (10 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 11/17] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 13/17] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
                     ` (7 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Consolidate the Rust toolchain definitions in main.yml. Prefer using
actions-rs/toolchain@v1 where possible, but for docker targets use
a script to install the Rust toolchain. Four overrides (two versions
and two target triples) are defined in main.yml, covering three cases:

  * On Windows: Rust didn't resolve the bcrypt library on Windows
    correctly until version 1.78.0. Also, since rustup mis-identifies
    the Rust toolchain, the Rust target triple must be set to
    x86_64-pc-windows-gnu.
  * On musl: libc differences, such as ftruncate64 vs ftruncate, were
    not accounted for until Rust version 1.72.0. No older version of
    Rust will work on musl for our needs.
  * In a 32-bit docker container running on a 64-bit host, we need to
    override the Rust target triple. This is because rustup asks the
    kernel for the bitness of the system and the kernel reports 64,
    even though the container only runs 32-bit binaries. This also
    allows us to remove the BITNESS environment variable in ci/lib.sh.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml | 34 +++++++++++++++++++++++++++++++++-
 build_rust.sh              | 11 +++--------
 ci/install-rust.sh         | 33 +++++++++++++++++----------------
 ci/lib.sh                  |  7 -------
 ci/make-test-artifacts.sh  | 12 ++++++------
 ci/run-build-and-tests.sh  |  8 +++++---
 meson.build                |  1 +
 7 files changed, 65 insertions(+), 41 deletions(-)
 mode change 100644 => 100755 ci/install-rust.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index aa18742f08c4..ef4d6348edcd 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -4,7 +4,6 @@ on: [push, pull_request]
 
 env:
   DEVELOPER: 1
-  RUST_VERSION: 1.87.0
 
 # If more than one workflow run is triggered for the very same commit hash
 # (which happens when multiple branches pointing to the same commit), only
@@ -27,6 +26,12 @@ jobs:
     outputs:
       enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }}
       skip_concurrent: ${{ steps.check-ref.outputs.skip_concurrent }}
+      rust_version_minimum: 1.61.0
+      rust_version_windows: 1.78.0
+      rust_version_musl: 1.72.0
+      ## the rust target is inferred by rustup unless specified
+      rust_target_windows: x86_64-pc-windows-gnu
+      rust_target_32bit_linux: i686-unknown-linux-gnu
     steps:
       - name: try to clone ci-config branch
         run: |
@@ -123,11 +128,19 @@ jobs:
           /c/Program\ Files/Git/mingw64/bin/curl -Lo libuserenv.a \
             https://github.com/git-for-windows/git-sdk-64/raw/HEAD/mingw64/lib/libuserenv.a
         }
+    - name: Install Rust
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: ${{ needs.ci-config.outputs.rust_version_windows }}
+        target: ${{ needs.ci-config.outputs.rust_target_windows }}
+        profile: minimal
+        override: true
     - name: build
       shell: bash
       env:
         HOME: ${{runner.workspace}}
         NO_PERL: 1
+        CARGO_HOME: "/c/Users/runneradmin/.cargo"
       run: . /etc/profile && ci/make-test-artifacts.sh artifacts
     - name: zip up tracked files
       run: git archive -o artifacts/tracked.tar.gz HEAD
@@ -269,6 +282,13 @@ jobs:
     steps:
     - uses: actions/checkout@v4
     - uses: actions/setup-python@v5
+    - name: Install Rust
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: ${{ needs.ci-config.outputs.rust_version_windows }}
+        target: ${{ needs.ci-config.outputs.rust_target_windows }}
+        profile: minimal
+        override: true
     - name: Set up dependencies
       shell: pwsh
       run: pip install meson ninja
@@ -342,6 +362,12 @@ jobs:
     steps:
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
+    - name: Install Rust
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: ${{ needs.ci-config.outputs.rust_version_minimum }}
+        profile: minimal
+        override: true
     - run: ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
@@ -402,9 +428,11 @@ jobs:
           cc: gcc
         - jobname: linux-musl-meson
           image: alpine:latest
+          rust_version_override: ${{ needs.ci-config.outputs.rust_version_musl }}
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
+          rust_target_override: ${{ needs.ci-config.outputs.rust_target_32bit_linux }}
         - jobname: pedantic
           image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
@@ -417,7 +445,11 @@ jobs:
       jobname: ${{matrix.vector.jobname}}
       CC: ${{matrix.vector.cc}}
       CI_JOB_IMAGE: ${{matrix.vector.image}}
+      CI_IS_DOCKER: "true"
       CUSTOM_PATH: /custom
+      RUST_VERSION: ${{ matrix.vector.rust_version_override || needs.ci-config.outputs.rust_version_minimum }}
+      RUST_TARGET: ${{ matrix.vector.rust_target_override || '' }}
+      CARGO_HOME: /home/builder/.cargo
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
diff --git a/build_rust.sh b/build_rust.sh
index 694d48d857a5..80ce7eae3b00 100755
--- a/build_rust.sh
+++ b/build_rust.sh
@@ -1,13 +1,8 @@
 #!/bin/sh
 
-if [ -z "$CARGO_HOME" ]; then
-  export CARGO_HOME=$HOME/.cargo
-  echo >&2 "::warning:: CARGO_HOME is not set"
-fi
-echo "CARGO_HOME=$CARGO_HOME"
 
-rustc -vV
-cargo --version
+rustc -vV || exit $?
+cargo --version || exit $?
 
 dir_git_root=${0%/*}
 dir_build=$1
@@ -55,7 +50,7 @@ dst=$dir_build/$libfile
 if [ "$dir_git_root" != "$dir_build" ]; then
   src=$dir_rust/target/$rust_target/$libfile
   if [ ! -f $src ]; then
-    echo >&2 "::error:: cannot find path of static library"
+    echo >&2 "::error:: cannot find path of static library $src is not a file or does not exist"
     exit 5
   fi
 
diff --git a/ci/install-rust.sh b/ci/install-rust.sh
old mode 100644
new mode 100755
index c22baa629ceb..133aa8cc878a
--- a/ci/install-rust.sh
+++ b/ci/install-rust.sh
@@ -1,36 +1,37 @@
 #!/bin/sh
 
+## github workflows actions-rs/toolchain@v1 doesn't work for docker
+## targets. This script should only be used if the ci pipeline
+## doesn't support installing rust on a particular target.
+
 if [ "$(id -u)" -eq 0 ]; then
   echo >&2 "::warning:: installing rust as root"
 fi
 
-if [ "$CARGO_HOME" = "" ]; then
-  echo >&2 "::warning:: CARGO_HOME is not set"
-  export CARGO_HOME=$HOME/.cargo
-fi
-
-export RUSTUP_HOME=$CARGO_HOME
-
 if [ "$RUST_VERSION" = "" ]; then
   echo >&2 "::error:: RUST_VERSION is not set"
+  exit 1
+fi
+
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
   exit 2
 fi
 
+export RUSTUP_HOME=$CARGO_HOME
+
 ## install rustup
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
 if [ ! -f $CARGO_HOME/env ]; then
   echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
 fi
+. $CARGO_HOME/env
+
 ## install a specific version of rust
-if [ "$BITNESS" = "32" ]; then
-  $CARGO_HOME/bin/rustup set default-host i686-unknown-linux-gnu || exit $?
-  $CARGO_HOME/bin/rustup install $RUST_VERSION || exit $?
-  $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
+if [ "$RUST_TARGET" != "" ]; then
+  rustup default --force-non-host "$RUST_VERSION-$RUST_TARGET" || exit $?
 else
-  $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
-  if [ "$CI_OS_NAME" = "windows" ]; then
-    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
-  fi
+  rustup default "$RUST_VERSION" || exit $?
 fi
 
-. $CARGO_HOME/env
+rustc -vV || exit $?
diff --git a/ci/lib.sh b/ci/lib.sh
index ad0e49a68dcb..a7992b22fdc9 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,13 +1,6 @@
 # Library of functions shared by all CI scripts
 
 
-export BITNESS="64"
-if command -v getconf >/dev/null && [ "$(getconf LONG_BIT 2>/dev/null)" = "32" ]; then
-  export BITNESS="32"
-fi
-echo "BITNESS=$BITNESS"
-
-
 if test true = "$GITHUB_ACTIONS"
 then
 	begin_group () {
diff --git a/ci/make-test-artifacts.sh b/ci/make-test-artifacts.sh
index 56aa7efb1d53..70d0dfbc0e8b 100755
--- a/ci/make-test-artifacts.sh
+++ b/ci/make-test-artifacts.sh
@@ -7,13 +7,13 @@ mkdir -p "$1" # in case ci/lib.sh decides to quit early
 
 . ${0%/*}/lib.sh
 
-## install rust per user rather than system wide
-. ${0%/*}/install-rust.sh
+if [ -z "$CARGO_HOME" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
+  exit 1
+fi
 
-group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
+export PATH="$CARGO_HOME/bin:$PATH"
 
-if [ -d "$CARGO_HOME" ]; then
-  rm -rf $CARGO_HOME
-fi
+group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
 
 check_unignored_build_artifacts
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index dbab1cb2f936..0f1f704d583e 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,10 +5,12 @@
 
 . ${0%/*}/lib.sh
 
-## install rust per user rather than system wide
-. ${0%/*}/install-rust.sh
+## actions-rs/toolchain@v1 doesn't work for docker targets.
+if [ "$CI_IS_DOCKER" = "true" ]; then
+  . ${0%/*}/install-rust.sh
+fi
 
-rustc -vV
+rustc -vV || exit $?
 cargo --version || exit $?
 
 run_tests=t
diff --git a/meson.build b/meson.build
index af015f04763f..a2f9f063bef2 100644
--- a/meson.build
+++ b/meson.build
@@ -293,6 +293,7 @@ rust_build_xdiff = custom_target('rust_build_xdiff',
     meson.project_source_root() / 'build_rust.sh',
     meson.current_build_dir(), rust_target, 'xdiff',
   ],
+  env: script_environment,
   install: false,
 )
 
-- 
gitgitgadget



* [PATCH v2 13/17] github workflows: upload Cargo.lock
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (11 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 12/17] github workflows: define rust versions and targets in the same place Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 14/17] xdiff: implement a white space iterator in Rust Ezekiel Newren via GitGitGadget
                     ` (6 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Make each CI workflow upload its Cargo.lock file as a build artifact so
that we can audit build dependencies.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index ef4d6348edcd..ba61bd516639 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -149,6 +149,11 @@ jobs:
       with:
         name: windows-artifacts
         path: artifacts
+    - name: upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-windows
+        path: rust/Cargo.lock
   windows-test:
     name: win test
     runs-on: windows-latest
@@ -303,6 +308,11 @@ jobs:
       with:
         name: windows-meson-artifacts
         path: build
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-windows-meson
+        path: rust/Cargo.lock
   windows-meson-test:
     name: win+Meson test
     runs-on: windows-latest
@@ -378,6 +388,11 @@ jobs:
       with:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-${{matrix.vector.jobname}}
+        path: rust/Cargo.lock
   fuzz-smoke-test:
     name: fuzz smoke test
     needs: ci-config
@@ -484,6 +499,11 @@ jobs:
       with:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-${{matrix.vector.jobname}}
+        path: rust/Cargo.lock
   static-analysis:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
-- 
gitgitgadget



* [PATCH v2 14/17] xdiff: implement a white space iterator in Rust
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (12 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 13/17] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 15/17] xdiff: create line_hash() and line_equal() Ezekiel Newren via GitGitGadget
                     ` (5 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Xdiff has traditionally implemented the logic for iterating over
whitespace in every location that needed to do so. Create a consolidated
iterator in Rust that we can call from each location. Write Rust unit
tests to ensure the correctness of the Rust whitespace iterator and the
chunked_iter_equal() function.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
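
A minimal sketch of the idea, not the code from this patch: yield only
the non-whitespace runs of a line, then compare two lines run by run
without concatenating them first. The names non_space_runs and
chunks_equal are hypothetical, and the sketch models only the
XDF_IGNORE_WHITESPACE behaviour (all whitespace dropped).

```rust
/// Hypothetical helper: split `line` into runs of non-whitespace bytes,
/// dropping every whitespace run (tab, LF, CR, space).
fn non_space_runs<'a>(line: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    line.split(|b| matches!(*b, b'\t' | b'\n' | b'\r' | b' '))
        // `split` yields empty slices between adjacent separators; skip them.
        .filter(|run| !run.is_empty())
}

/// Hypothetical helper: two chunk sequences are equal when their
/// flattened byte streams are equal, however the chunks are cut.
fn chunks_equal<'a>(
    a: impl Iterator<Item = &'a [u8]>,
    b: impl Iterator<Item = &'a [u8]>,
) -> bool {
    a.flatten().eq(b.flatten())
}
```

With these helpers, comparing non_space_runs(b"  a b \t \r\n") against
non_space_runs(b"ab\n") through chunks_equal() reports equality, since
both sides flatten to the bytes "ab".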
 rust/xdiff/src/lib.rs    |  10 ++
 rust/xdiff/src/xutils.rs | 292 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 302 insertions(+)
 create mode 100644 rust/xdiff/src/xutils.rs

diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index 96975975a1ba..9cf0462bcdb9 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -1,3 +1,13 @@
+pub mod xutils;
+
+pub const XDF_IGNORE_WHITESPACE: u64 = 1 << 1;
+pub const XDF_IGNORE_WHITESPACE_CHANGE: u64 = 1 << 2;
+pub const XDF_IGNORE_WHITESPACE_AT_EOL: u64 = 1 << 3;
+pub const XDF_IGNORE_CR_AT_EOL: u64 = 1 << 4;
+pub const XDF_WHITESPACE_FLAGS: u64 = XDF_IGNORE_WHITESPACE |
+    XDF_IGNORE_WHITESPACE_CHANGE |
+    XDF_IGNORE_WHITESPACE_AT_EOL |
+    XDF_IGNORE_CR_AT_EOL;
 
 
 #[no_mangle]
diff --git a/rust/xdiff/src/xutils.rs b/rust/xdiff/src/xutils.rs
new file mode 100644
index 000000000000..38126b47292f
--- /dev/null
+++ b/rust/xdiff/src/xutils.rs
@@ -0,0 +1,292 @@
+use crate::*;
+
+pub(crate) fn xdl_isspace(v: u8) -> bool {
+    match v {
+        b'\t' | b'\n' | b'\r' | b' ' => true,
+        _ => false,
+    }
+}
+
+pub struct WhitespaceIter<'a> {
+    line: &'a [u8],
+    index: usize,
+    flags: u64,
+}
+
+
+impl<'a> WhitespaceIter<'a> {
+    pub fn new(line: &'a [u8], flags: u64) -> Self {
+        Self {
+            line,
+            index: 0,
+            flags,
+        }
+    }
+}
+
+impl<'a> Iterator for WhitespaceIter<'a> {
+    type Item = &'a [u8];
+
+    fn next(&mut self) -> Option<Self::Item> {
+        if self.index >= self.line.len() {
+            return None;
+        }
+
+        loop {
+            let start = self.index;
+            if self.index == self.line.len() {
+                return None;
+            }
+
+            /* return contiguous run of not space bytes */
+            while self.index < self.line.len() {
+                if xdl_isspace(self.line[self.index]) {
+                    break;
+                }
+                self.index += 1;
+            }
+            if self.index > start {
+                return Some(&self.line[start..self.index]);
+            }
+            /* the current byte had better be a space */
+            if !xdl_isspace(self.line[self.index]) {
+                panic!("xdl_line_iter_next xdl_isspace() is false")
+            }
+
+            while self.index < self.line.len() && xdl_isspace(self.line[self.index]) {
+                self.index += 1;
+            }
+
+
+            if self.index <= start {
+                panic!("xdl_isspace() cannot simultaneously be true and false");
+            }
+
+            if (self.flags & XDF_IGNORE_WHITESPACE_AT_EOL) != 0
+                && self.index == self.line.len()
+            {
+                return None;
+            }
+            if (self.flags & XDF_IGNORE_WHITESPACE) != 0 {
+                continue;
+            }
+            if (self.flags & XDF_IGNORE_WHITESPACE_CHANGE) != 0 {
+                if self.index == self.line.len() {
+                    continue;
+                }
+                return Some(" ".as_bytes());
+            }
+            if (self.flags & XDF_IGNORE_CR_AT_EOL) != 0 {
+                if start < self.line.len() && self.index == self.line.len() {
+                    let mut end = self.line.len();
+                    if end > 0 && self.line[end - 1] == b'\n' {
+                        if end - start == 1 {
+                            return Some(&self.line[start..end]);
+                        } else {
+                            end -= 1;
+                        }
+                        if end > 0 && self.line[end - 1] == b'\r' {
+                            self.index = end;
+                            end -= 1;
+                            if end - start == 0 {
+                                continue;
+                            }
+                            return Some(&self.line[start..end]);
+                        }
+                    }
+                }
+            }
+            return Some(&self.line[start..self.index]);
+        }
+    }
+}
+
+pub fn chunked_iter_equal<'a, T, IT0, IT1>(mut it0: IT0, mut it1: IT1) -> bool
+where
+    T: Eq + 'a,
+    IT0: Iterator<Item = &'a [T]>,
+    IT1: Iterator<Item = &'a [T]>,
+{
+    let mut run_option0: Option<&[T]> = it0.next();
+    let mut run_option1: Option<&[T]> = it1.next();
+    let mut i0 = 0;
+    let mut i1 = 0;
+
+    while let (Some(run0), Some(run1)) = (run_option0, run_option1) {
+        while i0 < run0.len() && i1 < run1.len() {
+            if run0[i0] != run1[i1] {
+                return false;
+            }
+
+            i0 += 1;
+            i1 += 1;
+        }
+
+        if i0 == run0.len() {
+            i0 = 0;
+            run_option0 = it0.next();
+        }
+        if i1 == run1.len() {
+            i1 = 0;
+            run_option1 = it1.next();
+        }
+    }
+
+    while let Some(run0) = run_option0 {
+        if run0.len() == 0 {
+            run_option0 = it0.next();
+        } else {
+            break;
+        }
+    }
+
+    while let Some(run1) = run_option1 {
+        if run1.len() == 0 {
+            run_option1 = it1.next();
+        } else {
+            break;
+        }
+    }
+
+    run_option0.is_none() && run_option1.is_none()
+}
+
+#[cfg(test)]
+mod tests {
+    use crate::*;
+    use crate::xutils::{chunked_iter_equal, WhitespaceIter};
+
+    fn extract_string<'a>(line: &[u8], flags: u64, buffer: &'a mut Vec<u8>) -> &'a str {
+        let it = WhitespaceIter::new(line, flags);
+        buffer.clear();
+        for run in it {
+            #[cfg(test)]
+            let _view = unsafe { std::str::from_utf8_unchecked(run) };
+            buffer.extend_from_slice(run);
+        }
+        unsafe { std::str::from_utf8_unchecked(buffer.as_slice()) }
+    }
+
+    fn get_str_it<'a>(slice: &'a [&'a str]) -> impl Iterator<Item = &'a [u8]> + 'a {
+        slice.iter().map(|v| (*v).as_bytes())
+    }
+
+    #[test]
+    fn test_ignore_space() {
+        let tv_individual = vec![
+            ("ab\r", "ab\r", XDF_IGNORE_CR_AT_EOL),
+            ("ab \r", "ab \r", XDF_IGNORE_CR_AT_EOL),
+            ("\r \t a \r", "\r \t a \r", XDF_IGNORE_CR_AT_EOL),
+            ("\r a \r", "\r a \r", XDF_IGNORE_CR_AT_EOL),
+            ("\r", "\r", XDF_IGNORE_CR_AT_EOL),
+            ("", "", XDF_IGNORE_CR_AT_EOL),
+            ("\r a \r", "\r a \r", XDF_IGNORE_CR_AT_EOL),
+
+            ("\r \t a \n", "\r \t a \r\n", XDF_IGNORE_CR_AT_EOL),
+            ("\r a \n", "\r a \r\n", XDF_IGNORE_CR_AT_EOL),
+            ("\n", "\r\n", XDF_IGNORE_CR_AT_EOL),
+            ("\n", "\n", XDF_IGNORE_CR_AT_EOL),
+            ("\r a \n", "\r a \n", XDF_IGNORE_CR_AT_EOL),
+
+            ("1\n", "1\r\n", XDF_IGNORE_CR_AT_EOL),
+            ("1", "1\r\n", XDF_IGNORE_WHITESPACE_CHANGE),
+
+            ("\r \t a \r\n", "\r \t a \r\n", 0),
+            ("\r a \r\n", "\r a \r\n", 0),
+            ("\r\n", "\r\n", 0),
+            ("\n", "\n", 0),
+            ("\r a \n", "\r a \n", 0),
+            ("     \n", "     \n", 0),
+            ("a     \n", "a     \n", 0),
+            ("  a  \t  asdf  \t \r\n", "  a  \t  asdf  \t \r\n", 0),
+            ("\t a  b  \t \n", "\t a  b  \t \n", 0),
+            ("  a b \t \r\n", "  a b \t \r\n", 0),
+            ("\t  a \n", "\t  a \n", 0),
+            ("\t\t\ta\t\n", "\t\t\ta\t\n", 0),
+            ("a\n", "a\n", 0),
+            ("\ta\n", "\ta\n", 0),
+
+            ("a", "\r \t a \r\n", XDF_IGNORE_WHITESPACE),
+            ("a", "\r a \r\n", XDF_IGNORE_WHITESPACE),
+            ("", "\r\n", XDF_IGNORE_WHITESPACE),
+            ("", "\n", XDF_IGNORE_WHITESPACE),
+            ("a", "\r a \n", XDF_IGNORE_WHITESPACE),
+            ("", "     \n", XDF_IGNORE_WHITESPACE),
+            ("a", "a     \n", XDF_IGNORE_WHITESPACE),
+            ("aasdf", "  a  \t  asdf  \t \r\n", XDF_IGNORE_WHITESPACE),
+            ("ab", "\t a  b  \t \n", XDF_IGNORE_WHITESPACE),
+            ("ab", "  a b \t \r\n", XDF_IGNORE_WHITESPACE),
+            ("a", "\t  a \n", XDF_IGNORE_WHITESPACE),
+            ("a", "\t\t\ta\t\n", XDF_IGNORE_WHITESPACE),
+            ("a", "a\n", XDF_IGNORE_WHITESPACE),
+            ("a", "\ta\n", XDF_IGNORE_WHITESPACE),
+
+            ("", "     \n", XDF_IGNORE_WHITESPACE_AT_EOL),
+            ("a", "a     \n", XDF_IGNORE_WHITESPACE_AT_EOL),
+            ("  a  \t  asdf", "  a  \t  asdf  \t \r\n", XDF_IGNORE_WHITESPACE_AT_EOL),
+            ("\t a  b", "\t a  b  \t \n", XDF_IGNORE_WHITESPACE_AT_EOL),
+
+            (" a b", "  a b \t \r\n", XDF_IGNORE_WHITESPACE_CHANGE),
+            (" a", "\t  a \n", XDF_IGNORE_WHITESPACE_CHANGE),
+            (" a", "\t\t\ta\t\n", XDF_IGNORE_WHITESPACE_CHANGE),
+            ("a", "a\n", XDF_IGNORE_WHITESPACE_CHANGE),
+            (" a", "\ta\n", XDF_IGNORE_WHITESPACE_CHANGE),
+
+            ("ab", "  a b \t \r\n", XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE),
+            ("a", "\t  a \n", XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE),
+            ("a", "\t\t\ta\t\n", XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE),
+            ("a", "a\n", XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE),
+            ("a", "\ta\n", XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE),
+        ];
+
+        let mut buffer = Vec::<u8>::new();
+        for (expected, input, flags) in tv_individual {
+            let actual = extract_string(input.as_bytes(), flags, &mut buffer);
+            assert_eq!(expected, actual, "input: {:?} flags: 0x{:x}", input, flags);
+        }
+    }
+
+    #[test]
+    fn test_chunked_iter_equal() {
+        let tv_str: Vec<(Vec<&str>, Vec<&str>)> = vec![
+            /* equal cases */
+            (vec!["", "", "abc"],         vec!["", "abc"]),
+            (vec!["c", "", "a"],          vec!["c", "a"]),
+            (vec!["a", "", "b", "", "c"], vec!["a", "b", "c"]),
+            (vec!["", "", "a"],           vec!["a"]),
+            (vec!["", "a"],               vec!["a"]),
+            (vec![""],                    vec![]),
+            (vec!["", ""],                vec![""]),
+            (vec!["a"],                   vec!["", "", "a"]),
+            (vec!["a"],                   vec!["", "a"]),
+            (vec![],                      vec![""]),
+            (vec![""],                    vec!["", ""]),
+            (vec!["hello ", "world"],     vec!["hel", "lo wo", "rld"]),
+            (vec!["hel", "lo wo", "rld"], vec!["hello ", "world"]),
+            (vec!["hello world"],         vec!["hello world"]),
+            (vec!["abc", "def"],          vec!["def", "abc"]),
+            (vec![],                      vec![]),
+
+            /* different cases */
+            (vec!["abc"],       vec![]),
+            (vec!["", "", ""],  vec!["", "a"]),
+            (vec!["", "a"],     vec!["b", ""]),
+            (vec!["abc"],       vec!["abc", "de"]),
+            (vec!["abc", "de"], vec!["abc"]),
+            (vec![],            vec!["a"]),
+            (vec!["a"],         vec![]),
+            (vec!["abc", "kj"], vec!["abc", "de"]),
+        ];
+
+        for (lhs, rhs) in tv_str.iter() {
+            let a: Vec<u8> = get_str_it(lhs).flatten().copied().collect();
+            let b: Vec<u8> = get_str_it(rhs).flatten().copied().collect();
+            let expected = a.as_slice() == b.as_slice();
+
+            let it0 = get_str_it(lhs);
+            let it1 = get_str_it(rhs);
+            let actual = chunked_iter_equal(it0, it1);
+            assert_eq!(expected, actual);
+        }
+    }
+}
-- 
gitgitgadget



* [PATCH v2 15/17] xdiff: create line_hash() and line_equal()
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (13 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 14/17] xdiff: implement a white space iterator in Rust Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 16/17] xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag Ezekiel Newren via GitGitGadget
                     ` (4 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

These functions use the whitespace iterator, when applicable, to hash
and compare lines.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
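
A rough sketch of why hashing and comparing through the same chunk
stream stay consistent: a streaming hasher fed chunk by chunk depends
only on the concatenated bytes, so lines that compare equal after
whitespace processing also hash equal. The patch uses xxh3; the Fnv1a
type and hash_chunks() below are hypothetical stand-ins chosen only to
keep the example dependency-free.

```rust
/// Stand-in streaming hasher (64-bit FNV-1a). Not the hash used by the
/// patch, which relies on xxh3.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }

    /// Feed one chunk; state advances byte by byte, so chunk
    /// boundaries do not influence the result.
    fn update(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hash a sequence of chunks as if they were one contiguous buffer.
fn hash_chunks<'a>(chunks: impl Iterator<Item = &'a [u8]>) -> u64 {
    let mut hasher = Fnv1a::new();
    for chunk in chunks {
        hasher.update(chunk);
    }
    hasher.finish()
}
```

Feeding ["hel", "lo"] and ["hello"] to hash_chunks() gives the same
value, which is the property the iterator-based hashing depends on.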
 rust/xdiff/src/lib.rs    | 19 +++++++++++++++++++
 rust/xdiff/src/xutils.rs | 28 ++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index 9cf0462bcdb9..809c5573c6e7 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -1,3 +1,7 @@
+use std::hash::Hasher;
+use xxhash_rust::xxh3::Xxh3Default;
+use crate::xutils::*;
+
 pub mod xutils;
 
 pub const XDF_IGNORE_WHITESPACE: u64 = 1 << 1;
@@ -15,3 +19,18 @@ unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
     let slice = std::slice::from_raw_parts(ptr, size);
     xxhash_rust::xxh3::xxh3_64(slice)
 }
+
+#[no_mangle]
+unsafe extern "C" fn xdl_line_hash(ptr: *const u8, size: usize, flags: u64) -> u64 {
+    let line = std::slice::from_raw_parts(ptr, size);
+
+    line_hash(line, flags)
+}
+
+#[no_mangle]
+unsafe extern "C" fn xdl_line_equal(lhs: *const u8, lhs_len: usize, rhs: *const u8, rhs_len: usize, flags: u64) -> bool {
+    let lhs_line = std::slice::from_raw_parts(lhs, lhs_len);
+    let rhs_line = std::slice::from_raw_parts(rhs, rhs_len);
+
+    line_equal(lhs_line, rhs_line, flags)
+}
diff --git a/rust/xdiff/src/xutils.rs b/rust/xdiff/src/xutils.rs
index 38126b47292f..796a5708b6bf 100644
--- a/rust/xdiff/src/xutils.rs
+++ b/rust/xdiff/src/xutils.rs
@@ -1,4 +1,5 @@
 use crate::*;
+use xxhash_rust::xxh3::xxh3_64;
 
 pub(crate) fn xdl_isspace(v: u8) -> bool {
     match v {
@@ -151,6 +152,33 @@ where
     run_option0.is_none() && run_option1.is_none()
 }
 
+
+pub fn line_hash(line: &[u8], flags: u64) -> u64 {
+    if (flags & XDF_WHITESPACE_FLAGS) == 0 {
+        return xxh3_64(line);
+    }
+
+    let mut hasher = Xxh3Default::new();
+    for chunk in WhitespaceIter::new(line, flags) {
+        hasher.update(chunk);
+    }
+
+    hasher.finish()
+}
+
+
+pub fn line_equal(lhs: &[u8], rhs: &[u8], flags: u64) -> bool {
+    if (flags & XDF_WHITESPACE_FLAGS) == 0 {
+        return lhs == rhs;
+    }
+
+    let lhs_it = WhitespaceIter::new(lhs, flags);
+    let rhs_it = WhitespaceIter::new(rhs, flags);
+
+    chunked_iter_equal(lhs_it, rhs_it)
+}
+
+
 #[cfg(test)]
 mod tests {
     use crate::*;
-- 
gitgitgadget



* [PATCH v2 16/17] xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (14 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 15/17] xdiff: create line_hash() and line_equal() Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15  1:22   ` [PATCH v2 17/17] xdiff: use rust's version of whitespace processing Ezekiel Newren via GitGitGadget
                     ` (3 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Currently the whitespace iterator is slower than git's C implementation
so we skip using the whitespace iterator if there are no whitespace
flags. Special case the --ignore-cr-at-eol similarly to make it
performant. For the rest of the whitespace flags they will be slower
for now, but as more of Xdiff is translated into Rust it'll be easier
to revisit and optimize whitespace processing. Optimizing the other
whitespace flags now would be difficult because:

  * xxhash uses chunk-based processing.
  * The same iterator is used for hashing and equality, which means the
    iterator can be optimized either to return large chunks for fast
    hashing or to return each byte for faster equality testing.
    I opted for faster hashing. The data structures in C need to be
    cleaned up before they're interoperable with Rust. Once that's done
    I believe a faster method of whitespace processing will be possible.
  * Heavily optimized code that spans 2 languages which aren't easily
    interoperable in their current state can be either fast or easy to
    maintain, but not both. Once enough of xdiff is written in Rust I
    believe that a fast and maintainable method can be implemented.
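
As a rough illustration of the special case described above (a standalone
sketch, not the code in this patch), a line can be viewed as at most two
chunks when --ignore-cr-at-eol is the only whitespace flag: everything
before a trailing "\r\n", then the final "\n". The helper names
split_cr_at_eol() and equal_ignoring_cr_at_eol() are made up for this
example and do not exist in the series.

```rust
/// Hypothetical helper: split a line into (body, tail) so that a CR
/// directly before the final LF is left out of both chunks.
/// Lines that do not end in "\r\n" are returned whole with an empty tail,
/// so a CR at the end of an incomplete line is NOT ignored.
fn split_cr_at_eol(line: &[u8]) -> (&[u8], &[u8]) {
    if line.ends_with(b"\r\n") {
        (&line[..line.len() - 2], &line[line.len() - 1..])
    } else {
        (line, &line[line.len()..])
    }
}

/// Hypothetical helper: compare two lines byte by byte over their chunks,
/// which is equivalent to comparing them with a CR before the final LF
/// removed.
fn equal_ignoring_cr_at_eol(lhs: &[u8], rhs: &[u8]) -> bool {
    let (lhs_body, lhs_tail) = split_cr_at_eol(lhs);
    let (rhs_body, rhs_tail) = split_cr_at_eol(rhs);
    lhs_body
        .iter()
        .chain(lhs_tail)
        .eq(rhs_body.iter().chain(rhs_tail))
}
```

Returning whole slices like this keeps the chunks large, which suits a
chunk-based hasher; the byte-wise comparison here is only the simplest
way to show the intended result.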

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 rust/xdiff/src/xutils.rs | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/rust/xdiff/src/xutils.rs b/rust/xdiff/src/xutils.rs
index 796a5708b6bf..1ea9cfa02db5 100644
--- a/rust/xdiff/src/xutils.rs
+++ b/rust/xdiff/src/xutils.rs
@@ -33,6 +33,18 @@ impl<'a> Iterator for WhitespaceIter<'a> {
             return None;
         }
 
+        // optimize case where --ignore-cr-at-eol is the only whitespace flag
+        if (self.flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+            if self.index == 0 && self.line.ends_with(b"\r\n") {
+                self.index = self.line.len() - 1;
+                return Some(&self.line[..self.line.len() - 2])
+            } else {
+                let off = self.index;
+                self.index = self.line.len();
+                return Some(&self.line[off..])
+            }
+        }
+
         loop {
             let start = self.index;
             if self.index == self.line.len() {
@@ -172,6 +184,28 @@ pub fn line_equal(lhs: &[u8], rhs: &[u8], flags: u64) -> bool {
         return lhs == rhs;
     }
 
+    // optimize case where --ignore-cr-at-eol is the only whitespace flag
+    if (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+        let a = lhs.ends_with(b"\r\n");
+        let b = rhs.ends_with(b"\r\n");
+
+        if !(a ^ b) {
+            return lhs == rhs;
+        } else {
+            let lm = if a { 1 } else { 0 };
+            let rm = if b { 1 } else { 0 };
+
+            if lhs.len() - lm != rhs.len() - rm {
+                return false;
+            } else if &lhs[..lhs.len() - 1 - lm] != &rhs[..rhs.len() - 1 - rm] {
+                return false;
+            } else if lhs[lhs.len() - 1] != rhs[rhs.len() - 1] {
+                return false;
+            }
+            return true;
+        }
+    }
+
     let lhs_it = WhitespaceIter::new(lhs, flags);
     let rhs_it = WhitespaceIter::new(rhs, flags);
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v2 17/17] xdiff: use rust's version of whitespace processing
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (15 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 16/17] xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag Ezekiel Newren via GitGitGadget
@ 2025-08-15  1:22   ` Ezekiel Newren via GitGitGadget
  2025-08-15 15:07   ` [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification Ramsay Jones
                     ` (2 subsequent siblings)
  19 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-15  1:22 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Delete xdl_hash_record() and xdl_recmatch() in favor of xdl_line_hash()
and xdl_line_equal().
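
For reference, the no-whitespace-flags path of the removed
xdl_hash_record() is a DJB2a-style loop: start at 5381, then for each
byte up to (but not including) the first newline do "ha += ha << 5;
ha ^= byte". A minimal Rust sketch of that loop follows. The name
djb2a_line() is made up for this example, and the sketch assumes a
64-bit accumulator and unsigned bytes, whereas the C code uses
unsigned long and plain char, so results can differ on platforms where
those types behave differently.

```rust
/// Hypothetical re-statement of the removed C hashing loop (assumptions:
/// 64-bit accumulator, bytes treated as unsigned).
fn djb2a_line(line: &[u8]) -> u64 {
    let mut ha: u64 = 5381;
    // Stop at the first newline, as the C loop did.
    for &byte in line.iter().take_while(|&&b| b != b'\n') {
        ha = ha.wrapping_add(ha << 5); // ha * 33, wrapping on overflow
        ha ^= byte as u64;
    }
    ha
}
```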

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 rust/xdiff/src/lib.rs |   6 --
 xdiff-interface.c     |   4 +-
 xdiff/xmerge.c        |   8 +--
 xdiff/xprepare.c      |  29 ++------
 xdiff/xutils.c        | 158 ------------------------------------------
 xdiff/xutils.h        |   4 +-
 6 files changed, 15 insertions(+), 194 deletions(-)

diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index 809c5573c6e7..634b453a21b6 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -14,12 +14,6 @@ pub const XDF_WHITESPACE_FLAGS: u64 = XDF_IGNORE_WHITESPACE |
     XDF_IGNORE_CR_AT_EOL;
 
 
-#[no_mangle]
-unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
-    let slice = std::slice::from_raw_parts(ptr, size);
-    xxhash_rust::xxh3::xxh3_64(slice)
-}
-
 #[no_mangle]
 unsafe extern "C" fn xdl_line_hash(ptr: *const u8, size: usize, flags: u64) -> u64 {
     let line = std::slice::from_raw_parts(ptr, size);
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 1edcd319e6ef..71ddccf2cc15 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -299,13 +299,13 @@ void xdiff_clear_find_func(xdemitconf_t *xecfg)
 
 unsigned long xdiff_hash_string(const char *s, size_t len, long flags)
 {
-	return xdl_hash_record(&s, s + len, flags);
+	return xdl_line_hash((u8 const*) s, len, flags);
 }
 
 int xdiff_compare_lines(const char *l1, long s1,
 			const char *l2, long s2, long flags)
 {
-	return xdl_recmatch(l1, s1, l2, s2, flags);
+	return xdl_line_equal((u8 const*) l1, s1, (u8 const*) l2, s2, flags);
 }
 
 int parse_conflict_style_name(const char *value)
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 6fa6ea61a208..2f64651a839b 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t **rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
-			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
+		bool result = xdl_line_equal(rec1[i]->ptr, rec1[i]->size,
+			rec2[i]->ptr, rec2[i]->size, flags);
 		if (!result)
 			return -1;
 	}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch((char const*) rec1->ptr, rec1->size,
-			    (char const*) rec2->ptr, rec2->size, flags);
+	return xdl_line_equal(rec1->ptr, rec1->size,
+			    rec2->ptr, rec2->size, flags);
 }
 
 /*
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index c0463bacd94b..b9f12184b1bb 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -33,8 +33,8 @@
 typedef struct s_xdlclass {
 	struct s_xdlclass *next;
 	u64 ha;
-	char const *line;
-	long size;
+	u8 const *line;
+	usize size;
 	long idx;
 	long len1, len2;
 } xdlclass_t;
@@ -93,15 +93,15 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
 
 static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
 	long hi;
-	char const *line;
+	u8 const *line;
 	xdlclass_t *rcrec;
 
-	line = (char const*) rec->ptr;
+	line = rec->ptr;
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->ha == rec->ha &&
-				xdl_recmatch(rcrec->line, rcrec->size,
-					(const char*) rec->ptr, rec->size, cf->flags))
+				xdl_line_equal(rcrec->line, rcrec->size,
+					rec->ptr, rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -160,9 +160,6 @@ static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
 }
 
 
-extern u64 xxh3_64(u8 const* ptr, usize size);
-
-
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
 	unsigned long *ha;
@@ -178,21 +175,9 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdl_parse_lines(mf, narec, xdf);
 
-	if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) {
-		for (usize i = 0; i < (usize) xdf->nrec; i++) {
-			xrecord_t *rec = xdf->recs[i];
-			rec->ha = xxh3_64(rec->ptr, rec->size);
-		}
-	} else {
-		for (usize i = 0; i < (usize) xdf->nrec; i++) {
-			xrecord_t *rec = xdf->recs[i];
-			char const* dump = (char const*) rec->ptr;
-			rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
-		}
-	}
-
 	for (usize i = 0; i < (usize) xdf->nrec; i++) {
 		xrecord_t *rec = xdf->recs[i];
+		rec->ha = xdl_line_hash(rec->ptr, rec->size, xpp->flags);
 		xdl_classify_record(pass, cf, rec);
 	}
 
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 10e4f20b7c31..29e240eb138b 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -152,164 +152,6 @@ int xdl_blankline(const char *line, long size, long flags)
 	return (i == size);
 }
 
-/*
- * Have we eaten everything on the line, except for an optional
- * CR at the very end?
- */
-static int ends_with_optional_cr(const char *l, long s, long i)
-{
-	int complete = s && l[s-1] == '\n';
-
-	if (complete)
-		s--;
-	if (s == i)
-		return 1;
-	/* do not ignore CR at the end of an incomplete line */
-	if (complete && s == i + 1 && l[i] == '\r')
-		return 1;
-	return 0;
-}
-
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
-{
-	int i1, i2;
-
-	if (s1 == s2 && !memcmp(l1, l2, s1))
-		return 1;
-	if (!(flags & XDF_WHITESPACE_FLAGS))
-		return 0;
-
-	i1 = 0;
-	i2 = 0;
-
-	/*
-	 * -w matches everything that matches with -b, and -b in turn
-	 * matches everything that matches with --ignore-space-at-eol,
-	 * which in turn matches everything that matches with --ignore-cr-at-eol.
-	 *
-	 * Each flavor of ignoring needs different logic to skip whitespaces
-	 * while we have both sides to compare.
-	 */
-	if (flags & XDF_IGNORE_WHITESPACE) {
-		goto skip_ws;
-		while (i1 < s1 && i2 < s2) {
-			if (l1[i1++] != l2[i2++])
-				return 0;
-		skip_ws:
-			while (i1 < s1 && XDL_ISSPACE(l1[i1]))
-				i1++;
-			while (i2 < s2 && XDL_ISSPACE(l2[i2]))
-				i2++;
-		}
-	} else if (flags & XDF_IGNORE_WHITESPACE_CHANGE) {
-		while (i1 < s1 && i2 < s2) {
-			if (XDL_ISSPACE(l1[i1]) && XDL_ISSPACE(l2[i2])) {
-				/* Skip matching spaces and try again */
-				while (i1 < s1 && XDL_ISSPACE(l1[i1]))
-					i1++;
-				while (i2 < s2 && XDL_ISSPACE(l2[i2]))
-					i2++;
-				continue;
-			}
-			if (l1[i1++] != l2[i2++])
-				return 0;
-		}
-	} else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL) {
-		while (i1 < s1 && i2 < s2 && l1[i1] == l2[i2]) {
-			i1++;
-			i2++;
-		}
-	} else if (flags & XDF_IGNORE_CR_AT_EOL) {
-		/* Find the first difference and see how the line ends */
-		while (i1 < s1 && i2 < s2 && l1[i1] == l2[i2]) {
-			i1++;
-			i2++;
-		}
-		return (ends_with_optional_cr(l1, s1, i1) &&
-			ends_with_optional_cr(l2, s2, i2));
-	}
-
-	/*
-	 * After running out of one side, the remaining side must have
-	 * nothing but whitespace for the lines to match.  Note that
-	 * ignore-whitespace-at-eol case may break out of the loop
-	 * while there still are characters remaining on both lines.
-	 */
-	if (i1 < s1) {
-		while (i1 < s1 && XDL_ISSPACE(l1[i1]))
-			i1++;
-		if (s1 != i1)
-			return 0;
-	}
-	if (i2 < s2) {
-		while (i2 < s2 && XDL_ISSPACE(l2[i2]))
-			i2++;
-		return (s2 == i2);
-	}
-	return 1;
-}
-
-static unsigned long xdl_hash_record_with_whitespace(char const **data,
-		char const *top, long flags) {
-	unsigned long ha = 5381;
-	char const *ptr = *data;
-	int cr_at_eol_only = (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL;
-
-	for (; ptr < top && *ptr != '\n'; ptr++) {
-		if (cr_at_eol_only) {
-			/* do not ignore CR at the end of an incomplete line */
-			if (*ptr == '\r' &&
-			    (ptr + 1 < top && ptr[1] == '\n'))
-				continue;
-		}
-		else if (XDL_ISSPACE(*ptr)) {
-			const char *ptr2 = ptr;
-			int at_eol;
-			while (ptr + 1 < top && XDL_ISSPACE(ptr[1])
-					&& ptr[1] != '\n')
-				ptr++;
-			at_eol = (top <= ptr + 1 || ptr[1] == '\n');
-			if (flags & XDF_IGNORE_WHITESPACE)
-				; /* already handled */
-			else if (flags & XDF_IGNORE_WHITESPACE_CHANGE
-				 && !at_eol) {
-				ha += (ha << 5);
-				ha ^= (unsigned long) ' ';
-			}
-			else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL
-				 && !at_eol) {
-				while (ptr2 != ptr + 1) {
-					ha += (ha << 5);
-					ha ^= (unsigned long) *ptr2;
-					ptr2++;
-				}
-			}
-			continue;
-		}
-		ha += (ha << 5);
-		ha ^= (unsigned long) *ptr;
-	}
-	*data = ptr < top ? ptr + 1: ptr;
-
-	return ha;
-}
-
-unsigned long xdl_hash_record(char const **data, char const *top, long flags) {
-	unsigned long ha = 5381;
-	char const *ptr = *data;
-
-	if (flags & XDF_WHITESPACE_FLAGS)
-		return xdl_hash_record_with_whitespace(data, top, flags);
-
-	for (; ptr < top && *ptr != '\n'; ptr++) {
-		ha += (ha << 5);
-		ha ^= (unsigned long) *ptr;
-	}
-	*data = ptr < top ? ptr + 1: ptr;
-
-	return ha;
-}
-
 unsigned int xdl_hashbits(unsigned int size) {
 	unsigned int val = 1, bits = 0;
 
diff --git a/xdiff/xutils.h b/xdiff/xutils.h
index fd0bba94e8b4..8f524b72c491 100644
--- a/xdiff/xutils.h
+++ b/xdiff/xutils.h
@@ -33,8 +33,8 @@ void xdl_cha_free(chastore_t *cha);
 void *xdl_cha_alloc(chastore_t *cha);
 long xdl_guess_lines(mmfile_t *mf, long sample);
 int xdl_blankline(const char *line, long size, long flags);
-int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
-unsigned long xdl_hash_record(char const **data, char const *top, long flags);
+u64 xdl_line_hash(u8 const* ptr, usize size, u64 flags);
+bool xdl_line_equal(u8 const* lhs, usize lhs_len, u8 const* rhs, usize rhs_len, u64 flags);
 unsigned int xdl_hashbits(unsigned int size);
 int xdl_num_out(char *out, long val);
 int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 198+ messages in thread

* Re: [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (16 preceding siblings ...)
  2025-08-15  1:22   ` [PATCH v2 17/17] xdiff: use rust's version of whitespace processing Ezekiel Newren via GitGitGadget
@ 2025-08-15 15:07   ` Ramsay Jones
  2025-08-19  2:00     ` Elijah Newren
  2025-08-18 22:31   ` Junio C Hamano
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
  19 siblings, 1 reply; 198+ messages in thread
From: Ramsay Jones @ 2025-08-15 15:07 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget, git



On 15/08/2025 02:22, Ezekiel Newren via GitGitGadget wrote:
> Changes in this second round of this RFC:
> 
>  * Now builds and passes tests on all platforms (example run:
>    https://github.com/ezekielnewren/git/actions/runs/16974821401). Special
>    thanks to Johannes Schindelin for patches to things for Windows and
>    linux32.

Hmm, builds on *all* platforms may be a bit optimistic (it doesn't on
cygwin, for instance), so I'm guessing you mean all platforms which
have CI defined. Perhaps you could mention the platforms which you
have tested on. :)

ATB,
Ramsay Jones


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 01/17] doc: add a policy for using Rust
  2025-08-15  1:22   ` [PATCH v2 01/17] doc: add a policy for using Rust brian m. carlson via GitGitGadget
@ 2025-08-15 17:03     ` Matthias Aßhauer
  2025-08-15 21:31       ` Junio C Hamano
  2025-08-19  2:06       ` Ezekiel Newren
  0 siblings, 2 replies; 198+ messages in thread
From: Matthias Aßhauer @ 2025-08-15 17:03 UTC (permalink / raw)
  To: brian m. carlson via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren,
	brian m. carlson



On Fri, 15 Aug 2025, brian m. carlson via GitGitGadget wrote:

> From: "brian m. carlson" <sandals@crustytoothpaste.net>
>
> Git has historically been written primarily in C, with some shell and
> Perl.  However, C is not memory safe, which makes it more likely that
> security vulnerabilities or other bugs will be introduced, and it is
> also more verbose and less ergonomic than other, more modern languages.
>
> One of the most common modern compiled languages which is easily
> interoperable with C is Rust.  It is popular (the most admired language
> on the 2024 Stack Overflow Developer Survey), efficient, portable, and
> robust.
>
> Introduce a document laying out the incremental introduction of Rust to
> Git and provide a detailed rationale for doing so, including the points
> above.  Propose a design for this approach that addresses the needs of
> downstreams and distributors, as well as contributors.
>
> Since we don't want to carry both a C and Rust version of code and want
> to be able to add new features only in Rust, mention that Rust is a
> required part of our platform support policy.
>
> It should be noted that a recent discussion at the Berlin Git Merge
> Contributor Summit found widespread support for the addition of Rust to
> Git.  While of course not all contributors were represented, the
> proposal appeared to have the support of a majority of active
> contributors.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> Documentation/Makefile                        |   1 +
> Documentation/technical/platform-support.adoc |   2 +
> Documentation/technical/rust-support.adoc     | 119 ++++++++++++++++++
> 3 files changed, 122 insertions(+)
> create mode 100644 Documentation/technical/rust-support.adoc
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index b109d25e9c80..066b761c01b9 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -127,6 +127,7 @@ TECH_DOCS += technical/parallel-checkout
> TECH_DOCS += technical/partial-clone
> TECH_DOCS += technical/platform-support
> TECH_DOCS += technical/racy-git
> +TECH_DOCS += technical/rust-support
> TECH_DOCS += technical/reftable
> TECH_DOCS += technical/scalar
> TECH_DOCS += technical/send-pack-pipeline
> diff --git a/Documentation/technical/platform-support.adoc b/Documentation/technical/platform-support.adoc
> index 0a2fb28d6277..42b04b186105 100644
> --- a/Documentation/technical/platform-support.adoc
> +++ b/Documentation/technical/platform-support.adoc
> @@ -33,6 +33,8 @@ meet the following minimum requirements:
>
> * Has active security support (taking security releases of dependencies, etc)
>
> +* Supports Rust and the toolchain version specified in link:rust-support.txt[].

s/rust-support.txt/rust-support.adoc/

> +
> These requirements are a starting point, and not sufficient on their own for the
> Git community to be enthusiastic about supporting your platform. Maintainers of
> platforms which do meet these requirements can follow the steps below to make it
> diff --git a/Documentation/technical/rust-support.adoc b/Documentation/technical/rust-support.adoc
> new file mode 100644
> index 000000000000..a63327ebc575
> --- /dev/null
> +++ b/Documentation/technical/rust-support.adoc
> @@ -0,0 +1,119 @@
> +Usage of Rust in Git
> +====================
> +
> +Objective
> +---------
> +Introduce Rust into Git incrementally to improve security and maintainability.
> +
> +Background
> +----------
> +Git has historically been written primarily in C, with some portions in shell,
> +Perl, or other languages.  At the time it was originally written, this was
> +important for portability and was a logical choice for software development.
> +
> +:0: link:https://security.googleblog.com/2024/09/eliminating-memory-safety-vulnerabilities-Android.html
> +:1: link:https://www.cisa.gov/resources-tools/resources/product-security-bad-practices
> +
> +However, as time has progressed, we've seen an increased concern with memory
> +safety vulnerabilities and the development of newer languages, such as Rust,
> +that substantially limit or eliminate this class of vulnerabilities.
> +Development in a variety of projects has found that memory safety
> +vulnerabilities constitute about 70% of vulnerabilities of software in
> +languages that are not memory safe.  For instance, {0}[one survey of Android]
> +found that memory safety vulnerabilities decreased from 76% to 24% over six
> +years due to an increase in memory safe code.  Similarly, the U.S. government
> +is {1}[proposing to classify development in memory unsafe languages as a
> +"Product Security Bad Practice"].
> +
> +These risks are even more substantial when we consider the fact that Git is a
> +network-facing service.  Many organizations run Git servers internally or use a
> +cloud-based forge, and the risk of accidental exposure or compromise of user
> +data is substantial.  It's important to ensure that Git, whether it's used
> +locally or remotely, is robustly secure.
> +
> +In addition, C is a difficult language to write well and concisely.  While it
> +is of course possible to do anything with C, it lacks built-in support for
> +niceties found in modern languages, such as hash tables, generics, typed
> +errors, and automatic destruction, and most modern languages offer shorter, more
> +ergonomic syntax for expressing code.  This is valuable functionality that can
> +allow Git to be developed more rapidly, more easily, by more developers of a
> +variety of levels, and with more confidence in the correctness of the code.
> +
> +For these reasons, adding Rust to Git is a sensible and prudent move that will
> +allow us to improve the quality of the code and potentially attract new developers.
> +
> +Goals
> +-----
> +1. Git continues to build, run, and pass tests on a wide variety of operating
> +   systems and architectures.
> +2. Transition from C to Rust is incremental; that is, code can be ported as it
> +   is convenient and Git does not need to transition all at once.
> +3. Git continues to support older operating systems in conformance with the
> +   platform support policy.
> +
> +Non-Goals
> +---------
> +1. Support for every possible operating system and architecture.  Git already
> +   has a platform support policy which defines what is supported and we already
> +   exclude some operating systems for various reasons (e.g., lacking enough POSIX
> +   tools to pass the test suite).
> +2. Implementing C-only versions of Rust code or compiling a C-only Git.  This
> +   would be difficult to maintain and would not offer the ergonomic benefits we
> +   desire.
> +
> +Design
> +------
> +Git will adopt Rust incrementally.  This transition will start with the
> +creation of a static library that can be linked into the existing Git binaries.
> +At some point, we may wish to expose a dynamic library and compile the Git
> +binaries themselves using Rust.  Using an incremental approach allows us to
> +determine as we go along how to structure our code in the best way for the
> +project and avoids the need to make hard, potentially disruptive, transitions
> +caused by porting a binary wholesale from one language to another that might
> +introduce bugs.
> +
> +We will use the `bindgen` and `cbindgen` crates for handling C-compatible
> +bindings and the `rustix` crate for POSIX-compatible interfaces.  The `libc`
> +crate, which is used by `rustix`, does not expose safe interfaces and does not
> +handle differences between platforms, such as differing 64-bit `stat` call
> +names, and so is less desirable as a target than `rustix`.  We may still choose
> +to use it in some cases if `rustix` does not offer suitable interfaces.
> +
> +Rust upstream releases every six weeks and only supports the latest stable
> +release.  While it is nice that upstream is active, we would like our software
> +releases to have a lifespan exceeding six weeks.  To allow compiling our code
> +on a variety of systems, we will support the version of Rust in Debian stable,
> +plus, for a year after a new Debian stable is released, the version in Debian
> +oldstable.
> +
> +This provides an approximately three-year lifespan of support for a Rust
> +release and allows us to support a variety of operating systems and
> +architectures, including those for which Rust upstream does not build binaries.
> +Debian stable is the benchmark distribution used by many Rust projects when
> +determining supported Rust versions, and it is an extremely portable and
> +popular free software operating system that is available to the public at no
> +charge, which makes it a sensible choice for us as well.
> +
> +We may change this policy if the Rust project issues long-term support releases
> +or the Rust community and distributors agree on releases to target as if they
> +were long-term support releases.
> +
> +This version support policy necessitates that we be very careful about the
> +dependencies we include, since many Rust projects support only the latest
> +stable version.  However, we typically have been careful about dependencies in
> +the first place, so this should not be a major departure from existing policy,
> +although it may be a change for some existing Rust developers.
> +
> +We will avoid including the `Cargo.lock` file in the repository and instead
> +specify minimum dependency versions in the `Cargo.toml` file.  We want to allow
> +people to use newer versions of dependencies if necessary to support newer
> +platforms without needing to force upgrades of dependencies on all users, and
> +it provides additional flexibility for distribution maintainers.
> +
> +We do not plan to support beta or nightly versions of the Rust compiler.  These
> +versions may change rapidly and especially parts of the toolchain such as
> +Clippy, the lint tool, can have false positives or add additional warnings with
> +too great of a frequency to be supportable by the project.  However, we do plan
> +to support alternate compilers, such as the rust_codegen_gcc backend and gccrs
> +when they are stable and support our desired release versions.  This will
> +provide greater support for more operating systems and architectures.
> -- 
> gitgitgadget

best regards

Matthias

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15  1:22   ` [PATCH v2 09/17] Do support Windows again after requiring Rust Johannes Schindelin via GitGitGadget
@ 2025-08-15 17:12     ` Matthias Aßhauer
  2025-08-15 21:48       ` Junio C Hamano
  2025-08-19  2:22       ` Ezekiel Newren
  0 siblings, 2 replies; 198+ messages in thread
From: Matthias Aßhauer @ 2025-08-15 17:12 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren,
	Johannes Schindelin



On Fri, 15 Aug 2025, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> By default, Rust wants to build MS Visual C-compatible libraries on
> Windows, because that is _the_ native C compiler.
>
> Git is historically lacking in its MSVC support, and the official Git
> for Windows versions are built using GCC instead. As a consequence, a
> (subset of a) GCC toolchain is installed as part of the `windows-build`
> job of every CI build.
>
> Naturally, this requires adjustments in how Rust is called, most
> importantly it requires installing support for a GCC-compatible build
> target.
>
> Let's make the necessary adjustment both in the CI-specific code that
> installs Rust as well as in the Windows-specific configuration in
> `config.mak.uname`.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> [en: Moved lib userenv handling to a later patch]
> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
> ---
> ci/install-rust.sh | 3 +++
> config.mak.uname   | 7 +++++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/ci/install-rust.sh b/ci/install-rust.sh
> index 141ceddb17cf..c22baa629ceb 100644
> --- a/ci/install-rust.sh
> +++ b/ci/install-rust.sh
> @@ -28,6 +28,9 @@ if [ "$BITNESS" = "32" ]; then
>   $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
> else
>   $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
> +  if [ "$CI_OS_NAME" = "windows" ]; then
> +    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
> +  fi
> fi
>
> . $CARGO_HOME/env
> diff --git a/config.mak.uname b/config.mak.uname
> index 3e26bb074a4b..a22703284b56 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -727,19 +727,26 @@ ifeq ($(uname_S),MINGW)
> 		prefix = /mingw32
> 		HOST_CPU = i686
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
> +		CARGO_BUILD_TARGET = i686-pc-windows-gnu
>         endif
>         ifeq (MINGW64,$(MSYSTEM))
> 		prefix = /mingw64
> 		HOST_CPU = x86_64
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> +		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu

I've said it when Johannes originally sent this patch[1], but it bears 
repeating: The *-pc-windows-gnu targets will pass CI, but would mean 
raising the required Windows version from 8.1 to 10. We'd want to use
the *-win7-windows-gnu targets[2] to keep Windows 8.1 supported.

[1] 
https://lore.kernel.org/git/pull.1980.git.git.1752784344.gitgitgadget@gmail.com/T/#ma10be2ed0a0e776b0af2fdd0de63d51ba51609e4
[2] 
https://doc.rust-lang.org/nightly/rustc/platform-support/win7-windows-gnu.html

>         else ifeq (CLANGARM64,$(MSYSTEM))
> 		prefix = /clangarm64
> 		HOST_CPU = aarch64
> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> +		CARGO_BUILD_TARGET = aarch64-pc-windows-gnu

As I've also mentioned before [1], this target doesn't seem to exist. The 
correct target seems to be aarch64-pc-windows-gnullvm. [3]

[3] https://doc.rust-lang.org/rustc/platform-support/windows-gnullvm.html

>         else
> 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
> 		BASIC_LDFLAGS += -Wl,--large-address-aware
>         endif
> +
> +	export CARGO_BUILD_TARGET
> +	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
> +
> 	CC = gcc
> 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
> 		-fstack-protector-strong
> -- 
> gitgitgadget

Best regards

Matthias

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 01/17] doc: add a policy for using Rust
  2025-08-15 17:03     ` Matthias Aßhauer
@ 2025-08-15 21:31       ` Junio C Hamano
  2025-08-16  8:06         ` Matthias Aßhauer
  2025-08-19  2:06       ` Ezekiel Newren
  1 sibling, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-08-15 21:31 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: brian m. carlson via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Matthias Aßhauer <mha1993@live.de> writes:

>> diff --git a/Documentation/technical/platform-support.adoc b/Documentation/technical/platform-support.adoc
>> index 0a2fb28d6277..42b04b186105 100644
>> --- a/Documentation/technical/platform-support.adoc
>> +++ b/Documentation/technical/platform-support.adoc
>> @@ -33,6 +33,8 @@ meet the following minimum requirements:
>>
>> * Has active security support (taking security releases of dependencies, etc)
>>
>> +* Supports Rust and the toolchain version specified in link:rust-support.txt[].
>
> s/rust-support.txt/rust-support.adoc/

Your review is very much appreciated, but ...

>> +
>> These requirements are a starting point, and not sufficient on their own for the
>> Git community to be enthusiastic about supporting your platform. Maintainers of
>> platforms which do meet these requirements can follow the steps below to make it

...could you trim your quotes to the relevant parts that are needed to
help readers understand the point?  It is a bit brutal to force
readers to wade through 200 lines of text only to find this "you got
.txt suffix for a document with .adoc suffix" comment.

Thanks.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 17:12     ` Matthias Aßhauer
@ 2025-08-15 21:48       ` Junio C Hamano
  2025-08-15 22:11         ` Johannes Schindelin
                           ` (2 more replies)
  2025-08-19  2:22       ` Ezekiel Newren
  1 sibling, 3 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-15 21:48 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Matthias Aßhauer <mha1993@live.de> writes:

>>         ifeq (MINGW64,$(MSYSTEM))
>> 		prefix = /mingw64
>> 		HOST_CPU = x86_64
>> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
>> +		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
>
> I've said it when Johannes originally sent this patch[1], but it bears
> repeating: The *-pc-windows-gnu targets will pass CI, but would mean
> raising the required Windows version from 8.1 to 10. We'd want to use
> the *-win7-windows-gnu targets[2] to keep Windows 8.1 supported.

It seems that Dscho did not respond on the list to your initial
objection in the discussion you cited.

I do not think we spell out which releases of various platforms are
still supported by us (we do list requirements for platforms in the
Platform Support Policy document, though), but in general we should
not be attempting to give extended support to systems that the
vendor no longer supports.  Windows 8.1 has not been supported by
Microsoft since Jan 2023, and Windows 10 will go out of support in a
few months, after Oct 2025, if I am reading the table correctly.  So as
long as we clearly document our intention of dropping a commercial
system that is no longer supported by its vendor, I do not mind the
above change that discards 8.1 [*].

But I may be biased, as I do not live in the Microsoft ecosystem.


* https://learn.microsoft.com/en-us/lifecycle/products/windows-81
* https://learn.microsoft.com/en-us/lifecycle/products/windows-10-home-and-pro
* https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise-and-education

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 21:48       ` Junio C Hamano
@ 2025-08-15 22:11         ` Johannes Schindelin
  2025-08-15 23:37           ` Junio C Hamano
  2025-08-15 23:37         ` Junio C Hamano
  2025-08-16  8:53         ` Matthias Aßhauer
  2 siblings, 1 reply; 198+ messages in thread
From: Johannes Schindelin @ 2025-08-15 22:11 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthias Aßhauer, Johannes Schindelin via GitGitGadget, git,
	Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Hi,

On Fri, 15 Aug 2025, Junio C Hamano wrote:

> Matthias Aßhauer <mha1993@live.de> writes:
> 
> >>         ifeq (MINGW64,$(MSYSTEM))
> >> 		prefix = /mingw64
> >> 		HOST_CPU = x86_64
> >> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> >> +		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
> >
> > I've said it when Johannes originally sent this patch[1], but it bears
> > repeating: The *-pc-windows-gnu targets will pass CI, but would mean
> > raising the required Windows version from 8.1 to 10. We'd want to use
> > the *-win7-windows-gnu targets[2] to keep Windows 8.1 supported.
> 
> It seems that Dscho did not respond on the list to your initial
> objection in the discussion you cited.

I would have hoped that it is clear by now that Matthias is as much to be
trusted with Git for Windows concerns as I am (just like the other active
Git for Windows contributors, if you can get them onto this here mailing
list). Just in case that it really needs my explicit ACK: What he said
about Windows 8.1 support in Git for Windows is accurate.

> I do not think we spell out which releases of various platforms are
> still supported by us (we do list requirements for platforms in the
> Platform Support Policy document, though), but in general we should
> not be attempting to give extended support to systems that the
> vendor no longer supports.  As Windows 8.1 is no longer supported by
> Microsoft since Jan 2023, and Windows 10 will go out of support in a
> few months after Oct 2025, if I am reading the table correctly, so as
> long as we document our intention of dropping a commercial system
> that is no longer supported by its vendor clearly, I do not mind the
> above that discards 8.1 [*].

While there is obviously some connection between the official EOL of
Windows versions (see https://endoflife.date/windows) and which versions
Git for Windows supports, the balance we try to strike (and by "we" I
don't apply the pluralis majestatis, it is very much a consensus between
all active Git for Windows contributors, including Matthias and myself) is
to support older Windows versions as much as can be done with a reasonable
amount of effort (where "reasonable" is obviously as subjective as the
definition of "taste").

The consensus of what Windows versions can be reasonably supported is
documented at https://gitforwindows.org/requirements.html#windows-version.
Currently that is — you may have guessed it — as Matthias has stated: Git
for Windows will support Windows 8.1 for the time being. The hope is that
we will be able to notify users when support for that version will be
phased out well in advance, much as we did for Windows 7 and 8, where
deprecation notices were included in the release notes of several Git for
Windows versions prior to v2.46.2, which was the last Git for Windows
version to support Windows 7 and 8.

> But I may be biased, as I do not live in the Microsoft ecosystem.

You do point that out frequently, so I believe that you made the point.

Personally, I would like to see a more open-minded approach here.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 21:48       ` Junio C Hamano
  2025-08-15 22:11         ` Johannes Schindelin
@ 2025-08-15 23:37         ` Junio C Hamano
  2025-08-16  8:53         ` Matthias Aßhauer
  2 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-15 23:37 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Junio C Hamano <gitster@pobox.com> writes:

> I do not think we spell out which releases of various platforms are
> still supported by us (we do list requirements for platforms in the
> Platform Support Policy document, though), but in general we should
> not be attempting to give extended support to systems that the
> vendor no longer supports.
> ... so as
> long as we document our intention of dropping a commercial system
> that is no longer supported by its vendor clearly, I do not mind the
> above that discards 8.1 [*].

Apologies to authors of the PSP document.  We do have this as part
of "minimum requirement":

 * Has active security support (taking security releases of dependencies, etc)

So, being implicit about dropping 8.1, while it may be less nice
than we could be, is perhaps fine.

If we wanted to support slightly older releases that are still
widely used, that is fine as well.  I didn't check what additional
documents and policies the GfW (Git for Windows) project has about
this issue, so perhaps it is all documented there, in which case our
PSP document is fine as-is, too.  In other words, if the GfW project
takes responsibility for supporting ports to an older release that is
out of support, what our PSP document says does not matter.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 22:11         ` Johannes Schindelin
@ 2025-08-15 23:37           ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-15 23:37 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Matthias Aßhauer, Johannes Schindelin via GitGitGadget, git,
	Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> But I may be biased, as I do not live in the Microsoft ecosystem.
>
> You do point that out frequently, so I believe that you made the point.
>
> Personally, I would like to see a more open-minded approach here.

I gave it as an explanation for the reason why my conclusion may be
different from what those in the Microsoft ecosystem decided to keep
supporting, and I have no reason to object to what they want to do.

It has nothing to do with open-mindedness and such a comment was
uncalled for.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 01/17] doc: add a policy for using Rust
  2025-08-15 21:31       ` Junio C Hamano
@ 2025-08-16  8:06         ` Matthias Aßhauer
  0 siblings, 0 replies; 198+ messages in thread
From: Matthias Aßhauer @ 2025-08-16  8:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: brian m. carlson via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren



On Fri, 15 Aug 2025, Junio C Hamano wrote:

> ...could you trim your quotes to the relevant parts that are needed to
> help readers understand the point?  It is a bit brutal to force
> readers to wade through 200 lines of text only to find this "you got
> .txt suffix for a document with .adoc suffix" comment.
>
> Thanks.
>

Sorry, I'll try to be more mindful of trimming my mails.

Best regards

Matthias

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 21:48       ` Junio C Hamano
  2025-08-15 22:11         ` Johannes Schindelin
  2025-08-15 23:37         ` Junio C Hamano
@ 2025-08-16  8:53         ` Matthias Aßhauer
  2025-08-17 15:57           ` Junio C Hamano
  2 siblings, 1 reply; 198+ messages in thread
From: Matthias Aßhauer @ 2025-08-16  8:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren



On Fri, 15 Aug 2025, Junio C Hamano wrote:

> Matthias Aßhauer <mha1993@live.de> writes:
>
>>>         ifeq (MINGW64,$(MSYSTEM))
>>> 		prefix = /mingw64
>>> 		HOST_CPU = x86_64
>>> 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
>>> +		CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
>>
>> I've said it when Johannes originally sent this patch[1], but it bears
>> repeating: The *-pc-windows-gnu targets will pass CI, but would mean
>> raising the required Windows version from 8.1 to 10. We'd want to use
>> the *-win7-windows-gnu targets[2] to keep Windows 8.1 supported.
>
> It seems that Dscho did not respond on the list to your initial
> objection in the discussion you cited.

He didn't, but from various interactions surrounding Git for Windows, I do 
think he's currently in favour of keeping Windows 8.1 supported in Git for 
Windows.

> I do not think we spell out which releases of various platforms are
> still supported by us (we do list requirements for platforms in the
> Platform Support Policy document, though),

We don't do that in git.git, no. Git for Windows very explicitly spells 
out which versions of Windows are supported (though usually we just 
mention the Desktop versions and imply the corresponding Windows Server 
versions). Since 2.47.0 that is Windows 8.1 and newer Desktop releases [1] 
(Windows 11 on ARM64). We even tend to announce in advance when we intend 
to drop support for a Windows version.

[1] 
https://gitforwindows.org/faq.html#which-versions-of-windows-are-supported

> but in general we should not be attempting to give extended support to
> systems that the vendor no longer supports.  As Windows 8.1 is no longer
> supported by Microsoft since Jan 2023, and Windows 10 will go out of
> support in a few months after Oct 2025, if I am reading the table correctly,

Git for Windows has historically supported Windows versions well beyond
their official end-of-support dates:

* XP was supported for 2 years beyond the official extended EOL. [1][2]
* Vista was supported for 5 years beyond the official extended EOL [1][3]
* 7 was supported for 4 years beyond the official extended EOL [1][4]
* 8 was supported for 8 years beyond the official extended EOL [1][5]

git.git has historically roughly followed Git for Windows in this.

You're reading the tables correctly, but there are so called LTSC releases 
of Windows 10 with support until 2026/2029. [6]

[2] https://learn.microsoft.com/en-us/lifecycle/products/windows-xp
[3] https://learn.microsoft.com/en-us/lifecycle/products/windows-vista
[4] https://learn.microsoft.com/en-us/lifecycle/products/windows-7
[5] https://learn.microsoft.com/en-us/lifecycle/products/windows-8
[6] 
https://learn.microsoft.com/en-us/windows/whats-new/ltsc/whats-new-windows-10-2021#lifecycle

> so as long as we document our intention of dropping a commercial system
> that is no longer supported by its vendor clearly, I do not mind the
> above that discards 8.1 [*].

I'm not completely opposed, but I do think it should be a conscious
decision and not an unintended side effect of some change that our CI
didn't catch.

Best regards

Matthias

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-16  8:53         ` Matthias Aßhauer
@ 2025-08-17 15:57           ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-17 15:57 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Matthias Aßhauer <mha1993@live.de> writes:

>> It seems that Dscho did not respond on the list to your initial
>> objection in the discussion you cited.
>
> He didn't, but from various interactions surrounding Git for Windows,
> I do think he's currently in favour of keeping Windows 8.1 supported
> in Git for Windows.

OK, as long as folks with stakes in Git for Windows are in
agreement, I have no problem (except that in principle we should
avoid doing disservice to the end user population by doing things
that encourage their prolonged use of out-of-security-support
platforms).

> We don't do that in git.git, no. Git for Windows very explicitly
> spells out which versions of Windows are supported (though usually we
> just mention the Desktop versions and imply the corresponding Windows
> Server versions).

Yup, thanks for clarifying it to me.  Could you do the same for
future readers of the updated version of the commit 09/17 by telling
the author about that in your review comment, so that the log
message can talk about the reasons why a specific CARGO_BUILD_TARGET
was chosen (e.g. "as described in Git for Windows documentation at
$URL, we support Windows versions X or newer, so we use this cargo
build target to ensure we still work with that version")?

> I'm not completely opposed, but I do think it should be a conscious
> decision and not an unintended side effect of some change that our CI
> didn't catch.

Oh, absolutely.

Thanks for clarification.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (17 preceding siblings ...)
  2025-08-15 15:07   ` [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification Ramsay Jones
@ 2025-08-18 22:31   ` Junio C Hamano
  2025-08-18 23:52     ` Ben Knoble
  2025-08-19  1:52     ` Elijah Newren
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
  19 siblings, 2 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-18 22:31 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  * Code style: Should we adopt a Rust code style of some sort? Perhaps have
>    the code always be formatted by rustfmt in its default configuration?

Sounds sensible.  I'll let folks with more Rust inclination
figure out what _the_ style should be, but having _a_ style we all
stick to is good.

>  * Rust version: We are not using the same Rust version on all platforms in
>    CI; 32-bit builds and Windows builds require a newer Rust version to
>    successfully build.

As long as we do not have to bend over backwards in the code with "if
using version X or older, use this alternative codepath" all over
the place, "pick a version that works on each platform" that results
in "due to the quality of ports, some platform's older port is
unusable and newer version is required" is not too bad, especially
for a system that is still rapidly getting improved and a bit on the
unstable side, I think.

>  * Performance with whitespace flags: I originally intended to leave out the
>    whitespace handling because I knew it was slower,...

If the Rust guinea pig were different from how each line is hashed
in xdiff, which is targeted by the am/xdiff-hash-tweak topic, then we
could leave out the whitespace-ignoring hashing from this topic
altogether.

Quite honestly, I do not like throwing away the other optimization
efforts that can be reviewed and integrated trivially, but it is
practically impossible to integrate them while still having a "let's
start playing with Rust" topic that targets exactly the same area.
Yes, this topic licked the same corner of the cake first, but still,
I was hoping that the second iteration of this series would use a
different code path as a Rust guinea pig.

After all, the primary objective of our first Rust topic is to set
the framework right (like the platform and version support policies,
how foreign interfaces like type systems get impedance-matched, what
the impact to our build infrastructure looks like, etc.).  It would be a
huge plus if it can at the same time demonstrate how much safer code
we can write with less effort if we switched to writing some (and
gradually larger, possibly) parts in the language.

The result this cover letter has in its title, 'accelerate xdiff',
is not primarily due to use of Rust, is it?  As the other topic
demonstrates, it is to use an implementation of a faster hash
function (we can consider it to be an impressive technology
demonstration that a rust reimplementation of original C code can be
done in a very performant way).  And nobody is expecting that we
would be using Rust for speed anyway, no?
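
For context, the cover letter of this series describes the per-line
hash currently used by xdiff as DJB2a.  A minimal sketch of that hash
in Rust follows, assuming a line is handed over as a byte slice.  The
name `djb2a_line_hash` is made up for illustration; this is not the
code from this series or from am/xdiff-hash-tweak.

```rust
/// DJB2a over one line: start at 5381, then for every byte multiply
/// the running value by 33 and XOR the byte in.  Overflow wraps, which
/// mirrors unsigned arithmetic in C.
/// Hypothetical helper for illustration only.
pub fn djb2a_line_hash(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}
```

The function is a tight loop with no allocation in either language, so
any speedup comes from picking a hash that does less work per byte or
collides less often, not from the language it is written in.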

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-18 22:31   ` Junio C Hamano
@ 2025-08-18 23:52     ` Ben Knoble
  2025-08-19  1:52     ` Elijah Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Ben Knoble @ 2025-08-18 23:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ezekiel Newren


> Le 18 août 2025 à 18:31, Junio C Hamano <gitster@pobox.com> a écrit :
> 
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> * Code style: Should we adopt a Rust code style of some sort? Perhaps have
>>   the code always be formatted by rustfmt in its default configuration?
> 
> Sounds sensible.  I'll let folks with more Rust inclination
> figure out what _the_ style should be, but having _a_ style we all
> stick to is good.

While there can be room for configuring the formatter if we have particularly idiosyncratic needs, I’d second going with cargo fmt / rustfmt in default configurations to start (the former is a shortcut for the latter over a whole crate AIUI).

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-18 22:31   ` Junio C Hamano
  2025-08-18 23:52     ` Ben Knoble
@ 2025-08-19  1:52     ` Elijah Newren
  2025-08-19  9:47       ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-08-19  1:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ezekiel Newren via GitGitGadget, git, brian m. carlson,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

On Mon, Aug 18, 2025 at 3:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >  * Code style: Should we adopt a Rust code style of some sort? Perhaps have
> >    the code always be formatted by rustfmt in its default configuration?
>
> Sounds sensible.  I'll let folks with more Rust inclination
> figure out what _the_ style should be, but having _a_ style we all
> stick to is good.
>
> >  * Rust version: We are not using the same Rust version on all platforms in
> >    CI; 32-bit builds and Windows builds require a newer Rust version to
> >    successfully build.
>
> As long as we do not have to bend over backwards in the code with "if
> using version X or older, use this alternative codepath" all over
> the place, "pick a version that works on each platform" that results
> in "due to the quality of ports, some platform's older port is
> unusable and newer version is required" is not too bad, especially
> for a system that is still rapidly getting improved and a bit on the
> unstable side, I think.
>
> >  * Performance with whitespace flags: I originally intended to leave out the
> >    whitespace handling because I knew it was slower,...
>
> If the Rust guinea pig were different from how each line is hashed
> in xdiff, which is targeted by the am/xdiff-hash-tweak topic, then we
> could leave out the whitespace-ignoring hashing from this topic
> altogether.
>
> Quite honestly, I do not like throwing away the other optimization
> efforts that can be reviewed and integrated trivially, but it is
> practically impossible to integrate them while still having a "let's
> start playing with Rust" topic that targets exactly the same area.
> Yes, this topic licked the same corner of the cake first, but still,
> I was hoping that the second iteration of this series would use a
> different code path as a Rust guinea pig.

I agree it would be nice to merge those down first.  One possibility
here would be having Ezekiel rebase his work on top of
am/xdiff-hash-tweak...

> After all, the primary objective of our first Rust topic is to set
> the framework right (like the platform and version support policies,
> how foreign interfaces like type systems get impedance-matched, what
> the impact to our build infrastructure looks like, etc.).  It would be a
> huge plus if it can at the same time demonstrate how much safer code
> we can write with less effort if we switched to writing some (and
> gradually larger, possibly) parts in the language.
>
> The result this cover letter has in its title, 'accelerate xdiff',
> is not primarily due to use of Rust, is it?  As the other topic
> demonstrates, it is to use an implementation of a faster hash
> function (we can consider it to be an impressive technology
> demonstration that a rust reimplementation of original C code can be
> done in a very performant way).  And nobody is expecting that we
> would be using Rust for speed anyway, no?

You are correct that it is not due to Rust.  My original objective
that I tried to trick/coax/nerd-snipe/whatever Ezekiel into looking
into was cleaning up xdiff, to allow various features in `git replay`
and rebasing merges.  Ezekiel was interested in Rust and in a
challenge.  xdiff is quite a knot to untangle, and Ezekiel's been at
work on it for quite some time.  But, he happened to notice this
speedup, and found a way to turn it into a short series without all
his other patches.

We could perhaps shift focus, but I'm curious if you're wanting the
xdiff work to be thrown away or shelved in favor of some completely
different area of the code, or if perhaps some other aspects of the
xdiff code would still be amenable.

One big challenge in finding another area, whether in xdiff or
elsewhere, is that Ezekiel really wants to showcase how nice Rust's
unittesting is, but that only works if we start at a low-level and
build up.  If we make Rust code call C code, that'd either not be
readily unit-testable, or would require us stubbing out the entire
implementation behind that C interface or doing something more
complex.
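
To make the unit-testing point concrete, here is a rough sketch of the
kind of low-level, C-free building block that can be tested in
isolation.  `split_lines` is a hypothetical helper, not a function from
the series; it only assumes that a file is a byte buffer and that each
record keeps its trailing newline.

```rust
/// Split a buffer into lines, each keeping its terminating b'\n'
/// (the last line may lack one).  Hypothetical helper for illustration.
pub fn split_lines(mut buf: &[u8]) -> Vec<&[u8]> {
    let mut lines = Vec::new();
    while !buf.is_empty() {
        // One past the next newline, or the end of the buffer.
        let end = match buf.iter().position(|&b| b == b'\n') {
            Some(i) => i + 1,
            None => buf.len(),
        };
        let (line, rest) = buf.split_at(end);
        lines.push(line);
        buf = rest;
    }
    lines
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn keeps_newlines_and_handles_missing_final_newline() {
        let lines = split_lines(b"a\nbc");
        assert_eq!(lines, vec![&b"a\n"[..], &b"bc"[..]]);
    }

    #[test]
    fn empty_input_has_no_lines() {
        assert!(split_lines(b"").is_empty());
    }
}
```

Because the function touches no C symbols, `cargo test` can exercise it
without stubbing anything out, which is the property that gets lost once
Rust code starts calling back into C.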

What if Ezekiel rebased his series on am/xdiff-hash-tweak, and then
instead of further modifying the hashing in the first series, he:
  - introduced brian's patch with the platform support
  - set up the CI builds to test building with Rust (including Johannes' patches)
  - started working on transitioning xdfile_t data structure to be FFI friendly
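
As a very rough illustration of what "FFI friendly" could mean for the
last item: the structs below are hypothetical and do NOT reflect the
real xdfile_t layout.  They only show the general shape, namely
`#[repr(C)]` structs built from pointers and fixed-width integers, plus
a thin wrapper that turns the raw pointer into a slice.

```rust
use std::slice;

/// Hypothetical per-line record; field names are made up.
#[repr(C)]
pub struct LineRecord {
    pub ptr: *const u8, // start of the line in the file buffer
    pub len: usize,     // length in bytes, including the newline
    pub hash: u64,      // precomputed line hash
}

/// Hypothetical file view that C and Rust could both read.
#[repr(C)]
pub struct FileRecords {
    pub records: *const LineRecord,
    pub nrec: usize,
}

impl FileRecords {
    /// View the records as a Rust slice.
    ///
    /// # Safety
    /// `records` must either be null or point to `nrec` initialized
    /// `LineRecord`s that stay alive and unmodified for the lifetime
    /// of the returned slice.
    pub unsafe fn as_slice(&self) -> &[LineRecord] {
        if self.records.is_null() || self.nrec == 0 {
            &[]
        } else {
            unsafe { slice::from_raw_parts(self.records, self.nrec) }
        }
    }
}
```

Keeping the unsafe part in one small accessor like this means the rest
of the Rust side can work with ordinary slices.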

One issue here is that it probably wouldn't be too long before we'd
want to rip out the xdlclassifier struct (mostly a glorified
hashtable), which is kind of tied up in a knot with the hashing and
line equality, so it would probably only be a few more series down the
road before we'd want to start tweaking the code in
am/xdiff-hash-tweak to make use of the new data structures.  Would
that be agreeable?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-15 15:07   ` [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification Ramsay Jones
@ 2025-08-19  2:00     ` Elijah Newren
  2025-08-24 16:52       ` Patrick Steinhardt
  0 siblings, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-08-19  2:00 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Ezekiel Newren via GitGitGadget, git

On Fri, Aug 15, 2025 at 8:10 AM Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
>
> On 15/08/2025 02:22, Ezekiel Newren via GitGitGadget wrote:
> > Changes in this second round of this RFC:
> >
> >  * Now builds and passes tests on all platforms (example run:
> >    https://github.com/ezekielnewren/git/actions/runs/16974821401). Special
> >    thanks to Johannes Schindelin for patches to things for Windows and
> >    linux32.
>
> Hmm, builds on *all* platforms may be a bit optimistic (it doesn't on
> cygwin, for instance), so I'm guessing you mean all platforms which
> have CI defined. Perhaps you could mention the platforms which you
> have tested on. :)

Ezekiel says this email didn't show up in his inbox (no idea why), but
yes what was meant was all platforms where gitgitgadget CI runs.  If
you follow the github.com link in the text that you quoted, you can
see all those platforms (various windows flavors, various osx builds,
musl, sparse, static analysis, etc.).

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 01/17] doc: add a policy for using Rust
  2025-08-15 17:03     ` Matthias Aßhauer
  2025-08-15 21:31       ` Junio C Hamano
@ 2025-08-19  2:06       ` Ezekiel Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-19  2:06 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: brian m. carlson via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble

On Fri, Aug 15, 2025 at 11:03 AM Matthias Aßhauer <mha1993@live.de> wrote:
>
>> +* Supports Rust and the toolchain version specified in link:rust-support.txt[].
>
> s/rust-support.txt/rust-support.adoc/

Thanks for spotting that, I'll fix it up.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v2 09/17] Do support Windows again after requiring Rust
  2025-08-15 17:12     ` Matthias Aßhauer
  2025-08-15 21:48       ` Junio C Hamano
@ 2025-08-19  2:22       ` Ezekiel Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-19  2:22 UTC (permalink / raw)
  To: Matthias Aßhauer
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, Ben Knoble

On Fri, Aug 15, 2025 at 11:13 AM Matthias Aßhauer <mha1993@live.de> wrote:
> > diff --git a/ci/install-rust.sh b/ci/install-rust.sh
> > index 141ceddb17cf..c22baa629ceb 100644
> > --- a/ci/install-rust.sh
> > +++ b/ci/install-rust.sh
> > @@ -28,6 +28,9 @@ if [ "$BITNESS" = "32" ]; then
> >   $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
> > else
> >   $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
> > +  if [ "$CI_OS_NAME" = "windows" ]; then
> > +    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
> > +  fi
> > fi
> >
> > . $CARGO_HOME/env
> > diff --git a/config.mak.uname b/config.mak.uname
> > index 3e26bb074a4b..a22703284b56 100644
> > --- a/config.mak.uname
> > +++ b/config.mak.uname
> > @@ -727,19 +727,26 @@ ifeq ($(uname_S),MINGW)
> >               prefix = /mingw32
> >               HOST_CPU = i686
> >               BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
> > +             CARGO_BUILD_TARGET = i686-pc-windows-gnu
> >         endif
> >         ifeq (MINGW64,$(MSYSTEM))
> >               prefix = /mingw64
> >               HOST_CPU = x86_64
> >               BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> > +             CARGO_BUILD_TARGET = x86_64-pc-windows-gnu
>
> I've said it when Johannes originally sent this patch[1], but it bears
> repeating: The *-pc-windows-gnu targets will pass CI, but would mean
> raising the required Windows version from 8.1 to 10. We'd want to use
> the *-win7-windows-gnu targets[2] to keep Windows 8.1 supported.
>
> [1]
> https://lore.kernel.org/git/pull.1980.git.git.1752784344.gitgitgadget@gmail.com/T/#ma10be2ed0a0e776b0af2fdd0de63d51ba51609e4
> [2]
> https://doc.rust-lang.org/nightly/rustc/platform-support/win7-windows-gnu.html
>
> >         else ifeq (CLANGARM64,$(MSYSTEM))
> >               prefix = /clangarm64
> >               HOST_CPU = aarch64
> >               BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
> > +             CARGO_BUILD_TARGET = aarch64-pc-windows-gnu
>
> As I've also mentioned before [1], this target doesn't seem to exist. The
> correct target seems to be aarch64-pc-windows-gnullvm. [3]
>
> [3] https://doc.rust-lang.org/rustc/platform-support/windows-gnullvm.html

I'll be happy to make that change for the next round.
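
A minimal sketch of the target selection suggested above, written in Rust only for illustration (the series itself sets CARGO_BUILD_TARGET in config.mak.uname). The function name is hypothetical, and the MINGW32 key is an assumption inferred from the /mingw32 prefix in the quoted hunk. The triples are the ones Matthias proposes; whether each can be installed through rustup is not something this sketch checks.

```rust
/// Map an MSYS2 `MSYSTEM` value to the Rust target triple proposed in
/// the review. Hypothetical helper, not part of the series.
pub fn cargo_build_target(msystem: &str) -> Option<&'static str> {
    match msystem {
        // win7 targets, suggested so that Windows 8.1 stays supported.
        "MINGW32" => Some("i686-win7-windows-gnu"),
        "MINGW64" => Some("x86_64-win7-windows-gnu"),
        // aarch64-pc-windows-gnu does not exist; gnullvm is the LLVM-based one.
        "CLANGARM64" => Some("aarch64-pc-windows-gnullvm"),
        // Unknown environments: let the caller decide what to do.
        _ => None,
    }
}
```

Returning None for unknown environments leaves the default host target in place rather than guessing.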

* Re: [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-19  1:52     ` Elijah Newren
@ 2025-08-19  9:47       ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-19  9:47 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Ezekiel Newren via GitGitGadget, git, brian m. carlson,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ezekiel Newren

Elijah Newren <newren@gmail.com> writes:

> What if Ezekiel rebased his series on am/xdiff-hash-tweak, and then
> instead of further modifying the hashing in the first series, he:
>   - introduced brian's patch with the platform support
>   - set up the CI builds to test building with Rust (including Johannes' patches)
>   - started working on transitioning xdfile_t data structure to be FFI friendly

Yup, that matches my understanding of what our first Rust topic
would want to achieve, i.e. get the framework right.

> One issue here is that it probably wouldn't be too long before we'd
> want to rip out the xdlclassifier struct (mostly a glorified
> hashtable), which is kind of tied up in a knot with the hashing and
> line equality, so it would probably only be a few more series down the
> road before we'd want to start tweaking the code in
> am/xdiff-hash-tweak to make use of the new data structures.

I understand that at this point we do not expect to import any
(security or otherwise) fixes to xdiff code from "upstream", as we
are practically the upstream for other folks?  For our consumption,
that would allow us to take a quite different stance from our
historical attitude, which was to keep the modification to the
minimum, and apply whatever clean-ups and optimizations only to suit
our needs.  So what you outline does make certain sense to me.

I am however not sure if we owe anything to our downstream projects,
though (e.g., I understand that libgit2 extracted xdiff part from
our source, so if we have serious security fixes in ours, they would
want to be able to import them?).

Thanks.  

* [PATCH v3 00/15] RFC: Cleanup xdiff and begin its rustification
  2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
                     ` (18 preceding siblings ...)
  2025-08-18 22:31   ` Junio C Hamano
@ 2025-08-23  3:55   ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 01/15] doc: add a policy for using Rust brian m. carlson via GitGitGadget
                       ` (14 more replies)
  19 siblings, 15 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren

In order to facilitate the quick adoption of am/xdiff-hash-tweak, this round
drops the changes to hashing in xdiff and instead modifies another part of
xdiff.

A high level overview of v3:

 * patch 1: add a policy for using Rust (brian's patch, with a small tweak)
 * patch 2: introduce Rust to the codebase
 * patches 3-5: adapt CI (github workflows) to build Git with Rust
 * patch 6: introduce the ivec type
 * patches 7-14: xdiff code cleanup in preparation for translating to Rust
 * patch 15: translate a C function into Rust and call it from C
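
For readers unfamiliar with the pattern behind the last item, here is a minimal sketch of a Rust function exposed with the C ABI. It is not the code from patch 15: the function names and signature are hypothetical, and the logic (counting a shared prefix) is only loosely in the spirit of trimming common ends.

```rust
use std::slice;

/// Safe core logic: count how many leading elements two slices share.
pub fn common_prefix_len(a: &[u64], b: &[u64]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// C-ABI wrapper around `common_prefix_len` (hypothetical name).
///
/// A real export would also carry a no_mangle attribute so that the C
/// linker can find the symbol by name; it is left out here so the sketch
/// compiles unchanged on any Rust edition. An assumed C prototype:
///
///   usize demo_common_prefix_len(const u64 *a, usize a_len,
///                                const u64 *b, usize b_len);
///
/// # Safety
/// `a` and `b` must each be null or point to at least `a_len` / `b_len`
/// readable, properly aligned u64 values.
pub unsafe extern "C" fn demo_common_prefix_len(
    a: *const u64,
    a_len: usize,
    b: *const u64,
    b_len: usize,
) -> usize {
    if a.is_null() || b.is_null() {
        return 0;
    }
    // Rebuild slices from the raw parts handed over by the C caller.
    let a = unsafe { slice::from_raw_parts(a, a_len) };
    let b = unsafe { slice::from_raw_parts(b, b_len) };
    common_prefix_len(a, b)
}
```

Keeping the logic in a safe function and the pointer handling in a thin wrapper lets the Rust unit tests exercise the former without any unsafe code.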

I'm particularly interested in what folks think of the new ivec type for
sharing data across the language barrier. Thoughts?
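
To make the question concrete, here is a rough sketch of the general idea: a vector with a fixed, C-compatible layout. This is not the ivec from patch 6. The type name, field names, and u8 element type are assumptions for illustration, and unlike a real interop type this one relies on Rust's allocator, so the C side may read the buffer but must not free or reallocate it.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical C-compatible vector of bytes: pointer, length, capacity.
/// `repr(C)` fixes the field order and layout so a matching C struct
/// can describe the same memory.
#[repr(C)]
pub struct DemoIVec {
    pub ptr: *mut u8,
    pub length: usize,
    pub capacity: usize,
}

impl DemoIVec {
    /// Take ownership of a Vec's buffer without freeing it.
    pub fn from_vec(v: Vec<u8>) -> Self {
        // ManuallyDrop keeps the allocation alive after `v` goes away.
        let mut v = ManuallyDrop::new(v);
        DemoIVec {
            ptr: v.as_mut_ptr(),
            length: v.len(),
            capacity: v.capacity(),
        }
    }

    /// Borrow the contents as a slice.
    pub fn as_slice(&self) -> &[u8] {
        if self.ptr.is_null() {
            &[]
        } else {
            // Valid as long as the fields still describe a live buffer.
            unsafe { std::slice::from_raw_parts(self.ptr, self.length) }
        }
    }

    /// Turn the raw parts back into a Vec so the buffer gets freed.
    ///
    /// # Safety
    /// The fields must be exactly those produced by `from_vec`, with the
    /// buffer not freed or reallocated in the meantime.
    pub unsafe fn into_vec(self) -> Vec<u8> {
        unsafe { Vec::from_raw_parts(self.ptr, self.length, self.capacity) }
    }
}
```

A type that both languages can grow or free would need a shared allocator (for example malloc/realloc on both sides), which this sketch deliberately sidesteps.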

Build results for these changes:
https://github.com/git/git/actions/runs/17170212383

Links to older versions, which focused on hashing in xdiff:

 * v1:
   https://lore.kernel.org/git/pull.1980.git.git.1752784344.gitgitgadget@gmail.com/
 * v2:
   https://lore.kernel.org/git/pull.1980.v2.git.git.1755220973.gitgitgadget@gmail.com/

Ezekiel Newren (13):
  xdiff: introduce rust
  github workflows: install rust
  github workflows: upload Cargo.lock
  ivec: create a vector type that is interoperable between C and Rust
  xdiff/xprepare: remove superfluous forward declarations
  xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  xdiff: make fields of xrecord_t Rust friendly
  xdiff: use one definition for freeing xdfile_t
  xdiff: replace chastore with an ivec in xdfile_t
  xdiff: delete nrec field from xdfile_t
  xdiff: delete recs field from xdfile_t
  xdiff: make xdfile_t more rust friendly
  xdiff: implement xdl_trim_ends() in Rust

Johannes Schindelin (1):
  win+Meson: do allow linking with the Rust-built xdiff

brian m. carlson (1):
  doc: add a policy for using Rust

 .github/workflows/main.yml                    |  89 +++-
 .gitignore                                    |   3 +
 Documentation/Makefile                        |   1 +
 Documentation/technical/platform-support.adoc |   2 +
 Documentation/technical/rust-support.adoc     | 142 ++++++
 Makefile                                      |  69 ++-
 build_rust.sh                                 |  57 +++
 ci/install-dependencies.sh                    |  14 +-
 ci/install-rust-toolchain.sh                  |  30 ++
 ci/install-rustup.sh                          |  25 +
 ci/lib.sh                                     |   1 +
 ci/make-test-artifacts.sh                     |   9 +
 ci/run-build-and-tests.sh                     |  13 +
 config.mak.uname                              |   4 +
 git-compat-util.h                             |  17 +
 interop/ivec.c                                | 151 ++++++
 interop/ivec.h                                |  52 ++
 meson.build                                   |  54 +-
 rust/Cargo.toml                               |   6 +
 rust/interop/Cargo.toml                       |  14 +
 rust/interop/src/ivec.rs                      | 462 ++++++++++++++++++
 rust/interop/src/lib.rs                       |  10 +
 rust/xdiff/Cargo.toml                         |  15 +
 rust/xdiff/src/lib.rs                         |  15 +
 rust/xdiff/src/xprepare.rs                    |  27 +
 rust/xdiff/src/xtypes.rs                      |  19 +
 xdiff/xdiffi.c                                |  60 +--
 xdiff/xdiffi.h                                |   8 +-
 xdiff/xemit.c                                 |  24 +-
 xdiff/xhistogram.c                            |   2 +-
 xdiff/xmerge.c                                |  72 +--
 xdiff/xpatience.c                             |  16 +-
 xdiff/xprepare.c                              | 271 ++++------
 xdiff/xtypes.h                                |  27 +-
 xdiff/xutils.c                                |  12 +-
 35 files changed, 1474 insertions(+), 319 deletions(-)
 create mode 100644 Documentation/technical/rust-support.adoc
 create mode 100755 build_rust.sh
 create mode 100755 ci/install-rust-toolchain.sh
 create mode 100755 ci/install-rustup.sh
 create mode 100644 interop/ivec.c
 create mode 100644 interop/ivec.h
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/ivec.rs
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs
 create mode 100644 rust/xdiff/src/xprepare.rs
 create mode 100644 rust/xdiff/src/xtypes.rs


base-commit: 16bd9f20a403117f2e0d9bcda6c6e621d3763e77
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1980%2Fezekielnewren%2Fxdiff_rust_speedup-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1980/ezekielnewren/xdiff_rust_speedup-v3
Pull-Request: https://github.com/git/git/pull/1980

Range-diff vs v2:

  1:  75dfb40ead3 !  1:  6d065f550fe doc: add a policy for using Rust
     @@ Commit message
          contributors.
      
          Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
     +    [en: Added some comments about types, and changed the recommendations
     +         about cbindgen, bindgen, rustix, libc.]
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Documentation/Makefile ##
     @@ Documentation/technical/platform-support.adoc: meet the following minimum requir
       
       * Has active security support (taking security releases of dependencies, etc)
       
     -+* Supports Rust and the toolchain version specified in link:rust-support.txt[].
     ++* Supports Rust and the toolchain version specified in link:rust-support.adoc[].
      +
       These requirements are a starting point, and not sufficient on their own for the
       Git community to be enthusiastic about supporting your platform. Maintainers of
     @@ Documentation/technical/rust-support.adoc (new)
      +caused by porting a binary wholesale from one language to another that might
      +introduce bugs.
      +
     -+We will use the `bindgen` and `cbindgen` crates for handling C-compatible
     -+bindings and the `rustix` crate for POSIX-compatible interfaces.  The `libc`
     -+crate, which is used by `rustix`, does not expose safe interfaces and does not
     -+handle differences between platforms, such as differing 64-bit `stat` call
     -+names, and so is less desirable as a target than `rustix`.  We may still choose
     -+to use it in some cases if `rustix` does not offer suitable interfaces.
     ++Crates like libc or rustix define types like c_long, but in ways that are not
     ++safe across platforms.
     ++From https://docs.rs/rustix/latest/rustix/ffi/type.c_long.html:
     ++
     ++    This type will always be i32 or i64.  Most notably, many Linux-based
     ++    systems assume an i64, but Windows assumes i32.  The C standard technically
     ++    only requires that this type be a signed integer that is at least 32 bits
     ++    and at least the size of an int, although in practice, no system would
     ++    have a long that is neither an i32 nor i64.
     ++
     ++Also, note that other locations, such as
     ++https://docs.rs/libc/latest/libc/type.c_long.html, just hardcode c_long as i64
     ++even though C may mean i32 on some platforms.
     ++
     ++As such, using the c_long type would give us portability issues, and
     ++perpetuate some of the bugs git has faced across platforms.  Avoid using C's
     ++types (long, unsigned, char, etc.), and switch to unambiguous types (e.g. i32
     ++or i64) before trying to make C and Rust interoperate.
     ++
     ++Crates like libc and rustix may have also traditionally aided interoperability
     ++with older versions of Rust (e.g. when worrying about stat[64] system calls),
     ++but the Rust standard library in newer versions of Rust handles these concerns
     ++in a platform agnostic way.  There may arise cases where we need to consider
     ++these crates, but for now we omit them.
     ++
     ++Tools like bindgen and cbindgen create C-styled unsafe Rust code rather than
     ++idiomatic Rust; where possible, we prefer to switch to idiomatic Rust.  Any
     ++standard C library functions that are needed can be manually wrapped on the
     ++Rust side.
      +
      +Rust upstream releases every six weeks and only supports the latest stable
      +release.  While it is nice that upstream is active, we would like our software
  2:  7709e5eddba <  -:  ----------- xdiff: introduce rust
  8:  7dc241e6682 !  2:  03939951256 github workflows: install rust
     @@ Metadata
      Author: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Commit message ##
     -    github workflows: install rust
     +    xdiff: introduce rust
      
     -    Since we have introduced rust, it needs to be installed for the
     -    continuous integration build targets. Create an install script
     -    (build_rust.sh) that needs to be run as the same user that builds git.
     -    Because of the limitations of meson, create build_rust.sh which makes
     -    it easy to centralize how rust is built between meson and make.
     +    Upcoming patches will simplify xdiff, while also porting parts of it to
     +    Rust. In preparation, add some stubs and set up the Rust build. For now,
     +    it is easier to let cargo build rust and have make or meson merely link
     +    against the static library that cargo builds. In line with ongoing
     +    libification efforts, use multiple crates to allow more modularity on
     +    the Rust side. xdiff is the crate that this series will focus on, but
     +    we also introduce the interop crate for future patch series.
      
     -    There are 2 interesting decisions worth calling out in this commit:
     -
     -    * The 'output' field of custom_target() does not allow specifying a
     -      file nested inside the build directory. Thus create build_rust.sh to
     -      build rust with all of its parameters and then moves libxdiff.a to
     -      the root of the build directory.
     -
     -    * Install curl, to facilitate the rustup install script.
     +    In order to facilitate interoperability between C and Rust, introduce C
     +    definitions for Rust primitive types in git-compat-util.h.
      
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
     - ## .github/workflows/main.yml ##
     -@@ .github/workflows/main.yml: on: [push, pull_request]
     - 
     - env:
     -   DEVELOPER: 1
     -+  RUST_VERSION: 1.87.0
     - 
     - # If more than one workflow run is triggered for the very same commit hash
     - # (which happens when multiple branches pointing to the same commit), only
     + ## .gitignore ##
     +@@ .gitignore: Release/
     + /contrib/buildsystems/out
     + /contrib/libgit-rs/target
     + /contrib/libgit-sys/target
     ++/.idea/
     ++/rust/target/
     ++/rust/Cargo.lock
      
       ## Makefile ##
      @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
     @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
      +
      +EXTLIBS =
      +
     - ifeq ($(DEBUG), 1)
     --RUST_LIB = rust/target/debug/libxdiff.a
     ++ifeq ($(DEBUG), 1)
      +  RUST_BUILD_MODE = debug
     - else
     --RUST_LIB = rust/target/release/libxdiff.a
     ++else
      +  RUST_BUILD_MODE = release
      +endif
      +
     @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
      +UNAME_S := $(shell uname -s)
      +ifeq ($(UNAME_S),Linux)
      +  EXTLIBS += -ldl
     - endif
     ++endif
      +
       REFTABLE_LIB = reftable/libreftable.a
       
     @@ Makefile: UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
       # xdiff and reftable libs may in turn depend on what is in libgit.a
       GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
      -EXTLIBS =
     - 
     --GITLIBS += $(RUST_LIB)
     ++
       
       GIT_USER_AGENT = git/$(GIT_VERSION)
       
     @@ Makefile: $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
       	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
       		$(filter %.o,$^) $(LIBS)
       
     +@@ Makefile: $(LIB_FILE): $(LIB_OBJS)
     + $(XDIFF_LIB): $(XDIFF_OBJS)
     + 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
     + 
     ++
     + $(REFTABLE_LIB): $(REFTABLE_OBJS)
     + 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
     + 
      @@ Makefile: perf: all
       
       t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) $(UNIT_TEST_DIR)/test-lib.o
     @@ Makefile: perf: all
       	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
       
       check-sha1:: t/helper/test-tool$X
     +@@ Makefile: cocciclean:
     + 	$(RM) -r .build/contrib/coccinelle
     + 	$(RM) contrib/coccinelle/*.cocci.patch
     + 
     +-clean: profile-clean coverage-clean cocciclean
     ++rustclean:
     ++	cd rust && cargo clean
     ++
     ++clean: profile-clean coverage-clean cocciclean rustclean
     + 	$(RM) -r .build $(UNIT_TEST_BIN)
     + 	$(RM) GIT-TEST-SUITES
     + 	$(RM) po/git.pot po/git-core.pot
      @@ Makefile: FUZZ_CXXFLAGS ?= $(ALL_CFLAGS)
       .PHONY: fuzz-all
       fuzz-all: $(FUZZ_PROGRAMS)
     @@ build_rust.sh (new)
      @@
      +#!/bin/sh
      +
     -+if [ -z "$CARGO_HOME" ]; then
     -+  export CARGO_HOME=$HOME/.cargo
     -+  echo >&2 "::warning:: CARGO_HOME is not set"
     -+fi
     -+echo "CARGO_HOME=$CARGO_HOME"
      +
     -+rustc -vV
     -+cargo --version
     ++rustc -vV || exit $?
     ++cargo --version || exit $?
      +
      +dir_git_root=${0%/*}
      +dir_build=$1
     -+rust_target=$2
     ++rust_build_profile=$2
      +crate=$3
      +
      +dir_rust=$dir_git_root/rust
     @@ build_rust.sh (new)
      +  exit 1
      +fi
      +
     -+if [ "$rust_target" = "" ]; then
     -+  echo "did not specify the rust_target"
     ++if [ "$rust_build_profile" = "" ]; then
     ++  echo "did not specify the rust_build_profile"
      +  exit 1
      +fi
      +
     -+if [ "$rust_target" = "release" ]; then
     ++if [ "$rust_build_profile" = "release" ]; then
      +  rust_args="--release"
     -+  export RUSTFLAGS='-Aunused_imports -Adead_code'
     -+elif [ "$rust_target" = "debug" ]; then
     ++  export RUSTFLAGS=''
     ++elif [ "$rust_build_profile" = "debug" ]; then
      +  rust_args=""
     -+  export RUSTFLAGS='-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
     ++  export RUSTFLAGS='-C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
      +else
     -+  echo "illegal rust_target value $rust_target"
     ++  echo "illegal rust_build_profile value $rust_build_profile"
      +  exit 1
      +fi
      +
     -+cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd ..
     ++cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd $dir_git_root
      +
      +libfile="lib${crate}.a"
     ++if rustup show active-toolchain | grep windows-msvc; then
     ++  libfile="${crate}.lib"
     ++fi
      +dst=$dir_build/$libfile
      +
      +if [ "$dir_git_root" != "$dir_build" ]; then
     -+  src=$dir_rust/target/$rust_target/$libfile
     ++  src=$dir_rust/target/$rust_build_profile/$libfile
      +  if [ ! -f $src ]; then
     -+    echo >&2 "::error:: cannot find path of static library"
     ++    echo >&2 "::error:: cannot find path of static library $src is not a file or does not exist"
      +    exit 5
      +  fi
      +
     @@ build_rust.sh (new)
      +  mv $src $dst
      +fi
      
     - ## ci/install-dependencies.sh ##
     -@@ ci/install-dependencies.sh: fi
     - 
     - case "$distro" in
     - alpine-*)
     --	apk add --update shadow sudo meson ninja-build gcc libc-dev curl-dev openssl-dev expat-dev gettext \
     -+	apk add --update shadow sudo meson ninja-build gcc libc-dev curl curl-dev openssl-dev expat-dev gettext \
     - 		zlib-ng-dev pcre2-dev python3 musl-libintl perl-utils ncurses \
     - 		apache2 apache2-http2 apache2-proxy apache2-ssl apache2-webdav apr-util-dbd_sqlite3 \
     - 		bash cvs gnupg perl-cgi perl-dbd-sqlite perl-io-tty >/dev/null
     - 	;;
     - fedora-*|almalinux-*)
     - 	dnf -yq update >/dev/null &&
     --	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
     -+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl curl-devel pcre2-devel >/dev/null
     - 	;;
     - ubuntu-*|i386/ubuntu-*|debian-*)
     - 	# Required so that apt doesn't wait for user input on certain packages.
     -@@ ci/install-dependencies.sh: ubuntu-*|i386/ubuntu-*|debian-*)
     - 	sudo apt-get -q update
     - 	sudo apt-get -q -y install \
     - 		$LANGUAGES apache2 cvs cvsps git gnupg $SVN \
     --		make libssl-dev libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
     --		tcl tk gettext zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
     -+		make libssl-dev curl libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
     -+		tcl tk gettext zlib1g zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
     - 		libemail-valid-perl libio-pty-perl libio-socket-ssl-perl libnet-smtp-ssl-perl libdbd-sqlite3-perl libcgi-pm-perl \
     - 		libsecret-1-dev libpcre2-dev meson ninja-build pkg-config \
     - 		${CC_PACKAGE:-${CC:-gcc}} $PYTHON_PACKAGE
     -@@ ci/install-dependencies.sh: ClangFormat)
     - 	;;
     - StaticAnalysis)
     - 	sudo apt-get -q update
     --	sudo apt-get -q -y install coccinelle libcurl4-openssl-dev libssl-dev \
     -+	sudo apt-get -q -y install coccinelle curl libcurl4-openssl-dev libssl-dev \
     - 		libexpat-dev gettext make
     - 	;;
     - sparse)
     - 	sudo apt-get -q update -q
     --	sudo apt-get -q -y install libssl-dev libcurl4-openssl-dev \
     --		libexpat-dev gettext zlib1g-dev sparse
     -+	sudo apt-get -q -y install libssl-dev curl libcurl4-openssl-dev \
     -+		libexpat-dev gettext zlib1g zlib1g-dev sparse
     - 	;;
     - Documentation)
     - 	sudo apt-get -q update
     + ## git-compat-util.h ##
     +@@ git-compat-util.h: static inline int is_xplatform_dir_sep(int c)
     + #include "compat/msvc.h"
     + #endif
     + 
     ++/* rust types */
     ++typedef uint8_t   u8;
     ++typedef uint16_t  u16;
     ++typedef uint32_t  u32;
     ++typedef uint64_t  u64;
     ++
     ++typedef int8_t    i8;
     ++typedef int16_t   i16;
     ++typedef int32_t   i32;
     ++typedef int64_t   i64;
     ++
     ++typedef float     f32;
     ++typedef double    f64;
     ++
     ++typedef size_t    usize;
     ++typedef ptrdiff_t isize;
     ++
     + /* used on Mac OS X */
     + #ifdef PRECOMPOSE_UNICODE
     + #include "compat/precompose_utf8.h"
      
     - ## ci/install-rust.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+if [ "$(id -u)" -eq 0 ]; then
     -+  echo >&2 "::warning:: installing rust as root"
     -+fi
     -+
     -+if [ "$CARGO_HOME" = "" ]; then
     -+  echo >&2 "::warning:: CARGO_HOME is not set"
     -+  export CARGO_HOME=$HOME/.cargo
     -+fi
     -+
     -+export RUSTUP_HOME=$CARGO_HOME
     -+
     -+if [ "$RUST_VERSION" = "" ]; then
     -+  echo >&2 "::error:: RUST_VERSION is not set"
     -+  exit 2
     -+fi
     -+
     -+## install rustup
     -+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
     -+if [ ! -f $CARGO_HOME/env ]; then
     -+  echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
     -+fi
     -+## install a specific version of rust
     -+if [ "$BITNESS" = "32" ]; then
     -+  $CARGO_HOME/bin/rustup set default-host i686-unknown-linux-gnu || exit $?
     -+  $CARGO_HOME/bin/rustup install $RUST_VERSION || exit $?
     -+  $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
     -+else
     -+  $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
     -+fi
     -+
     -+. $CARGO_HOME/env
     -
     - ## ci/lib.sh ##
     -@@
     - # Library of functions shared by all CI scripts
     - 
     -+
     -+export BITNESS="64"
     -+if command -v getconf >/dev/null && [ "$(getconf LONG_BIT 2>/dev/null)" = "32" ]; then
     -+  export BITNESS="32"
     -+fi
     -+echo "BITNESS=$BITNESS"
     -+
     -+
     - if test true = "$GITHUB_ACTIONS"
     - then
     - 	begin_group () {
     -
     - ## ci/make-test-artifacts.sh ##
     -@@ ci/make-test-artifacts.sh: mkdir -p "$1" # in case ci/lib.sh decides to quit early
     - 
     - . ${0%/*}/lib.sh
     - 
     -+## install rust per user rather than system wide
     -+. ${0%/*}/install-rust.sh
     -+
     - group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
     - 
     -+if [ -d "$CARGO_HOME" ]; then
     -+  rm -rf $CARGO_HOME
     -+fi
     -+
     - check_unignored_build_artifacts
     -
     - ## ci/run-build-and-tests.sh ##
     -@@
     - 
     - . ${0%/*}/lib.sh
     + ## meson.build ##
     +@@ meson.build: version_gen_environment.set('GIT_DATE', get_option('build_date'))
     + version_gen_environment.set('GIT_USER_AGENT', get_option('user_agent'))
     + version_gen_environment.set('GIT_VERSION', get_option('version'))
       
     -+## install rust per user rather than system wide
     -+. ${0%/*}/install-rust.sh
     ++if get_option('optimization') in ['2', '3', 's', 'z']
     ++  rust_build_profile = 'release'
     ++else
     ++  rust_build_profile = 'debug'
     ++endif
      +
     -+rustc -vV
     -+cargo --version || exit $?
     ++# Run `rustup show active-toolchain` and capture output
     ++rustup_out = run_command('rustup', 'show', 'active-toolchain',
     ++                         check: true).stdout().strip()
     ++
     ++rust_crates = ['xdiff']
     ++rust_builds = []
     ++
     ++foreach crate : rust_crates
     ++  if rustup_out.contains('windows-msvc')
     ++    libfile = crate + '.lib'
     ++  else
     ++    libfile = 'lib' + crate + '.a'
     ++  endif
     ++
     ++  rust_builds += custom_target(
     ++    'rust_build_'+crate,
     ++    output: libfile,
     ++    build_by_default: true,
     ++    build_always_stale: true,
     ++    command: [
     ++      meson.project_source_root() / 'build_rust.sh',
     ++      meson.current_build_dir(), rust_build_profile, crate,
     ++    ],
     ++    install: false,
     ++  )
     ++endforeach
      +
     - run_tests=t
     - 
     - case "$jobname" in
     -@@ ci/run-build-and-tests.sh: case "$jobname" in
     - 	;;
     - esac
     - 
     -+if [ -d "$CARGO_HOME" ]; then
     -+  rm -rf $CARGO_HOME
     -+fi
      +
     - check_unignored_build_artifacts
     - save_good_tree
     -
     - ## meson.build ##
     -@@ meson.build: else
     -   rustflags = '-Aunused_imports -Adead_code -C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
     - endif
     - 
     --
     --rust_leaf = custom_target('rust_leaf',
     -+rust_build_xdiff = custom_target('rust_build_xdiff',
     -   output: 'libxdiff.a',
     -   build_by_default: true,
     -   build_always_stale: true,
     --  command: ['cargo', 'build',
     --            '--manifest-path', meson.project_source_root() / 'rust/Cargo.toml'
     --  ] + rust_args,
     --  env: {
     --    'RUSTFLAGS': rustflags,
     --  },
     -+  command: [
     -+    meson.project_source_root() / 'build_rust.sh',
     -+    meson.current_build_dir(), rust_target, 'xdiff',
     -+  ],
     -   install: false,
     - )
     - 
     --rust_xdiff_dep = declare_dependency(
     --  link_args: ['-L' + meson.project_source_root() / 'rust/target' / rust_target, '-lxdiff'],
     --#  include_directories: include_directories('xdiff/include'),  # Adjust if you expose headers
     --)
     --
     --
       compiler = meson.get_compiler('c')
       
       libgit_sources = [
      @@ meson.build: version_def_h = custom_target(
     - )
       libgit_sources += version_def_h
       
     --libgit_dependencies += rust_xdiff_dep
     --
       libgit = declare_dependency(
      -  link_with: static_library('git',
      -    sources: libgit_sources,
     @@ meson.build: version_def_h = custom_target(
      +      dependencies: libgit_dependencies,
      +      include_directories: libgit_include_directories,
      +    ),
     -+    rust_build_xdiff,
     -+  ],
     ++  ] + rust_builds,
         compile_args: libgit_c_args,
         dependencies: libgit_dependencies,
         include_directories: libgit_include_directories,
     +
     + ## rust/Cargo.toml (new) ##
     +@@
     ++[workspace]
     ++members = [
     ++    "xdiff",
     ++    "interop",
     ++]
     ++resolver = "2"
     +
     + ## rust/interop/Cargo.toml (new) ##
     +@@
     ++[package]
     ++name = "interop"
     ++version = "0.1.0"
     ++edition = "2021"
     ++
     ++[lib]
     ++name = "interop"
     ++path = "src/lib.rs"
     ++## staticlib to generate xdiff.a for use by gcc
     ++## cdylib (optional) to generate xdiff.so for use by gcc
     ++## rlib is required by the rust unit tests
     ++crate-type = ["staticlib", "rlib"]
     ++
     ++[dependencies]
     +
     + ## rust/interop/src/lib.rs (new) ##
     +
     + ## rust/xdiff/Cargo.toml (new) ##
     +@@
     ++[package]
     ++name = "xdiff"
     ++version = "0.1.0"
     ++edition = "2021"
     ++
     ++[lib]
     ++name = "xdiff"
     ++path = "src/lib.rs"
     ++## staticlib to generate xdiff.a for use by gcc
     ++## cdylib (optional) to generate xdiff.so for use by gcc
     ++## rlib is required by the rust unit tests
     ++crate-type = ["staticlib", "rlib"]
     ++
     ++[dependencies]
     ++interop = { path = "../interop" }
     +
     + ## rust/xdiff/src/lib.rs (new) ##
 12:  fffdb326710 !  3:  a98d9e4d21b github workflows: define rust versions and targets in the same place
     @@ Metadata
      Author: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Commit message ##
     -    github workflows: define rust versions and targets in the same place
     +    github workflows: install rust
      
     -    Consolidate the Rust toolchain definitions in main.yaml. Prefer using
     -    actions-rs/toolchain@v1 where possible, but for docker targets use
     -    a script to install the Rust toolchain. Four overrides are used in
     +    Prefer using actions-rs/toolchain@v1 where possible to install rustup,
     +    but for docker targets use a script to install rustup. Consolidate the
     +    Rust toolchain definitions in main.yaml. Use install-rust-toolchain.sh
     +    to ensure the correct toolchain is used. Five overrides are used in
          main.yaml:
      
            * On Windows: Rust didn't resolve the bcrypt library on Windows
              correctly until version 1.78.0. Also since rustup mis-identifies
              the Rust toolchain, the Rust target triple must be set to
     -        x86_64-pc-windows-gnu.
     +        x86_64-pc-windows-gnu for make (win build), and
     +        x86_64-pc-windows-msvc for meson (win+Meson build).
            * On musl: libc differences, such as ftruncate64 vs ftruncate, were
              not accounted for until Rust version 1.72.0. No older version of
              Rust will work on musl for our needs.
            * In a 32-bit docker container running on a 64-bit host, we need to
              override the Rust target triple. This is because rustup asks the
              kernel for the bitness of the system and it says 64, even though
     -        the container will only run 32-bit. This also allows us to remove
     -        the BITNESS environment variable in ci/lib.sh.
     +        the container is 32-bit. This also allows us to remove the
     +        BITNESS environment variable in ci/lib.sh.
      
     +    The logic for selecting library names was initially provided in a patch
     +    from Johannes, but was reworked and squashed into this commit.
     +
     +    Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## .github/workflows/main.yml ##
     -@@ .github/workflows/main.yml: on: [push, pull_request]
     - 
     - env:
     -   DEVELOPER: 1
     --  RUST_VERSION: 1.87.0
     - 
     - # If more than one workflow run is triggered for the very same commit hash
     - # (which happens when multiple branches pointing to the same commit), only
      @@ .github/workflows/main.yml: jobs:
           outputs:
             enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }}
     @@ .github/workflows/main.yml: jobs:
      +      rust_version_windows: 1.78.0
      +      rust_version_musl: 1.72.0
      +      ## the rust target is inferred by rustup unless specified
     -+      rust_target_windows: x86_64-pc-windows-gnu
     ++      rust_target_windows_make: x86_64-pc-windows-gnu
     ++      rust_target_windows_meson: x86_64-pc-windows-msvc
      +      rust_target_32bit_linux: i686-unknown-linux-gnu
           steps:
             - name: try to clone ci-config branch
               run: |
      @@ .github/workflows/main.yml: jobs:
     -           /c/Program\ Files/Git/mingw64/bin/curl -Lo libuserenv.a \
     -             https://github.com/git-for-windows/git-sdk-64/raw/HEAD/mingw64/lib/libuserenv.a
     -         }
     -+    - name: Install Rust
     +     needs: ci-config
     +     if: needs.ci-config.outputs.enabled == 'yes'
     +     runs-on: windows-latest
     ++    env:
     ++      CARGO_HOME: "/c/Users/runneradmin/.cargo"
     +     concurrency:
     +       group: windows-build-${{ github.ref }}
     +       cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
     +     steps:
     +     - uses: actions/checkout@v4
     +     - uses: git-for-windows/setup-git-for-windows-sdk@v1
     ++    - name: Install rustup via github actions
      +      uses: actions-rs/toolchain@v1
      +      with:
     -+        toolchain: ${{ needs.ci-config.outputs.rust_version_windows }}
     -+        target: ${{ needs.ci-config.outputs.rust_target_windows }}
     ++        toolchain: stable
      +        profile: minimal
     -+        override: true
     ++        override: false
     ++    - name: Install Rust toolchain
     ++      shell: bash
     ++      env:
     ++        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_windows }}
     ++        RUST_TARGET: ${{ needs.ci-config.outputs.rust_target_windows_make }}
     ++      run: ci/install-rust-toolchain.sh
           - name: build
             shell: bash
             env:
     -         HOME: ${{runner.workspace}}
     -         NO_PERL: 1
     -+        CARGO_HOME: "/c/Users/runneradmin/.cargo"
     -       run: . /etc/profile && ci/make-test-artifacts.sh artifacts
     -     - name: zip up tracked files
     -       run: git archive -o artifacts/tracked.tar.gz HEAD
      @@ .github/workflows/main.yml: jobs:
     +     needs: ci-config
     +     if: needs.ci-config.outputs.enabled == 'yes'
     +     runs-on: windows-latest
     ++    env:
     ++      CARGO_HOME: "/c/Users/runneradmin/.cargo"
     +     concurrency:
     +       group: windows-meson-build-${{ github.ref }}
     +       cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
           steps:
           - uses: actions/checkout@v4
           - uses: actions/setup-python@v5
     -+    - name: Install Rust
     ++    - name: Install rustup via github actions
      +      uses: actions-rs/toolchain@v1
      +      with:
     -+        toolchain: ${{ needs.ci-config.outputs.rust_version_windows }}
     -+        target: ${{ needs.ci-config.outputs.rust_target_windows }}
     ++        toolchain: stable
      +        profile: minimal
     -+        override: true
     ++        override: false
     ++    - name: Install Rust toolchain
     ++      shell: bash
     ++      env:
     ++        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_windows }}
     ++        RUST_TARGET: ${{ needs.ci-config.outputs.rust_target_windows_meson }}
     ++      run: ci/install-rust-toolchain.sh
           - name: Set up dependencies
             shell: pwsh
             run: pip install meson ninja
      @@ .github/workflows/main.yml: jobs:
     +       jobname: ${{matrix.vector.jobname}}
     +       CI_JOB_IMAGE: ${{matrix.vector.pool}}
     +       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
     ++      CARGO_HOME: "/Users/runner/.cargo"
     +     runs-on: ${{matrix.vector.pool}}
           steps:
           - uses: actions/checkout@v4
           - run: ci/install-dependencies.sh
     -+    - name: Install Rust
     +-    - run: ci/run-build-and-tests.sh
     ++    - name: Install rustup via github actions
      +      uses: actions-rs/toolchain@v1
      +      with:
     -+        toolchain: ${{ needs.ci-config.outputs.rust_version_minimum }}
     ++        toolchain: stable
      +        profile: minimal
     -+        override: true
     -     - run: ci/run-build-and-tests.sh
     ++        override: false
     ++    - name: Install Rust toolchain
     ++      shell: bash
     ++      env:
     ++        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_minimum }}
     ++      run: ci/install-rust-toolchain.sh
     ++    - name: Run build and tests
     ++      run: ci/run-build-and-tests.sh
           - name: print test failures
             if: failure() && env.FAILED_TEST_ARTIFACTS != ''
     +       run: ci/print-test-failures.sh
      @@ .github/workflows/main.yml: jobs:
                 cc: gcc
               - jobname: linux-musl-meson
     @@ .github/workflows/main.yml: jobs:
             CI_JOB_IMAGE: ${{matrix.vector.image}}
      +      CI_IS_DOCKER: "true"
             CUSTOM_PATH: /custom
     -+      RUST_VERSION: ${{ matrix.vector.rust_version_override || needs.ci-config.outputs.rust_version_minimum }}
     -+      RUST_TARGET: ${{ matrix.vector.rust_target_override || '' }}
      +      CARGO_HOME: /home/builder/.cargo
           runs-on: ubuntu-latest
           container: ${{matrix.vector.image}}
           steps:
     +@@ .github/workflows/main.yml: jobs:
     +     - run: ci/install-dependencies.sh
     +     - run: useradd builder --create-home
     +     - run: chown -R builder .
     ++    - name: Install rustup via script
     ++      run: sudo --preserve-env --set-home --user=builder ci/install-rustup.sh
     ++    - name: Install Rust toolchain
     ++      env:
     ++        RUST_VERSION: ${{ matrix.vector.rust_version_override || needs.ci-config.outputs.rust_version_minimum }}
     ++        RUST_TARGET: ${{ matrix.vector.rust_target_override || '' }}
     ++      run: sudo --preserve-env --set-home --user=builder ci/install-rust-toolchain.sh
     +     - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     +     - name: print test failures
     +       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
      
     - ## build_rust.sh ##
     -@@
     - #!/bin/sh
     - 
     --if [ -z "$CARGO_HOME" ]; then
     --  export CARGO_HOME=$HOME/.cargo
     --  echo >&2 "::warning:: CARGO_HOME is not set"
     --fi
     --echo "CARGO_HOME=$CARGO_HOME"
     + ## ci/install-dependencies.sh ##
     +@@ ci/install-dependencies.sh: fi
       
     --rustc -vV
     --cargo --version
     + case "$distro" in
     + alpine-*)
     +-	apk add --update shadow sudo meson ninja-build gcc libc-dev curl-dev openssl-dev expat-dev gettext \
     ++	apk add --update shadow sudo meson ninja-build gcc libc-dev curl curl-dev openssl-dev expat-dev gettext \
     + 		zlib-ng-dev pcre2-dev python3 musl-libintl perl-utils ncurses \
     + 		apache2 apache2-http2 apache2-proxy apache2-ssl apache2-webdav apr-util-dbd_sqlite3 \
     + 		bash cvs gnupg perl-cgi perl-dbd-sqlite perl-io-tty >/dev/null
     + 	;;
     + fedora-*|almalinux-*)
     + 	dnf -yq update >/dev/null &&
     +-	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
     ++	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl curl-devel pcre2-devel >/dev/null
     + 	;;
     + ubuntu-*|i386/ubuntu-*|debian-*)
     + 	# Required so that apt doesn't wait for user input on certain packages.
     +@@ ci/install-dependencies.sh: ubuntu-*|i386/ubuntu-*|debian-*)
     + 	sudo apt-get -q update
     + 	sudo apt-get -q -y install \
     + 		$LANGUAGES apache2 cvs cvsps git gnupg $SVN \
     +-		make libssl-dev libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
     +-		tcl tk gettext zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
     ++		make libssl-dev curl libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
     ++		tcl tk gettext zlib1g zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
     + 		libemail-valid-perl libio-pty-perl libio-socket-ssl-perl libnet-smtp-ssl-perl libdbd-sqlite3-perl libcgi-pm-perl \
     + 		libsecret-1-dev libpcre2-dev meson ninja-build pkg-config \
     + 		${CC_PACKAGE:-${CC:-gcc}} $PYTHON_PACKAGE
     +@@ ci/install-dependencies.sh: ClangFormat)
     + 	;;
     + StaticAnalysis)
     + 	sudo apt-get -q update
     +-	sudo apt-get -q -y install coccinelle libcurl4-openssl-dev libssl-dev \
     ++	sudo apt-get -q -y install coccinelle curl libcurl4-openssl-dev libssl-dev \
     + 		libexpat-dev gettext make
     + 	;;
     + sparse)
     + 	sudo apt-get -q update -q
     +-	sudo apt-get -q -y install libssl-dev libcurl4-openssl-dev \
     +-		libexpat-dev gettext zlib1g-dev sparse
     ++	sudo apt-get -q -y install libssl-dev curl libcurl4-openssl-dev \
     ++		libexpat-dev gettext zlib1g zlib1g-dev sparse
     + 	;;
     + Documentation)
     + 	sudo apt-get -q update
     +
     + ## ci/install-rust-toolchain.sh (new) ##
     +@@
     ++#!/bin/sh
     ++
     ++if [ "$CARGO_HOME" = "" ]; then
     ++  echo >&2 "::error:: CARGO_HOME is not set"
     ++  exit 2
     ++fi
     ++export PATH="$CARGO_HOME/bin:$PATH"
     ++rustup -vV || exit $?
     ++
     ++## Enforce the correct Rust toolchain
     ++rustup override unset || true
     ++
     ++## install a specific version of rust
     ++if [ "$RUST_TARGET" != "" ]; then
     ++  rustup default --force-non-host "$RUST_VERSION-$RUST_TARGET" || exit $?
     ++else
     ++  rustup default "$RUST_VERSION" || exit $?
     ++fi
     ++
      +rustc -vV || exit $?
     -+cargo --version || exit $?
     - 
     - dir_git_root=${0%/*}
     - dir_build=$1
     -@@ build_rust.sh: dst=$dir_build/$libfile
     - if [ "$dir_git_root" != "$dir_build" ]; then
     -   src=$dir_rust/target/$rust_target/$libfile
     -   if [ ! -f $src ]; then
     --    echo >&2 "::error:: cannot find path of static library"
     -+    echo >&2 "::error:: cannot find path of static library $src is not a file or does not exist"
     -     exit 5
     -   fi
     - 
     ++
     ++RE_RUST_TARGET="$RUST_TARGET"
     ++if [ "$RUST_TARGET" = "" ]; then
     ++  RE_RUST_TARGET="[^ ]+"
     ++fi
     ++
     ++if ! rustup show active-toolchain | grep -E "^$RUST_VERSION-$RE_RUST_TARGET \(default\)$"; then
     ++  echo >&2 "::error:: wrong Rust toolchain, active-toolchain: $(rustup show active-toolchain)"
     ++  exit 3
     ++fi
      
     - ## ci/install-rust.sh (mode change 100644 => 100755) ##
     + ## ci/install-rustup.sh (new) ##
      @@
     - #!/bin/sh
     - 
     ++#!/bin/sh
     ++
      +## github workflows actions-rs/toolchain@v1 doesn't work for docker
      +## targets. This script should only be used if the ci pipeline
      +## doesn't support installing rust on a particular target.
      +
     - if [ "$(id -u)" -eq 0 ]; then
     -   echo >&2 "::warning:: installing rust as root"
     - fi
     - 
     --if [ "$CARGO_HOME" = "" ]; then
     --  echo >&2 "::warning:: CARGO_HOME is not set"
     --  export CARGO_HOME=$HOME/.cargo
     --fi
     --
     --export RUSTUP_HOME=$CARGO_HOME
     --
     - if [ "$RUST_VERSION" = "" ]; then
     -   echo >&2 "::error:: RUST_VERSION is not set"
     -+  exit 1
     ++if [ "$(id -u)" -eq 0 ]; then
     ++  echo >&2 "::warning:: installing rust as root"
      +fi
      +
      +if [ "$CARGO_HOME" = "" ]; then
      +  echo >&2 "::error:: CARGO_HOME is not set"
     -   exit 2
     - fi
     - 
     ++  exit 2
     ++fi
     ++
      +export RUSTUP_HOME=$CARGO_HOME
      +
     - ## install rustup
     - curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
     - if [ ! -f $CARGO_HOME/env ]; then
     -   echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
     - fi
     ++## install rustup
     ++curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
     ++if [ ! -f $CARGO_HOME/env ]; then
     ++  echo "PATH=$CARGO_HOME/bin:\$PATH" > $CARGO_HOME/env
     ++fi
      +. $CARGO_HOME/env
      +
     - ## install a specific version of rust
     --if [ "$BITNESS" = "32" ]; then
     --  $CARGO_HOME/bin/rustup set default-host i686-unknown-linux-gnu || exit $?
     --  $CARGO_HOME/bin/rustup install $RUST_VERSION || exit $?
     --  $CARGO_HOME/bin/rustup default --force-non-host $RUST_VERSION || exit $?
     -+if [ "$RUST_TARGET" != "" ]; then
     -+  rustup default --force-non-host "$RUST_VERSION-$RUST_TARGET" || exit $?
     - else
     --  $CARGO_HOME/bin/rustup default $RUST_VERSION || exit $?
     --  if [ "$CI_OS_NAME" = "windows" ]; then
     --    $CARGO_HOME/bin/rustup target add x86_64-pc-windows-gnu || exit $?
     --  fi
     -+  rustup default "$RUST_VERSION" || exit $?
     - fi
     - 
     --. $CARGO_HOME/env
     -+rustc -vV || exit $?
     ++rustup -vV
      
       ## ci/lib.sh ##
      @@
       # Library of functions shared by all CI scripts
       
     - 
     --export BITNESS="64"
     --if command -v getconf >/dev/null && [ "$(getconf LONG_BIT 2>/dev/null)" = "32" ]; then
     --  export BITNESS="32"
     --fi
     --echo "BITNESS=$BITNESS"
     --
     --
     ++
       if test true = "$GITHUB_ACTIONS"
       then
       	begin_group () {
     @@ ci/make-test-artifacts.sh: mkdir -p "$1" # in case ci/lib.sh decides to quit ear
       
       . ${0%/*}/lib.sh
       
     --## install rust per user rather than system wide
     --. ${0%/*}/install-rust.sh
     -+if [ -z "$CARGO_HOME" ]; then
     ++## ensure rustup is in the PATH variable
     ++if [ "$CARGO_HOME" = "" ]; then
      +  echo >&2 "::error:: CARGO_HOME is not set"
     -+  exit 1
     ++  exit 2
      +fi
     - 
     --group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
      +export PATH="$CARGO_HOME/bin:$PATH"
     - 
     --if [ -d "$CARGO_HOME" ]; then
     --  rm -rf $CARGO_HOME
     --fi
     -+group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
     ++
     ++rustc -vV
     ++
     + group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
       
       check_unignored_build_artifacts
      
     @@ ci/run-build-and-tests.sh
       
       . ${0%/*}/lib.sh
       
     --## install rust per user rather than system wide
     --. ${0%/*}/install-rust.sh
     -+## actions-rs/toolchain@v1 doesn't work for docker targets.
     -+if [ "$CI_IS_DOCKER" = "true" ]; then
     -+  . ${0%/*}/install-rust.sh
     ++## ensure rustup is in the PATH variable
     ++if [ "$CARGO_HOME" = "" ]; then
     ++  echo >&2 "::error:: CARGO_HOME is not set"
     ++  exit 2
      +fi
     - 
     --rustc -vV
     ++. $CARGO_HOME/env
     ++
      +rustc -vV || exit $?
     - cargo --version || exit $?
     - 
     ++
       run_tests=t
     -
     - ## meson.build ##
     -@@ meson.build: rust_build_xdiff = custom_target('rust_build_xdiff',
     -     meson.project_source_root() / 'build_rust.sh',
     -     meson.current_build_dir(), rust_target, 'xdiff',
     -   ],
     -+  env: script_environment,
     -   install: false,
     - )
       
     + case "$jobname" in
     +@@ ci/run-build-and-tests.sh: case "$jobname" in
     + 	;;
     + esac
     + 
     ++if [ -d "$CARGO_HOME" ]; then
     ++  rm -rf $CARGO_HOME
     ++fi
     ++
     + check_unignored_build_artifacts
     + save_good_tree
 11:  382067a09e3 !  4:  0d2b39c3e03 win+Meson: do allow linking with the Rust-built xdiff
     @@ .github/workflows/main.yml: jobs:
      +          /c/Program\ Files/Git/mingw64/bin/curl -Lo libuserenv.a \
      +            https://github.com/git-for-windows/git-sdk-64/raw/HEAD/mingw64/lib/libuserenv.a
      +        }
     -     - name: build
     -       shell: bash
     -       env:
     +     - name: Install rustup via github actions
     +       uses: actions-rs/toolchain@v1
     +       with:
      
       ## config.mak.uname ##
      @@ config.mak.uname: ifeq ($(uname_S),MINGW)
     - 
     - 	export CARGO_BUILD_TARGET
     - 	RUST_TARGET_DIR = rust/target/$(CARGO_BUILD_TARGET)/$(RUST_BUILD_MODE)
     + 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
     + 		BASIC_LDFLAGS += -Wl,--large-address-aware
     +         endif
     ++
      +	# Unfortunately now needed because of Rust
      +	EXTLIBS += -luserenv
     - 
     ++
       	CC = gcc
       	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
     + 		-fstack-protector-strong
      
       ## meson.build ##
      @@ meson.build: elif host_machine.system() == 'windows'
 13:  44784f0d672 =  5:  e65488ab993 github workflows: upload Cargo.lock
  -:  ----------- >  6:  db5d22b1887 ivec: create a vector type that is interoperable between C and Rust
  3:  56c96d35554 =  7:  d4bed954632 xdiff/xprepare: remove superfluous forward declarations
  4:  ebec3689dce =  8:  7c68ce5349c xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  5:  769d1a5b9d2 =  9:  e516ccc8c0a xdiff: make fields of xrecord_t Rust friendly
  6:  87623495994 ! 10:  21bfb9f0883 xdiff: separate parsing lines from hashing them
     @@ Metadata
      Author: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Commit message ##
     -    xdiff: separate parsing lines from hashing them
     +    xdiff: use one definition for freeing xdfile_t
      
     -    We want to use xxhash for faster hashing. To facilitate that
     -    and to simplify the code. Separate the concerns of parsing
     -    and hashing into discrete steps. This makes swapping the hash
     -    function much easier. Since xdl_hash_record() both parses and
     -    hashes lines, this requires some slight code restructuring.
     +    Simplify xdl_prepare_ctx() by using xdl_free_ctx() instead of using
     +    local variables with hand rolled memory management.
      
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
     @@ xdiff/xprepare.c: static int xdl_classify_record(unsigned int pass, xdlclassifie
       }
       
       
     -+static void xdl_parse_lines(mmfile_t *mf, long narec, xdfile_t *xdf) {
     -+	u8 const* ptr = (u8 const*) mf->ptr;
     -+	usize len = (usize) mf->size;
     -+
     -+	xdf->recs = NULL;
     -+	xdf->nrec = 0;
     -+	XDL_ALLOC_ARRAY(xdf->recs, narec);
     -+
     -+	while (len > 0) {
     -+		xrecord_t *rec = NULL;
     -+		usize length;
     -+		u8 const* result = memchr(ptr, '\n', len);
     -+		if (result) {
     -+			length = result - ptr + 1;
     -+		} else {
     -+			length = len;
     -+		}
     -+		if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
     -+			die("XDL_ALLOC_GROW failed");
     -+		rec = xdl_cha_alloc(&xdf->rcha);
     -+		rec->ptr = ptr;
     -+		rec->size = length;
     -+		rec->ha = 0;
     -+		xdf->recs[xdf->nrec++] = rec;
     -+		ptr += length;
     -+		len -= length;
     -+	}
     -+
     ++static void xdl_free_ctx(xdfile_t *xdf) {
     ++	xdl_free(xdf->rindex);
     ++	xdl_free(xdf->rchg - 1);
     ++	xdl_free(xdf->ha);
     ++	xdl_free(xdf->recs);
     ++	xdl_cha_free(&xdf->rcha);
      +}
      +
      +
       static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
       			   xdlclassifier_t *cf, xdfile_t *xdf) {
      -	long nrec, bsize;
     --	unsigned long hav;
     --	char const *blk, *cur, *top, *prev;
     --	xrecord_t *crec;
     ++	long bsize;
     + 	unsigned long hav;
     + 	char const *blk, *cur, *top, *prev;
     + 	xrecord_t *crec;
      -	xrecord_t **recs;
     - 	unsigned long *ha;
     - 	char *rchg;
     - 	long *rindex;
     -@@ xdiff/xprepare.c: static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
     - 	ha = NULL;
     - 	rindex = NULL;
     - 	rchg = NULL;
     +-	unsigned long *ha;
     +-	char *rchg;
     +-	long *rindex;
     + 
     +-	ha = NULL;
     +-	rindex = NULL;
     +-	rchg = NULL;
      -	recs = NULL;
     ++	xdf->ha = NULL;
     ++	xdf->rindex = NULL;
     ++	xdf->rchg = NULL;
     ++	xdf->recs = NULL;
     ++	xdf->nrec = 0;
       
       	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
       		goto abort;
      -	if (!XDL_ALLOC_ARRAY(recs, narec))
     --		goto abort;
     ++	if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
     + 		goto abort;
       
      -	nrec = 0;
     --	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
     --		for (top = blk + bsize; cur < top; ) {
     --			prev = cur;
     --			hav = xdl_hash_record(&cur, top, xpp->flags);
     + 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
     + 		for (top = blk + bsize; cur < top; ) {
     + 			prev = cur;
     + 			hav = xdl_hash_record(&cur, top, xpp->flags);
      -			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
     --				goto abort;
     --			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
     --				goto abort;
     --			crec->ptr = (u8 const*) prev;
     --			crec->size = (long) (cur - prev);
     --			crec->ha = hav;
     ++			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
     + 				goto abort;
     + 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
     + 				goto abort;
     + 			crec->ptr = (u8 const*) prev;
     + 			crec->size = (long) (cur - prev);
     + 			crec->ha = hav;
      -			recs[nrec++] = crec;
     --			if (xdl_classify_record(pass, cf, crec) < 0)
     --				goto abort;
     --		}
     -+	xdl_parse_lines(mf, narec, xdf);
     -+
     -+	for (usize i = 0; i < (usize) xdf->nrec; i++) {
     -+		xrecord_t *rec = xdf->recs[i];
     -+		char const* dump = (char const*) rec->ptr;
     -+		rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags);
     -+		xdl_classify_record(pass, cf, rec);
     ++			xdf->recs[xdf->nrec++] = crec;
     + 			if (xdl_classify_record(pass, cf, crec) < 0)
     + 				goto abort;
     + 		}
       	}
       
      -	if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
     -+
     -+	if (!XDL_CALLOC_ARRAY(rchg, xdf->nrec + 2))
     ++	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
       		goto abort;
       
       	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
       	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
      -		if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
     -+		if (!XDL_ALLOC_ARRAY(rindex, xdf->nrec + 1))
     ++		if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
       			goto abort;
      -		if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
     -+		if (!XDL_ALLOC_ARRAY(ha, xdf->nrec + 1))
     ++		if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
       			goto abort;
       	}
       
      -	xdf->nrec = nrec;
      -	xdf->recs = recs;
     - 	xdf->rchg = rchg + 1;
     - 	xdf->rindex = rindex;
     +-	xdf->rchg = rchg + 1;
     +-	xdf->rindex = rindex;
     ++	xdf->rchg += 1;
       	xdf->nreff = 0;
     - 	xdf->ha = ha;
     +-	xdf->ha = ha;
       	xdf->dstart = 0;
      -	xdf->dend = nrec - 1;
      +	xdf->dend = xdf->nrec - 1;
       
       	return 0;
       
     -@@ xdiff/xprepare.c: abort:
     - 	xdl_free(ha);
     - 	xdl_free(rindex);
     - 	xdl_free(rchg);
     + abort:
     +-	xdl_free(ha);
     +-	xdl_free(rindex);
     +-	xdl_free(rchg);
      -	xdl_free(recs);
     -+	xdl_free(xdf->recs);
     - 	xdl_cha_free(&xdf->rcha);
     +-	xdl_cha_free(&xdf->rcha);
     ++	xdl_free_ctx(xdf);
       	return -1;
       }
     + 
     + 
     +-static void xdl_free_ctx(xdfile_t *xdf) {
     +-	xdl_free(xdf->rindex);
     +-	xdl_free(xdf->rchg - 1);
     +-	xdl_free(xdf->ha);
     +-	xdl_free(xdf->recs);
     +-	xdl_cha_free(&xdf->rcha);
     +-}
     +-
     +-
     + void xdl_free_env(xdfenv_t *xe) {
     + 
     + 	xdl_free_ctx(&xe->xdf2);
  7:  d74fd4ef67a <  -:  ----------- xdiff: conditionally use Rust's implementation of xxhash
  9:  96041a10d54 <  -:  ----------- Do support Windows again after requiring Rust
 10:  1194de3f39c <  -:  ----------- win+Meson: allow for xdiff to be compiled with MSVC
 14:  f20efdff7aa <  -:  ----------- xdiff: implement a white space iterator in Rust
  -:  ----------- > 11:  6ce0e252b38 xdiff: replace chastore with an ivec in xdfile_t
  -:  ----------- > 12:  0cfc6cf26b7 xdiff: delete nrec field from xdfile_t
  -:  ----------- > 13:  cf0387d851c xdiff: delete recs field from xdfile_t
  -:  ----------- > 14:  ea699135f95 xdiff: make xdfile_t more rust friendly
 15:  c8d41173274 ! 15:  b18544b74f3 xdiff: create line_hash() and line_equal()
     @@ Metadata
      Author: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## Commit message ##
     -    xdiff: create line_hash() and line_equal()
     +    xdiff: implement xdl_trim_ends() in Rust
      
     -    These functions use the whitespace iterator, when applicable, to hash,
     -    and compare lines.
     +    Replace the C implementation of xdl_trim_ends() with a Rust
     +    implementation.
      
          Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
      
       ## rust/xdiff/src/lib.rs ##
      @@
     -+use std::hash::Hasher;
     -+use xxhash_rust::xxh3::Xxh3Default;
     -+use crate::xutils::*;
     -+
     - pub mod xutils;
     ++pub mod xprepare;
     ++pub mod xtypes;
       
     - pub const XDF_IGNORE_WHITESPACE: u64 = 1 << 1;
     -@@ rust/xdiff/src/lib.rs: unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
     -     let slice = std::slice::from_raw_parts(ptr, size);
     -     xxhash_rust::xxh3::xxh3_64(slice)
     - }
     ++use crate::xprepare::trim_ends;
     ++use crate::xtypes::xdfile;
      +
      +#[no_mangle]
     -+unsafe extern "C" fn xdl_line_hash(ptr: *const u8, size: usize, flags: u64) -> u64 {
     -+    let line = std::slice::from_raw_parts(ptr, size);
     -+
     -+    line_hash(line, flags)
     -+}
     ++unsafe extern "C" fn xdl_trim_ends(xdf1: *mut xdfile, xdf2: *mut xdfile) -> i32 {
     ++    let xdf1 = xdf1.as_mut().expect("null pointer");
     ++    let xdf2 = xdf2.as_mut().expect("null pointer");
      +
     -+#[no_mangle]
     -+unsafe extern "C" fn xdl_line_equal(lhs: *const u8, lhs_len: usize, rhs: *const u8, rhs_len: usize, flags: u64) -> bool {
     -+    let lhs_line = std::slice::from_raw_parts(lhs, lhs_len);
     -+    let rhs_line = std::slice::from_raw_parts(rhs, rhs_len);
     ++    trim_ends(xdf1, xdf2);
      +
     -+    line_equal(lhs_line, rhs_line, flags)
     ++    0
      +}
      
     - ## rust/xdiff/src/xutils.rs ##
     + ## rust/xdiff/src/xprepare.rs (new) ##
      @@
     - use crate::*;
     -+use xxhash_rust::xxh3::xxh3_64;
     - 
     - pub(crate) fn xdl_isspace(v: u8) -> bool {
     -     match v {
     -@@ rust/xdiff/src/xutils.rs: where
     -     run_option0.is_none() && run_option1.is_none()
     - }
     - 
     ++use crate::xtypes::xdfile;
      +
     -+pub fn line_hash(line: &[u8], flags: u64) -> u64 {
     -+    if (flags & XDF_WHITESPACE_FLAGS) == 0 {
     -+        return xxh3_64(line);
     -+    }
     ++///
     ++/// Early trim initial and terminal matching records.
     ++///
     ++pub(crate) fn trim_ends(xdf1: &mut xdfile, xdf2: &mut xdfile) {
     ++    let mut lim = std::cmp::min(xdf1.record.len(), xdf2.record.len());
      +
     -+    let mut hasher = Xxh3Default::new();
     -+    for chunk in WhitespaceIter::new(line, flags) {
     -+        hasher.update(chunk);
     ++    for i in 0..lim {
     ++        if xdf1.record[i].ha != xdf2.record[i].ha {
     ++            xdf1.dstart = i as isize;
     ++            xdf2.dstart = i as isize;
     ++            lim -= i;
     ++            break;
     ++        }
      +    }
      +
     -+    hasher.finish()
     -+}
     -+
     -+
     -+pub fn line_equal(lhs: &[u8], rhs: &[u8], flags: u64) -> bool {
     -+    if (flags & XDF_WHITESPACE_FLAGS) == 0 {
     -+        return lhs == rhs;
     ++    for i in 0..lim {
     ++        let f1i = xdf1.record.len() - 1 - i;
     ++        let f2i = xdf2.record.len() - 1 - i;
     ++        if xdf1.record[f1i].ha != xdf2.record[f2i].ha {
     ++            xdf1.dend = f1i as isize;
     ++            xdf2.dend = f2i as isize;
     ++            break;
     ++        }
      +    }
     -+
     -+    let lhs_it = WhitespaceIter::new(lhs, flags);
     -+    let rhs_it = WhitespaceIter::new(rhs, flags);
     -+
     -+    chunked_iter_equal(lhs_it, rhs_it)
      +}
     +
     + ## rust/xdiff/src/xtypes.rs (new) ##
     +@@
     ++use interop::ivec::IVec;
      +
     ++#[repr(C)]
     ++pub(crate) struct xrecord {
     ++    pub(crate) ptr: *const u8,
     ++    pub(crate) size: usize,
     ++    pub(crate) ha: u64,
     ++}
      +
     - #[cfg(test)]
     - mod tests {
     -     use crate::*;
     ++#[repr(C)]
     ++pub(crate) struct xdfile {
     ++    pub(crate) record: IVec<xrecord>,
     ++    pub(crate) dstart: isize,
     ++    pub(crate) dend: isize,
     ++    pub(crate) rchg: *mut u8,
     ++    pub(crate) rindex: *mut usize,
     ++    pub(crate) nreff: usize,
     ++    pub(crate) ha: *mut u64,
     ++}
     +
     + ## xdiff/xprepare.c ##
     +@@ xdiff/xprepare.c: static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
     + }
     + 
     + 
     +-/*
     +- * Early trim initial and terminal matching records.
     +- */
     +-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
     +-	long i, lim;
     +-	xrecord_t *recs1, *recs2;
     +-
     +-	recs1 = xdf1->record.ptr;
     +-	recs2 = xdf2->record.ptr;
     +-	for (i = 0, lim = XDL_MIN(xdf1->record.length, xdf2->record.length); i < lim;
     +-	     i++, recs1++, recs2++)
     +-		if (recs1->ha != recs2->ha)
     +-			break;
     +-
     +-	xdf1->dstart = xdf2->dstart = i;
     +-
     +-	recs1 = xdf1->record.ptr + xdf1->record.length - 1;
     +-	recs2 = xdf2->record.ptr + xdf2->record.length - 1;
     +-	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
     +-		if (recs1->ha != recs2->ha)
     +-			break;
     +-
     +-	xdf1->dend = xdf1->record.length - i - 1;
     +-	xdf2->dend = xdf2->record.length - i - 1;
     +-
     +-	return 0;
     +-}
     ++extern i32 xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
     + 
     + 
     + static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
 16:  f7829c55871 <  -:  ----------- xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag
 17:  395609aff4b <  -:  ----------- xdiff: use rust's version of whitespace processing
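
For readers following the xdl_trim_ends() conversion in the range-diff
above, here is a minimal, self-contained sketch of the trimming
arithmetic. It is not the code from the series: `Record`, `Trimmed`,
`records()` and this slice-based `trim_ends()` signature are hypothetical
stand-ins chosen so the example compiles without IVec or xdfile. The
removed C function assigns dstart and dend unconditionally after each
loop, whereas the Rust loops shown above assign them only when a mismatch
is found; the sketch follows the C behaviour.

```rust
/// Hypothetical stand-in for xrecord; only the line hash matters here.
pub struct Record {
    pub ha: u64,
}

/// Hypothetical result type: the first differing index and the last
/// differing index in each file (dend is -1 when a file is empty).
#[derive(Debug, PartialEq)]
pub struct Trimmed {
    pub dstart: isize,
    pub dend1: isize,
    pub dend2: isize,
}

/// Build a record list from raw hash values (illustration only).
pub fn records(hashes: &[u64]) -> Vec<Record> {
    hashes.iter().map(|&ha| Record { ha }).collect()
}

/// Count the matching records at the start and end of both files and
/// derive dstart/dend from those counts, as the removed C code does.
pub fn trim_ends(a: &[Record], b: &[Record]) -> Trimmed {
    let lim = a.len().min(b.len());

    // Leading run of equal hashes; zip() stops at the shorter file.
    let prefix = a
        .iter()
        .zip(b.iter())
        .take_while(|(x, y)| x.ha == y.ha)
        .count();

    // Trailing run of equal hashes, limited so that it never overlaps
    // the records already consumed by the prefix.
    let suffix = a
        .iter()
        .rev()
        .zip(b.iter().rev())
        .take(lim - prefix)
        .take_while(|(x, y)| x.ha == y.ha)
        .count();

    Trimmed {
        dstart: prefix as isize,
        dend1: a.len() as isize - suffix as isize - 1,
        dend2: b.len() as isize - suffix as isize - 1,
    }
}
```

With two identical inputs the prefix consumes every record, so dstart
equals the record count and dend is one less than that, which is the
empty range the C code produces.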

-- 
gitgitgadget


* [PATCH v3 01/15] doc: add a policy for using Rust
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` brian m. carlson via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 02/15] xdiff: introduce rust Ezekiel Newren via GitGitGadget
                       ` (13 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: brian m. carlson via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, brian m. carlson

From: "brian m. carlson" <sandals@crustytoothpaste.net>

Git has historically been written primarily in C, with some shell and
Perl.  However, C is not memory safe, which makes it more likely that
security vulnerabilities or other bugs will be introduced, and it is
also more verbose and less ergonomic than other, more modern languages.

One of the most common modern compiled languages which is easily
interoperable with C is Rust.  It is popular (the most admired language
on the 2024 Stack Overflow Developer Survey), efficient, portable, and
robust.

Introduce a document laying out the incremental introduction of Rust to
Git and provide a detailed rationale for doing so, including the points
above.  Propose a design for this approach that addresses the needs of
downstreams and distributors, as well as contributors.

Since we don't want to carry both a C and Rust version of code and want
to be able to add new features only in Rust, mention that Rust is a
required part of our platform support policy.

It should be noted that a recent discussion at the Berlin Git Merge
Contributor Summit found widespread support for the addition of Rust to
Git.  While of course not all contributors were represented, the
proposal appeared to have the support of a majority of active
contributors.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
[en: Added some comments about types, and changed the recommendations
     about cbindgen, bindgen, rustix, libc.]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
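
Reviewer note (not part of the commit message): the c_long concern
described in rust-support.adoc below can be illustrated with a minimal
sketch. The helper names c_long_width and fixed_widths are hypothetical
and do not appear anywhere in this series; std::os::raw::c_long is used
only to show the ambiguity that the document recommends avoiding.

```rust
use std::mem::size_of;
use std::os::raw::c_long;

/// Width in bytes of C's `long` for the target being compiled.
/// Hypothetical helper: typically 8 on 64-bit Linux and 4 on 64-bit
/// Windows, so the answer depends on the platform.
pub fn c_long_width() -> usize {
    size_of::<c_long>()
}

/// Widths in bytes of `i32` and `i64`. These are fixed by the Rust
/// language and do not vary between targets.
pub fn fixed_widths() -> (usize, usize) {
    (size_of::<i32>(), size_of::<i64>())
}
```

A struct field declared as c_long may therefore change layout between
platforms, while a field declared as i32 or i64 should not, which is
the motivation for settling on unambiguous types before crossing the
C/Rust boundary.
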
 Documentation/Makefile                        |   1 +
 Documentation/technical/platform-support.adoc |   2 +
 Documentation/technical/rust-support.adoc     | 142 ++++++++++++++++++
 3 files changed, 145 insertions(+)
 create mode 100644 Documentation/technical/rust-support.adoc

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b109d25e9c80..066b761c01b9 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -127,6 +127,7 @@ TECH_DOCS += technical/parallel-checkout
 TECH_DOCS += technical/partial-clone
 TECH_DOCS += technical/platform-support
 TECH_DOCS += technical/racy-git
+TECH_DOCS += technical/rust-support
 TECH_DOCS += technical/reftable
 TECH_DOCS += technical/scalar
 TECH_DOCS += technical/send-pack-pipeline
diff --git a/Documentation/technical/platform-support.adoc b/Documentation/technical/platform-support.adoc
index 0a2fb28d6277..dc71672dcb57 100644
--- a/Documentation/technical/platform-support.adoc
+++ b/Documentation/technical/platform-support.adoc
@@ -33,6 +33,8 @@ meet the following minimum requirements:
 
 * Has active security support (taking security releases of dependencies, etc)
 
+* Supports Rust and the toolchain version specified in link:rust-support.adoc[].
+
 These requirements are a starting point, and not sufficient on their own for the
 Git community to be enthusiastic about supporting your platform. Maintainers of
 platforms which do meet these requirements can follow the steps below to make it
diff --git a/Documentation/technical/rust-support.adoc b/Documentation/technical/rust-support.adoc
new file mode 100644
index 000000000000..57a001fa2d7b
--- /dev/null
+++ b/Documentation/technical/rust-support.adoc
@@ -0,0 +1,142 @@
+Usage of Rust in Git
+====================
+
+Objective
+---------
+Introduce Rust into Git incrementally to improve security and maintainability.
+
+Background
+----------
+Git has historically been written primarily in C, with some portions in shell,
+Perl, or other languages.  At the time it was originally written, this was
+important for portability and was a logical choice for software development.
+
+:0: link:https://security.googleblog.com/2024/09/eliminating-memory-safety-vulnerabilities-Android.html
+:1: link:https://www.cisa.gov/resources-tools/resources/product-security-bad-practices
+
+However, as time has progressed, we've seen an increased concern with memory
+safety vulnerabilities and the development of newer languages, such as Rust,
+that substantially limit or eliminate this class of vulnerabilities.
+Development in a variety of projects has found that memory safety
+vulnerabilities constitute about 70% of vulnerabilities of software in
+languages that are not memory safe.  For instance, {0}[one survey of Android]
+found that memory safety vulnerabilities decreased from 76% to 24% over six
+years due to an increase in memory safe code.  Similarly, the U.S. government
+is {1}[proposing to classify development in memory unsafe languages as a
+"Product Security Bad Practice"].
+
+These risks are even more substantial when we consider the fact that Git is a
+network-facing service.  Many organizations run Git servers internally or use a
+cloud-based forge, and the risk of accidental exposure or compromise of user
+data is substantial.  It's important to ensure that Git, whether it's used
+locally or remotely, is robustly secure.
+
+In addition, C is a difficult language to write well and concisely.  While it
+is of course possible to do anything with C, it lacks built-in support for
+niceties found in modern languages, such as hash tables, generics, typed
+errors, and automatic destruction, and most modern languages offer shorter, more
+ergonomic syntax for expressing code.  This is valuable functionality that can
+allow Git to be developed more rapidly, more easily, by more developers of a
+variety of levels, and with more confidence in the correctness of the code.
+
+For these reasons, adding Rust to Git is a sensible and prudent move that will
+allow us to improve the quality of the code and potentially attract new developers.
+
+Goals
+-----
+1. Git continues to build, run, and pass tests on a wide variety of operating
+   systems and architectures.
+2. Transition from C to Rust is incremental; that is, code can be ported as it
+   is convenient and Git does not need to transition all at once.
+3. Git continues to support older operating systems in conformance with the
+   platform support policy.
+
+Non-Goals
+---------
+1. Support for every possible operating system and architecture.  Git already
+   has a platform support policy which defines what is supported and we already
+   exclude some operating systems for various reasons (e.g., lacking enough POSIX
+   tools to pass the test suite).
+2. Implementing C-only versions of Rust code or compiling a C-only Git.  This
+   would be difficult to maintain and would not offer the ergonomic benefits we
+   desire.
+
+Design
+------
+Git will adopt Rust incrementally.  This transition will start with the
+creation of a static library that can be linked into the existing Git binaries.
+At some point, we may wish to expose a dynamic library and compile the Git
+binaries themselves using Rust.  Using an incremental approach allows us to
+determine as we go along how to structure our code in the best way for the
+project and avoids the need to make hard, potentially disruptive, transitions
+caused by porting a binary wholesale from one language to another that might
+introduce bugs.
+
+Crates like libc or rustix define types like c_long, but in ways that are not
+safe across platforms.
+From https://docs.rs/rustix/latest/rustix/ffi/type.c_long.html:
+
+    This type will always be i32 or i64.  Most notably, many Linux-based
+    systems assume an i64, but Windows assumes i32.  The C standard technically
+    only requires that this type be a signed integer that is at least 32 bits
+    and at least the size of an int, although in practice, no system would
+    have a long that is neither an i32 nor i64.
+
+Also, note that other locations, such as
+https://docs.rs/libc/latest/libc/type.c_long.html, just hardcode c_long as i64
+even though C may mean i32 on some platforms.
+
+As such, using the c_long type would give us portability issues, and
+perpetuate some of the bugs git has faced across platforms.  Avoid using C's
+types (long, unsigned, char, etc.), and switch to unambiguous types (e.g. i32
+or i64) before trying to make C and Rust interoperate.
+
+Crates like libc and rustix may have also traditionally aided interoperability
+with older versions of Rust (e.g.  when worrying about stat[64] system calls),
+but the Rust standard library in newer versions of Rust handles these concerns
+in a platform agnostic way.  There may arise cases where we need to consider
+these crates, but for now we omit them.
+
+Tools like bindgen and cbindgen create C-styled unsafe Rust code rather than
+idiomatic Rust; where possible, we prefer to switch to idiomatic Rust.  Any
+standard C library functions that are needed can be manually wrapped on the
+Rust side.
+
+Rust upstream releases every six weeks and only supports the latest stable
+release.  While it is nice that upstream is active, we would like our software
+releases to have a lifespan exceeding six weeks.  To allow compiling our code
+on a variety of systems, we will support the version of Rust in Debian stable,
+plus, for a year after a new Debian stable is released, the version in Debian
+oldstable.
+
+This provides an approximately three-year lifespan of support for a Rust
+release and allows us to support a variety of operating systems and
+architectures, including those for which Rust upstream does not build binaries.
+Debian stable is the benchmark distribution used by many Rust projects when
+determining supported Rust versions, and it is an extremely portable and
+popular free software operating system that is available to the public at no
+charge, which makes it a sensible choice for us as well.
+
+We may change this policy if the Rust project issues long-term support releases
+or the Rust community and distributors agree on releases to target as if they
+were long-term support releases.
+
+This version support policy necessitates that we be very careful about the
+dependencies we include, since many Rust projects support only the latest
+stable version.  However, we typically have been careful about dependencies in
+the first place, so this should not be a major departure from existing policy,
+although it may be a change for some existing Rust developers.
+
+We will avoid including the `Cargo.lock` file in the repository and instead
+specify minimum dependency versions in the `Cargo.toml` file.  We want to allow
+people to use newer versions of dependencies if necessary to support newer
+platforms without needing to force upgrades of dependencies on all users, and
+it provides additional flexibility for distribution maintainers.
+
+We do not plan to support beta or nightly versions of the Rust compiler.  These
+versions may change rapidly and especially parts of the toolchain such as
+Clippy, the lint tool, can have false positives or add additional warnings with
+too great of a frequency to be supportable by the project.  However, we do plan
+to support alternate compilers, such as the rust_codegen_gcc backend and gccrs
+when they are stable and support our desired release versions.  This will
+provide greater support for more operating systems and architectures.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 01/15] doc: add a policy for using Rust brian m. carlson via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23 13:43       ` rsbecker
  2025-08-23  3:55     ` [PATCH v3 03/15] github workflows: install rust Ezekiel Newren via GitGitGadget
                       ` (12 subsequent siblings)
  14 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Upcoming patches will simplify xdiff, while also porting parts of it to
Rust. In preparation, add some stubs and set up the Rust build. For now,
it is easier to let cargo build rust and have make or meson merely link
against the static library that cargo builds. In line with ongoing
libification efforts, use multiple crates to allow more modularity on
the Rust side. xdiff is the crate that this series will focus on, but
we also introduce the interop crate for future patch series.

In order to facilitate interoperability between C and Rust, introduce C
definitions for Rust primitive types in git-compat-util.h.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
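
Reviewer note (not part of the commit message): a minimal sketch of how
a function in one of the Rust crates might be exposed to C using the
primitive typedefs added to git-compat-util.h. The function name
example_common_prefix and its behaviour are hypothetical and are not
taken from this series; the sketch assumes edition 2021, as in the
Cargo.toml files below. The matching C declaration would be:

    extern usize example_common_prefix(const u64 *a, usize a_len,
                                       const u64 *b, usize b_len);

```rust
/// Hypothetical example: count how many leading elements of two hash
/// arrays are equal. Returns 0 if either pointer is null.
///
/// # Safety
/// `a` must point to `a_len` readable u64 values and `b` must point to
/// `b_len` readable u64 values.
#[no_mangle]
pub unsafe extern "C" fn example_common_prefix(
    a: *const u64,
    a_len: usize,
    b: *const u64,
    b_len: usize,
) -> usize {
    if a.is_null() || b.is_null() {
        return 0;
    }
    // Reinterpret the raw C arrays as Rust slices for the duration of
    // the call; the caller keeps ownership of the memory.
    let (a, b) = unsafe {
        (
            std::slice::from_raw_parts(a, a_len),
            std::slice::from_raw_parts(b, b_len),
        )
    };
    // zip() stops at the shorter slice, so no explicit minimum is needed.
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}
```

Because u64 and usize in the C declaration are the typedefs from
git-compat-util.h, the two sides agree on argument widths without
relying on long or unsigned.
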
 .gitignore              |  3 +++
 Makefile                | 53 ++++++++++++++++++++++++++++----------
 build_rust.sh           | 57 +++++++++++++++++++++++++++++++++++++++++
 git-compat-util.h       | 17 ++++++++++++
 meson.build             | 52 +++++++++++++++++++++++++++++++------
 rust/Cargo.toml         |  6 +++++
 rust/interop/Cargo.toml | 14 ++++++++++
 rust/interop/src/lib.rs |  0
 rust/xdiff/Cargo.toml   | 15 +++++++++++
 rust/xdiff/src/lib.rs   |  0
 10 files changed, 196 insertions(+), 21 deletions(-)
 create mode 100755 build_rust.sh
 create mode 100644 rust/Cargo.toml
 create mode 100644 rust/interop/Cargo.toml
 create mode 100644 rust/interop/src/lib.rs
 create mode 100644 rust/xdiff/Cargo.toml
 create mode 100644 rust/xdiff/src/lib.rs

diff --git a/.gitignore b/.gitignore
index 04c444404e4b..ff81e3580c4e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -254,3 +254,6 @@ Release/
 /contrib/buildsystems/out
 /contrib/libgit-rs/target
 /contrib/libgit-sys/target
+/.idea/
+/rust/target/
+/rust/Cargo.lock
diff --git a/Makefile b/Makefile
index 70d1543b6b86..1ec0c1ee6603 100644
--- a/Makefile
+++ b/Makefile
@@ -919,6 +919,29 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+
+EXTLIBS =
+
+ifeq ($(DEBUG), 1)
+  RUST_BUILD_MODE = debug
+else
+  RUST_BUILD_MODE = release
+endif
+
+RUST_TARGET_DIR = rust/target/$(RUST_BUILD_MODE)
+RUST_FLAGS_FOR_C = -L$(RUST_TARGET_DIR)
+
+.PHONY: compile_rust
+compile_rust:
+	./build_rust.sh . $(RUST_BUILD_MODE) xdiff
+
+EXTLIBS += ./$(RUST_TARGET_DIR)/libxdiff.a
+
+UNAME_S := $(shell uname -s)
+ifeq ($(UNAME_S),Linux)
+  EXTLIBS += -ldl
+endif
+
 REFTABLE_LIB = reftable/libreftable.a
 
 GENERATED_H += command-list.h
@@ -1390,7 +1413,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 
 # xdiff and reftable libs may in turn depend on what is in libgit.a
 GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
-EXTLIBS =
+
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
 
@@ -2541,7 +2564,7 @@ git.sp git.s git.o: EXTRA_CPPFLAGS = \
 	'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
 	'-DGIT_INFO_PATH="$(infodir_relative_SQ)"'
 
-git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
+git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -2891,17 +2914,17 @@ headless-git.o: compat/win32/headless.c GIT-CFLAGS
 headless-git$X: headless-git.o git.res GIT-LDFLAGS
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) $(ALL_LDFLAGS) -mwindows -o $@ $< git.res
 
-git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
+git-%$X: %.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
-git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS)
+git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(IMAP_SEND_LDFLAGS) $(LIBS)
 
-git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
+git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(LIBS)
-git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
+git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
@@ -2911,11 +2934,11 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
 	ln -s $< $@ 2>/dev/null || \
 	cp $< $@
 
-$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS)
+$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
-scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS)
+scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(LIBS)
 
@@ -2925,6 +2948,7 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+
 $(REFTABLE_LIB): $(REFTABLE_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
@@ -3294,7 +3318,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) $(UNIT_TEST_DIR)/test-lib.o
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) compile_rust
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3756,7 +3780,10 @@ cocciclean:
 	$(RM) -r .build/contrib/coccinelle
 	$(RM) contrib/coccinelle/*.cocci.patch
 
-clean: profile-clean coverage-clean cocciclean
+rustclean:
+	cd rust && cargo clean
+
+clean: profile-clean coverage-clean cocciclean rustclean
 	$(RM) -r .build $(UNIT_TEST_BIN)
 	$(RM) GIT-TEST-SUITES
 	$(RM) po/git.pot po/git-core.pot
@@ -3911,13 +3938,13 @@ FUZZ_CXXFLAGS ?= $(ALL_CFLAGS)
 .PHONY: fuzz-all
 fuzz-all: $(FUZZ_PROGRAMS)
 
-$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS
+$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(QUIET_LINK)$(FUZZ_CXX) $(FUZZ_CXXFLAGS) -o $@ $(ALL_LDFLAGS) \
 		-Wl,--allow-multiple-definition \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS) $(LIB_FUZZING_ENGINE)
 
 $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_OBJS) \
-	$(GITLIBS) GIT-LDFLAGS
+	$(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
@@ -3936,7 +3963,7 @@ $(UNIT_TEST_DIR)/clar.suite: $(UNIT_TEST_DIR)/clar-decls.h $(UNIT_TEST_DIR)/gene
 $(UNIT_TEST_DIR)/clar/clar.o: $(UNIT_TEST_DIR)/clar.suite
 $(CLAR_TEST_OBJS): $(UNIT_TEST_DIR)/clar-decls.h
 $(CLAR_TEST_OBJS): EXTRA_CPPFLAGS = -I$(UNIT_TEST_DIR)
-$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS
+$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS compile_rust
 	$(call mkdir_p_parent_template)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
diff --git a/build_rust.sh b/build_rust.sh
new file mode 100755
index 000000000000..192385a1d961
--- /dev/null
+++ b/build_rust.sh
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+
+rustc -vV || exit $?
+cargo --version || exit $?
+
+dir_git_root=${0%/*}
+dir_build=$1
+rust_build_profile=$2
+crate=$3
+
+dir_rust=$dir_git_root/rust
+
+if [ "$dir_git_root" = "" ]; then
+  echo "did not specify the directory for the root of git"
+  exit 1
+fi
+
+if [ "$dir_build" = "" ]; then
+  echo "did not specify the build directory"
+  exit 1
+fi
+
+if [ "$rust_build_profile" = "" ]; then
+  echo "did not specify the rust_build_profile"
+  exit 1
+fi
+
+if [ "$rust_build_profile" = "release" ]; then
+  rust_args="--release"
+  export RUSTFLAGS=''
+elif [ "$rust_build_profile" = "debug" ]; then
+  rust_args=""
+  export RUSTFLAGS='-C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
+else
+  echo "illegal rust_build_profile value $rust_build_profile"
+  exit 1
+fi
+
+cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args; cd $dir_git_root
+
+libfile="lib${crate}.a"
+if rustup show active-toolchain | grep windows-msvc; then
+  libfile="${crate}.lib"
+fi
+dst=$dir_build/$libfile
+
+if [ "$dir_git_root" != "$dir_build" ]; then
+  src=$dir_rust/target/$rust_build_profile/$libfile
+  if [ ! -f $src ]; then
+    echo >&2 "::error:: cannot find path of static library $src is not a file or does not exist"
+    exit 5
+  fi
+
+  rm $dst 2>/dev/null
+  mv $src $dst
+fi
diff --git a/git-compat-util.h b/git-compat-util.h
index 4678e21c4cb8..82dc99764ac0 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -196,6 +196,23 @@ static inline int is_xplatform_dir_sep(int c)
 #include "compat/msvc.h"
 #endif
 
+/* rust types */
+typedef uint8_t   u8;
+typedef uint16_t  u16;
+typedef uint32_t  u32;
+typedef uint64_t  u64;
+
+typedef int8_t    i8;
+typedef int16_t   i16;
+typedef int32_t   i32;
+typedef int64_t   i64;
+
+typedef float     f32;
+typedef double    f64;
+
+typedef size_t    usize;
+typedef ptrdiff_t isize;
+
 /* used on Mac OS X */
 #ifdef PRECOMPOSE_UNICODE
 #include "compat/precompose_utf8.h"
diff --git a/meson.build b/meson.build
index 596f5ac7110e..324f968338b9 100644
--- a/meson.build
+++ b/meson.build
@@ -267,6 +267,40 @@ version_gen_environment.set('GIT_DATE', get_option('build_date'))
 version_gen_environment.set('GIT_USER_AGENT', get_option('user_agent'))
 version_gen_environment.set('GIT_VERSION', get_option('version'))
 
+if get_option('optimization') in ['2', '3', 's', 'z']
+  rust_build_profile = 'release'
+else
+  rust_build_profile = 'debug'
+endif
+
+# Run `rustup show active-toolchain` and capture output
+rustup_out = run_command('rustup', 'show', 'active-toolchain',
+                         check: true).stdout().strip()
+
+rust_crates = ['xdiff']
+rust_builds = []
+
+foreach crate : rust_crates
+  if rustup_out.contains('windows-msvc')
+    libfile = crate + '.lib'
+  else
+    libfile = 'lib' + crate + '.a'
+  endif
+
+  rust_builds += custom_target(
+    'rust_build_'+crate,
+    output: libfile,
+    build_by_default: true,
+    build_always_stale: true,
+    command: [
+      meson.project_source_root() / 'build_rust.sh',
+      meson.current_build_dir(), rust_build_profile, crate,
+    ],
+    install: false,
+  )
+endforeach
+
+
 compiler = meson.get_compiler('c')
 
 libgit_sources = [
@@ -1678,14 +1712,16 @@ version_def_h = custom_target(
 libgit_sources += version_def_h
 
 libgit = declare_dependency(
-  link_with: static_library('git',
-    sources: libgit_sources,
-    c_args: libgit_c_args + [
-      '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
-    ],
-    dependencies: libgit_dependencies,
-    include_directories: libgit_include_directories,
-  ),
+  link_with: [
+    static_library('git',
+      sources: libgit_sources,
+      c_args: libgit_c_args + [
+        '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
+      ],
+      dependencies: libgit_dependencies,
+      include_directories: libgit_include_directories,
+    ),
+  ] + rust_builds,
   compile_args: libgit_c_args,
   dependencies: libgit_dependencies,
   include_directories: libgit_include_directories,
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
new file mode 100644
index 000000000000..ed3d79d7f827
--- /dev/null
+++ b/rust/Cargo.toml
@@ -0,0 +1,6 @@
+[workspace]
+members = [
+    "xdiff",
+    "interop",
+]
+resolver = "2"
diff --git a/rust/interop/Cargo.toml b/rust/interop/Cargo.toml
new file mode 100644
index 000000000000..045e3b01cfad
--- /dev/null
+++ b/rust/interop/Cargo.toml
@@ -0,0 +1,14 @@
+[package]
+name = "interop"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "interop"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
diff --git a/rust/interop/src/lib.rs b/rust/interop/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
new file mode 100644
index 000000000000..eb7966aada64
--- /dev/null
+++ b/rust/xdiff/Cargo.toml
@@ -0,0 +1,15 @@
+[package]
+name = "xdiff"
+version = "0.1.0"
+edition = "2021"
+
+[lib]
+name = "xdiff"
+path = "src/lib.rs"
+## staticlib to generate xdiff.a for use by gcc
+## cdylib (optional) to generate xdiff.so for use by gcc
+## rlib is required by the rust unit tests
+crate-type = ["staticlib", "rlib"]
+
+[dependencies]
+interop = { path = "../interop" }
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
new file mode 100644
index 000000000000..e69de29bb2d1
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v3 03/15] github workflows: install rust
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 01/15] doc: add a policy for using Rust brian m. carlson via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 02/15] xdiff: introduce rust Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 04/15] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
                       ` (11 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Prefer using actions-rs/toolchain@v1 where possible to install rustup,
but for docker targets use a script to install rustup. Consolidate the
Rust toolchain definitions in main.yml. Use install-rust-toolchain.sh
to ensure the correct toolchain is used. Five overrides are used in
main.yml:

  * On Windows: Rust didn't resolve the bcrypt library on Windows
    correctly until version 1.78.0. Also since rustup mis-identifies
    the Rust toolchain, the Rust target triple must be set to
    x86_64-pc-windows-gnu for make (win build), and
    x86_64-pc-windows-msvc for meson (win+Meson build).
  * On musl: libc differences, such as ftruncate64 vs ftruncate, were
    not accounted for until Rust version 1.72.0. No older version of
    Rust will work on musl for our needs.
  * In a 32-bit docker container running on a 64-bit host, we need to
    override the Rust target triple. This is because rustup asks the
    kernel for the bitness of the system and it says 64, even though
    the container is 32-bit. This also allows us to remove the
    BITNESS environment variable in ci/lib.sh.

The logic for selecting library names was initially provided in a patch
from Johannes, but was reworked and squashed into this commit.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
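
Reviewer note (not part of the commit message): a minimal sketch of why
the target triple matters for the 32-bit container case. The values
below are fixed when the code is compiled for a given target; they are
not queried from the running kernel. The helper names target_bits and
target_arch are hypothetical and are not part of this series.

```rust
/// Pointer width, in bits, of the target this code was compiled for.
/// Built for i686-unknown-linux-gnu this is 32, even when the binary
/// runs on a 64-bit kernel.
pub fn target_bits() -> u32 {
    usize::BITS
}

/// Architecture name of the compilation target, for example "x86" or
/// "x86_64".
pub fn target_arch() -> &'static str {
    std::env::consts::ARCH
}
```

If the toolchain is built for a 64-bit target inside a 32-bit userland,
the resulting static library would not match the 32-bit C objects it is
linked against, hence the explicit target override.
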
 .github/workflows/main.yml   | 61 +++++++++++++++++++++++++++++++++++-
 ci/install-dependencies.sh   | 14 ++++-----
 ci/install-rust-toolchain.sh | 30 ++++++++++++++++++
 ci/install-rustup.sh         | 25 +++++++++++++++
 ci/lib.sh                    |  1 +
 ci/make-test-artifacts.sh    |  9 ++++++
 ci/run-build-and-tests.sh    | 13 ++++++++
 7 files changed, 145 insertions(+), 8 deletions(-)
 create mode 100755 ci/install-rust-toolchain.sh
 create mode 100755 ci/install-rustup.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 7dbf9f7f123c..2fa5fab0fa83 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -26,6 +26,13 @@ jobs:
     outputs:
       enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }}
       skip_concurrent: ${{ steps.check-ref.outputs.skip_concurrent }}
+      rust_version_minimum: 1.61.0
+      rust_version_windows: 1.78.0
+      rust_version_musl: 1.72.0
+      ## the rust target is inferred by rustup unless specified
+      rust_target_windows_make: x86_64-pc-windows-gnu
+      rust_target_windows_meson: x86_64-pc-windows-msvc
+      rust_target_32bit_linux: i686-unknown-linux-gnu
     steps:
       - name: try to clone ci-config branch
         run: |
@@ -108,12 +115,26 @@ jobs:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
     runs-on: windows-latest
+    env:
+      CARGO_HOME: "/c/Users/runneradmin/.cargo"
     concurrency:
       group: windows-build-${{ github.ref }}
       cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
     steps:
     - uses: actions/checkout@v4
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
+    - name: Install rustup via github actions
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: stable
+        profile: minimal
+        override: false
+    - name: Install Rust toolchain
+      shell: bash
+      env:
+        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_windows }}
+        RUST_TARGET: ${{ needs.ci-config.outputs.rust_target_windows_make }}
+      run: ci/install-rust-toolchain.sh
     - name: build
       shell: bash
       env:
@@ -254,12 +275,26 @@ jobs:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
     runs-on: windows-latest
+    env:
+      CARGO_HOME: "/c/Users/runneradmin/.cargo"
     concurrency:
       group: windows-meson-build-${{ github.ref }}
       cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
     steps:
     - uses: actions/checkout@v4
     - uses: actions/setup-python@v5
+    - name: Install rustup via github actions
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: stable
+        profile: minimal
+        override: false
+    - name: Install Rust toolchain
+      shell: bash
+      env:
+        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_windows }}
+        RUST_TARGET: ${{ needs.ci-config.outputs.rust_target_windows_meson }}
+      run: ci/install-rust-toolchain.sh
     - name: Set up dependencies
       shell: pwsh
       run: pip install meson ninja
@@ -329,11 +364,24 @@ jobs:
       jobname: ${{matrix.vector.jobname}}
       CI_JOB_IMAGE: ${{matrix.vector.pool}}
       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
+      CARGO_HOME: "/Users/runner/.cargo"
     runs-on: ${{matrix.vector.pool}}
     steps:
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
-    - run: ci/run-build-and-tests.sh
+    - name: Install rustup via github actions
+      uses: actions-rs/toolchain@v1
+      with:
+        toolchain: stable
+        profile: minimal
+        override: false
+    - name: Install Rust toolchain
+      shell: bash
+      env:
+        RUST_VERSION: ${{ needs.ci-config.outputs.rust_version_minimum }}
+      run: ci/install-rust-toolchain.sh
+    - name: Run build and tests
+      run: ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       run: ci/print-test-failures.sh
@@ -393,9 +441,11 @@ jobs:
           cc: gcc
         - jobname: linux-musl-meson
           image: alpine:latest
+          rust_version_override: ${{ needs.ci-config.outputs.rust_version_musl }}
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
+          rust_target_override: ${{ needs.ci-config.outputs.rust_target_32bit_linux }}
         - jobname: pedantic
           image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
@@ -408,7 +458,9 @@ jobs:
       jobname: ${{matrix.vector.jobname}}
       CC: ${{matrix.vector.cc}}
       CI_JOB_IMAGE: ${{matrix.vector.image}}
+      CI_IS_DOCKER: "true"
       CUSTOM_PATH: /custom
+      CARGO_HOME: /home/builder/.cargo
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
@@ -433,6 +485,13 @@ jobs:
     - run: ci/install-dependencies.sh
     - run: useradd builder --create-home
     - run: chown -R builder .
+    - name: Install rustup via script
+      run: sudo --preserve-env --set-home --user=builder ci/install-rustup.sh
+    - name: Install Rust toolchain
+      env:
+        RUST_VERSION: ${{ matrix.vector.rust_version_override || needs.ci-config.outputs.rust_version_minimum }}
+        RUST_TARGET: ${{ matrix.vector.rust_target_override || '' }}
+      run: sudo --preserve-env --set-home --user=builder ci/install-rust-toolchain.sh
     - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d061a4729339..7801075821ba 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -24,14 +24,14 @@ fi
 
 case "$distro" in
 alpine-*)
-	apk add --update shadow sudo meson ninja-build gcc libc-dev curl-dev openssl-dev expat-dev gettext \
+	apk add --update shadow sudo meson ninja-build gcc libc-dev curl curl-dev openssl-dev expat-dev gettext \
 		zlib-ng-dev pcre2-dev python3 musl-libintl perl-utils ncurses \
 		apache2 apache2-http2 apache2-proxy apache2-ssl apache2-webdav apr-util-dbd_sqlite3 \
 		bash cvs gnupg perl-cgi perl-dbd-sqlite perl-io-tty >/dev/null
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gawk gettext zlib-devel expat-devel openssl-devel curl curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
@@ -55,8 +55,8 @@ ubuntu-*|i386/ubuntu-*|debian-*)
 	sudo apt-get -q update
 	sudo apt-get -q -y install \
 		$LANGUAGES apache2 cvs cvsps git gnupg $SVN \
-		make libssl-dev libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
-		tcl tk gettext zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
+		make libssl-dev curl libcurl4-openssl-dev libexpat-dev wget sudo default-jre \
+		tcl tk gettext zlib1g zlib1g-dev perl-modules liberror-perl libauthen-sasl-perl \
 		libemail-valid-perl libio-pty-perl libio-socket-ssl-perl libnet-smtp-ssl-perl libdbd-sqlite3-perl libcgi-pm-perl \
 		libsecret-1-dev libpcre2-dev meson ninja-build pkg-config \
 		${CC_PACKAGE:-${CC:-gcc}} $PYTHON_PACKAGE
@@ -121,13 +121,13 @@ ClangFormat)
 	;;
 StaticAnalysis)
 	sudo apt-get -q update
-	sudo apt-get -q -y install coccinelle libcurl4-openssl-dev libssl-dev \
+	sudo apt-get -q -y install coccinelle curl libcurl4-openssl-dev libssl-dev \
 		libexpat-dev gettext make
 	;;
 sparse)
 	sudo apt-get -q update -q
-	sudo apt-get -q -y install libssl-dev libcurl4-openssl-dev \
-		libexpat-dev gettext zlib1g-dev sparse
+	sudo apt-get -q -y install libssl-dev curl libcurl4-openssl-dev \
+		libexpat-dev gettext zlib1g zlib1g-dev sparse
 	;;
 Documentation)
 	sudo apt-get -q update
diff --git a/ci/install-rust-toolchain.sh b/ci/install-rust-toolchain.sh
new file mode 100755
index 000000000000..06a29c4cfa17
--- /dev/null
+++ b/ci/install-rust-toolchain.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
+  exit 2
+fi
+export PATH="$CARGO_HOME/bin:$PATH"
+rustup -vV || exit $?
+
+## Enforce the correct Rust toolchain
+rustup override unset || true
+
+## install a specific version of rust
+if [ "$RUST_TARGET" != "" ]; then
+  rustup default --force-non-host "$RUST_VERSION-$RUST_TARGET" || exit $?
+else
+  rustup default "$RUST_VERSION" || exit $?
+fi
+
+rustc -vV || exit $?
+
+RE_RUST_TARGET="$RUST_TARGET"
+if [ "$RUST_TARGET" = "" ]; then
+  RE_RUST_TARGET="[^ ]+"
+fi
+
+if ! rustup show active-toolchain | grep -E "^$RUST_VERSION-$RE_RUST_TARGET \(default\)$"; then
+  echo >&2 "::error:: wrong Rust toolchain, active-toolchain: $(rustup show active-toolchain)"
+  exit 3
+fi
diff --git a/ci/install-rustup.sh b/ci/install-rustup.sh
new file mode 100755
index 000000000000..0036231aeea7
--- /dev/null
+++ b/ci/install-rustup.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+## github workflows actions-rs/toolchain@v1 doesn't work for docker
+## targets. This script should only be used if the ci pipeline
+## doesn't support installing rust on a particular target.
+
+if [ "$(id -u)" -eq 0 ]; then
+  echo >&2 "::warning:: installing rust as root"
+fi
+
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
+  exit 2
+fi
+
+export RUSTUP_HOME=$CARGO_HOME
+
+## install rustup
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain none -y
+if [ ! -f "$CARGO_HOME/env" ]; then
+  echo "PATH=$CARGO_HOME/bin:\$PATH" > "$CARGO_HOME/env"
+fi
+. "$CARGO_HOME/env"
+
+rustup -vV
diff --git a/ci/lib.sh b/ci/lib.sh
index f561884d4016..a7992b22fdc9 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,6 @@
 # Library of functions shared by all CI scripts
 
+
 if test true = "$GITHUB_ACTIONS"
 then
 	begin_group () {
diff --git a/ci/make-test-artifacts.sh b/ci/make-test-artifacts.sh
index 74141af0cc74..e37ed7030cdf 100755
--- a/ci/make-test-artifacts.sh
+++ b/ci/make-test-artifacts.sh
@@ -7,6 +7,15 @@ mkdir -p "$1" # in case ci/lib.sh decides to quit early
 
 . ${0%/*}/lib.sh
 
+## ensure rustup is in the PATH variable
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
+  exit 2
+fi
+export PATH="$CARGO_HOME/bin:$PATH"
+
+rustc -vV
+
 group Build make artifacts-tar ARTIFACTS_DIRECTORY="$1"
 
 check_unignored_build_artifacts
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 01823fd0f140..22b61e2812db 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,6 +5,15 @@
 
 . ${0%/*}/lib.sh
 
+## ensure rustup is in the PATH variable
+if [ "$CARGO_HOME" = "" ]; then
+  echo >&2 "::error:: CARGO_HOME is not set"
+  exit 2
+fi
+. $CARGO_HOME/env
+
+rustc -vV || exit $?
+
 run_tests=t
 
 case "$jobname" in
@@ -72,5 +81,9 @@ case "$jobname" in
 	;;
 esac
 
+if [ -d "$CARGO_HOME" ]; then
+  rm -rf "$CARGO_HOME"
+fi
+
 check_unignored_build_artifacts
 save_good_tree
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v3 04/15] win+Meson: do allow linking with the Rust-built xdiff
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 03/15] github workflows: install rust Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Johannes Schindelin via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 05/15] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
                       ` (10 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When linking against the Rust-built `xdiff`, there is now a new required
dependency: without _also_ linking to the system library `userenv`, the
link step fails with this error message:

  xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) :
  error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW
  referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E

Therefore, just like we do in case of Makefile-based builds on Windows,
we now also link to that library when building with Meson.

Note that if we only have Rust depend upon libuserenv then at link time
GCC would complain about:

  undefined reference to `GetUserProfileDirectoryW'

Apparently there is _some_ closure that gets compiled in that requires
this function, and that in turn forces Git to link to libuserenv.

This is a new requirement, and therefore has not been made part of the
"minimal Git for Windows SDK".

In the near future, I intend to include it, but for now let's just
ensure that the file is added manually if it is missing.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: Squashed a few of Johannes's patches, and moved lib userenv
 handling from an earlier patch]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml | 8 ++++++++
 config.mak.uname           | 4 ++++
 meson.build                | 1 +
 3 files changed, 13 insertions(+)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 2fa5fab0fa83..0f7396621df8 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -123,6 +123,14 @@ jobs:
     steps:
     - uses: actions/checkout@v4
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
+    - name: ensure that libuserenv.a is present
+      shell: bash
+      run: |
+        cd /mingw64/lib && {
+          test -f libuserenv.a ||
+          /c/Program\ Files/Git/mingw64/bin/curl -Lo libuserenv.a \
+            https://github.com/git-for-windows/git-sdk-64/raw/HEAD/mingw64/lib/libuserenv.a
+        }
     - name: Install rustup via github actions
       uses: actions-rs/toolchain@v1
       with:
diff --git a/config.mak.uname b/config.mak.uname
index 3e26bb074a4b..6805e3778a16 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -740,6 +740,10 @@ ifeq ($(uname_S),MINGW)
 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
 		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
+
+	# Unfortunately now needed because of Rust
+	EXTLIBS += -luserenv
+
 	CC = gcc
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong
diff --git a/meson.build b/meson.build
index 324f968338b9..5aa9901bfc0f 100644
--- a/meson.build
+++ b/meson.build
@@ -1267,6 +1267,7 @@ elif host_machine.system() == 'windows'
   ]
 
   libgit_dependencies += compiler.find_library('ntdll')
+  libgit_dependencies += compiler.find_library('userenv')
   libgit_include_directories += 'compat/win32'
   if compiler.get_id() == 'msvc'
     libgit_include_directories += 'compat/vcbuild/include'
-- 
gitgitgadget



* [PATCH v3 05/15] github workflows: upload Cargo.lock
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (3 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 04/15] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
                       ` (9 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Make each ci workflow upload its Cargo.lock file as a build artifact so
that we can audit build dependencies.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 .github/workflows/main.yml | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 0f7396621df8..0f8785a676c3 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -156,6 +156,11 @@ jobs:
       with:
         name: windows-artifacts
         path: artifacts
+    - name: upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-windows
+        path: rust/Cargo.lock
   windows-test:
     name: win test
     runs-on: windows-latest
@@ -317,6 +322,11 @@ jobs:
       with:
         name: windows-meson-artifacts
         path: build
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-windows-meson
+        path: rust/Cargo.lock
   windows-meson-test:
     name: win+Meson test
     runs-on: windows-latest
@@ -399,6 +409,11 @@ jobs:
       with:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-${{matrix.vector.jobname}}
+        path: rust/Cargo.lock
   fuzz-smoke-test:
     name: fuzz smoke test
     needs: ci-config
@@ -510,6 +525,11 @@ jobs:
       with:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
+    - name: Upload Cargo.lock
+      uses: actions/upload-artifact@v4
+      with:
+        name: cargo-lock-${{matrix.vector.jobname}}
+        path: rust/Cargo.lock
   static-analysis:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
-- 
gitgitgadget



* [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (4 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 05/15] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  8:12       ` Kristoffer Haugsbakk
                         ` (3 more replies)
  2025-08-23  3:55     ` [PATCH v3 07/15] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
                       ` (8 subsequent siblings)
  14 siblings, 4 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
wrapper functions) in Rust is painful because:

  * C doing vector things the Rust way would require wrapper functions,
    and Rust doing vector things the C way would require wrapper
    functions, so ivec was created to ensure a consistent contract
    between the 2 languages for how to manipulate a vector.
  * Currently, Rust defines its own 'Vec' type that is generic, but its
    memory allocator and struct layout weren't designed for
    interoperability with C (or any language for that matter), meaning
    that the C side cannot push to or expand a 'Vec' without defining
    wrapper functions in Rust that C can call. Without special care,
    the two languages might use different allocators (malloc/free on
    the C side, and possibly something else in Rust), which would make
    it difficult for a function in one language to free elements
    allocated by a call from a function in the other language.
  * Similarly, git defines ALLOC_GROW() and related macros in
    git-compat-util.h. While we could add functions allowing Rust to
    invoke something similar to those macros, passing three variables
    (pointer, length, allocated_size) instead of a single variable
    (vector) across the language boundary requires more cognitive
    overhead for readers to keep track of and makes it easier to make
    mistakes. Further, for low-level components that we want to
    eventually convert to pure Rust, such triplets would feel very out
    of place.
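
As a rough illustration of the triplet style described in the last
bullet, here is a minimal sketch. GROW_ARRAY is a simplified,
hypothetical stand-in written for this example; it is not git's
ALLOC_GROW(), it uses plain realloc() rather than git's wrappers, and
it does no overflow checking. append_int() is likewise an invented
name.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for an ALLOC_GROW()-style macro.
 * The element size comes from sizeof(*(ptr)), i.e. from the pointer's
 * type, which is why this has to be a macro rather than a function.
 */
#define GROW_ARRAY(ptr, needed, alloc) \
	do { \
		if ((needed) > (alloc)) { \
			size_t new_alloc_ = ((alloc) + 16) * 3 / 2; \
			void *new_ptr_; \
			if (new_alloc_ < (needed)) \
				new_alloc_ = (needed); \
			new_ptr_ = realloc((ptr), new_alloc_ * sizeof(*(ptr))); \
			if (!new_ptr_) \
				abort(); \
			(ptr) = new_ptr_; \
			(alloc) = new_alloc_; \
		} \
	} while (0)

/* Every caller has to pass all three variables and keep them in sync. */
static void append_int(int **items, size_t *nr, size_t *alloc, int value)
{
	GROW_ARRAY(*items, *nr + 1, *alloc);
	(*items)[(*nr)++] = value;
}
```

Any function that needs to append has to carry the pointer, the length
and the allocated size separately, which is the bookkeeping a single
vector variable is meant to remove.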

To address these issues, introduce a new type, ivec -- short for
interoperable vector. (We refer to it as 'ivec' generally, though on
the Rust side the struct is called IVec to match Rust style.)  This new
type is specifically designed for FFI purposes, so that both languages
handle the vector in the same way, though it could be used on either
side independently. This type is designed such that it can easily be
replaced by a standard Rust 'Vec' once interoperability is no longer a
concern.

One particular item to note is that Git's macros to handle vec
operations infer the element size from the type of the pointer they
are given (via sizeof(*ptr)), which ties them to being macros in C.
To avoid defining every ivec function as a macro, I opted to also
include an element_size field that allows concrete functions like
push() to know how much to grow the memory. This element_size also
helps in verifying that the ivec is correct when passing from C to
Rust.
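
To make the role of element_size concrete, here is a minimal
standalone sketch. The names raw_vec, raw_vec_init, raw_vec_push and
raw_vec_free are invented for this example, and it assumes plain
realloc() and a simple doubling growth policy rather than the
allocator wrappers and growth rule used in the patch below.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative struct; field order follows the description above. */
struct raw_vec {
	void *ptr;
	size_t length;
	size_t capacity;
	size_t element_size;
};

static void raw_vec_init(struct raw_vec *v, size_t element_size)
{
	v->ptr = NULL;
	v->length = 0;
	v->capacity = 0;
	v->element_size = element_size;
}

/*
 * An ordinary function, not a macro: the stored element_size takes
 * the place of sizeof(*ptr), so no type information is needed here.
 */
static void raw_vec_push(struct raw_vec *v, const void *value)
{
	if (v->length == v->capacity) {
		/* Assumed growth policy for this sketch: start at 8, then double. */
		size_t new_capacity = v->capacity ? v->capacity * 2 : 8;
		void *p = realloc(v->ptr, new_capacity * v->element_size);
		if (!p)
			abort();
		v->ptr = p;
		v->capacity = new_capacity;
	}
	memcpy((unsigned char *)v->ptr + v->length * v->element_size,
	       value, v->element_size);
	v->length++;
}

/* Release the buffer but keep element_size so the vector can be reused. */
static void raw_vec_free(struct raw_vec *v)
{
	free(v->ptr);
	v->ptr = NULL;
	v->length = 0;
	v->capacity = 0;
}
```

A caller would initialize with raw_vec_init(&v, sizeof(uint32_t)) and
then push by address; the same three functions work for any element
type, and a receiver on the other side of the language boundary can
compare element_size against the size it expects.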

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 Makefile                 |  16 +-
 interop/ivec.c           | 151 +++++++++++++
 interop/ivec.h           |  52 +++++
 meson.build              |   1 +
 rust/interop/src/ivec.rs | 462 +++++++++++++++++++++++++++++++++++++++
 rust/interop/src/lib.rs  |  10 +
 rust/xdiff/src/lib.rs    |   1 +
 7 files changed, 690 insertions(+), 3 deletions(-)
 create mode 100644 interop/ivec.c
 create mode 100644 interop/ivec.h
 create mode 100644 rust/interop/src/ivec.rs

diff --git a/Makefile b/Makefile
index 1ec0c1ee6603..29a53520fd28 100644
--- a/Makefile
+++ b/Makefile
@@ -672,6 +672,7 @@ BUILTIN_OBJS =
 BUILT_INS =
 COMPAT_CFLAGS =
 COMPAT_OBJS =
+INTEROP_OBJS =
 XDIFF_OBJS =
 GENERATED_H =
 EXTRA_CPPFLAGS =
@@ -918,6 +919,7 @@ export PYTHON_PATH
 TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
+INTEROP_LIB = interop/lib.a
 XDIFF_LIB = xdiff/lib.a
 
 EXTLIBS =
@@ -1412,7 +1414,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
 
 # xdiff and reftable libs may in turn depend on what is in libgit.a
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
+GITLIBS = common-main.o $(LIB_FILE) $(INTEROP_LIB) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2747,6 +2749,10 @@ reconfigure config.mak.autogen: config.status
 .PHONY: reconfigure # This is a convenience target.
 endif
 
+INTEROP_OBJS += interop/ivec.o
+.PHONY: interop-objs
+interop-objs: $(INTEROP_OBJS)
+
 XDIFF_OBJS += xdiff/xdiffi.o
 XDIFF_OBJS += xdiff/xemit.o
 XDIFF_OBJS += xdiff/xhistogram.o
@@ -2791,6 +2797,7 @@ OBJECTS += $(GIT_OBJS)
 OBJECTS += $(SCALAR_OBJS)
 OBJECTS += $(PROGRAM_OBJS)
 OBJECTS += $(TEST_OBJS)
+OBJECTS += $(INTEROP_OBJS)
 OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
 OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
@@ -2945,7 +2952,10 @@ scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS) compile_rust
 $(LIB_FILE): $(LIB_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
-$(XDIFF_LIB): $(XDIFF_OBJS)
+$(INTEROP_LIB): $(INTEROP_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(XDIFF_LIB): $(INTEROP_OBJS) $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
 
@@ -3790,7 +3800,7 @@ clean: profile-clean coverage-clean cocciclean rustclean
 	$(RM) git.rc git.res
 	$(RM) $(OBJECTS)
 	$(RM) headless-git.o
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
+	$(RM) $(LIB_FILE) $(INTEROP_LIB) $(XDIFF_LIB) $(REFTABLE_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/interop/ivec.c b/interop/ivec.c
new file mode 100644
index 000000000000..9bc2258c04ad
--- /dev/null
+++ b/interop/ivec.c
@@ -0,0 +1,151 @@
+#include "ivec.h"
+
+static void ivec_set_capacity(void* self, usize new_capacity) {
+	struct rawivec *this = self;
+	if (new_capacity == 0)
+		FREE_AND_NULL(this->ptr);
+	else
+		this->ptr = xrealloc(this->ptr, new_capacity * this->element_size);
+	this->capacity = new_capacity;
+}
+
+void ivec_init(void* self, usize element_size) {
+	struct rawivec *this = self;
+	this->ptr = NULL;
+	this->length = 0;
+	this->capacity = 0;
+	this->element_size = element_size;
+}
+
+/*
+ * MUST CALL IVEC_INIT() FIRST!!!
+ * This function will free the ivec, set self.capacity and self.length
+ * to the specified capacity, and then calloc self.capacity number of
+ * elements.
+ */
+void ivec_zero(void* self, usize capacity) {
+	struct rawivec *this = self;
+	if (this->ptr)
+		FREE_AND_NULL(this->ptr);
+	this->capacity = this->length = capacity;
+	this->ptr = xcalloc(this->capacity, this->element_size);
+}
+
+void ivec_clear(void* self) {
+	struct rawivec *this = self;
+	this->length = 0;
+}
+
+void ivec_reserve_exact(void* self, usize additional) {
+	struct rawivec *this = self;
+	usize new_capacity = this->capacity + additional;
+	ivec_set_capacity(self, new_capacity);
+}
+
+void ivec_reserve(void* self, usize additional) {
+	struct rawivec *this = self;
+	usize growby = 128;
+	if (this->capacity > growby) {
+		growby = this->capacity;
+	}
+	if (additional > growby) {
+		growby = additional;
+	}
+	ivec_reserve_exact(self, growby);
+}
+
+void ivec_shrink_to_fit(void* self) {
+	struct rawivec *this = self;
+	ivec_set_capacity(self, this->length);
+}
+
+void ivec_resize(void* self, usize new_length, void* default_value) {
+	struct rawivec *this = self;
+	isize additional = (isize) (new_length - this->capacity);
+	if (additional > 0) {
+		ivec_reserve(self, additional);
+	}
+
+	for (usize i = this->length; i < new_length; i++) {
+		void* dst = (u8*) this->ptr + i * this->element_size;
+		memcpy(dst, default_value, this->element_size);
+	}
+	this->length = new_length;
+}
+
+void ivec_push(void* self, void* value) {
+	struct rawivec *this = self;
+	u8* dst;
+
+	if (this->length == this->capacity) {
+		ivec_reserve(self, 1);
+	}
+	dst = (u8*) this->ptr + this->length * this->element_size;
+	memcpy(dst, value, this->element_size);
+	this->length++;
+}
+
+void ivec_extend_from_slice(void* self, void const* ptr, usize size) {
+	struct rawivec *this = self;
+	u8* dst;
+
+	if (size == 0)
+		return;
+
+	if (this->length + size > this->capacity) {
+		ivec_reserve(self, this->capacity - this->length + size);
+	}
+	dst = (u8*) this->ptr + this->length * this->element_size;
+	memcpy(dst, ptr, size * this->element_size);
+	this->length += size;
+}
+
+bool ivec_equal(void* self, void* other) {
+	struct rawivec *lhs = self;
+	struct rawivec *rhs = other;
+
+	if (lhs->element_size != rhs->element_size) {
+		return false;
+	}
+
+	if (lhs->length != rhs->length) {
+		return false;
+	}
+
+
+	for (usize i = 0; i < lhs->length; i++) {
+		void* left = (u8 *) lhs->ptr + i * lhs->element_size;
+		void* right = (u8 *) rhs->ptr + i * rhs->element_size;
+		if (memcmp(left, right, lhs->element_size) != 0) {
+			return false;
+		}
+	}
+
+	return true;
+}
+
+
+void ivec_free(void* self) {
+	struct rawivec *this = self;
+	FREE_AND_NULL(this->ptr);
+	this->length = 0;
+	this->capacity = 0;
+	/* don't modify self->element_size */
+}
+
+void ivec_move(void* source, void* destination) {
+	struct rawivec *this = source;
+	struct rawivec *that = destination;
+
+	if (this->element_size != that->element_size)
+		BUG("mismatched element_size");
+
+	ivec_free(destination);
+	that->ptr = this->ptr;
+	that->length = this->length;
+	that->capacity = this->capacity;
+
+	this->ptr = NULL;
+	this->length = 0;
+	this->capacity = 0;
+}
diff --git a/interop/ivec.h b/interop/ivec.h
new file mode 100644
index 000000000000..98be4bbeb54a
--- /dev/null
+++ b/interop/ivec.h
@@ -0,0 +1,52 @@
+#ifndef IVEC_H
+#define IVEC_H
+
+#include "../git-compat-util.h"
+
+struct rawivec {
+	void* ptr;
+	usize length;
+	usize capacity;
+	usize element_size;
+};
+
+#define DEFINE_IVEC_TYPE(type, suffix) \
+struct ivec_##suffix { \
+	type* ptr; \
+	size_t length; \
+	size_t capacity; \
+	size_t element_size; \
+}
+
+#define IVEC_INIT(variable) ivec_init(&(variable), sizeof(*(variable).ptr))
+
+DEFINE_IVEC_TYPE(u8, u8);
+DEFINE_IVEC_TYPE(u16, u16);
+DEFINE_IVEC_TYPE(u32, u32);
+DEFINE_IVEC_TYPE(u64, u64);
+
+DEFINE_IVEC_TYPE(i8, i8);
+DEFINE_IVEC_TYPE(i16, i16);
+DEFINE_IVEC_TYPE(i32, i32);
+DEFINE_IVEC_TYPE(i64, i64);
+
+DEFINE_IVEC_TYPE(f32, f32);
+DEFINE_IVEC_TYPE(f64, f64);
+
+DEFINE_IVEC_TYPE(usize, usize);
+DEFINE_IVEC_TYPE(isize, isize);
+
+void ivec_init(void* self, usize element_size);
+void ivec_zero(void* self, usize capacity);
+void ivec_clear(void* self);
+void ivec_reserve_exact(void* self, usize additional);
+void ivec_reserve(void* self, usize additional);
+void ivec_shrink_to_fit(void* self);
+void ivec_resize(void* self, usize new_length, void* default_value);
+void ivec_push(void* self, void* value);
+void ivec_extend_from_slice(void* self, void const* ptr, usize size);
+bool ivec_equal(void* self, void* other);
+void ivec_free(void* self);
+void ivec_move(void* source, void* destination);
+
+#endif //IVEC_H
diff --git a/meson.build b/meson.build
index 5aa9901bfc0f..fc7c133f79d8 100644
--- a/meson.build
+++ b/meson.build
@@ -395,6 +395,7 @@ libgit_sources = [
   'hex-ll.c',
   'hook.c',
   'ident.c',
+  'interop/ivec.c',
   'json-writer.c',
   'kwset.c',
   'levenshtein.c',
diff --git a/rust/interop/src/ivec.rs b/rust/interop/src/ivec.rs
new file mode 100644
index 000000000000..4e047572b922
--- /dev/null
+++ b/rust/interop/src/ivec.rs
@@ -0,0 +1,462 @@
+use crate::*;
+use core::mem::{align_of, size_of};
+use std::fmt::{Debug, Formatter};
+use std::ops::{Index, IndexMut};
+
+#[repr(C)]
+pub struct IVec<T> {
+    ptr: *mut T,
+    length: usize,
+    capacity: usize,
+    element_size: usize,
+}
+
+impl<T> Default for IVec<T> {
+    fn default() -> Self {
+        Self::new()
+    }
+}
+
+impl<T> Drop for IVec<T> {
+    fn drop(&mut self) {
+        unsafe {
+            self._free();
+        }
+    }
+}
+
+impl<T: Clone> Clone for IVec<T> {
+    fn clone(&self) -> Self {
+        let mut copy = Self::new();
+        copy.reserve_exact(self.len());
+        for i in 0..self.len() {
+            copy.push(self[i].clone());
+        }
+
+        copy
+    }
+}
+
+impl<T: PartialEq> PartialEq for IVec<T> {
+    fn eq(&self, other: &Self) -> bool {
+        if self.len() != other.len() {
+            return false;
+        }
+
+        let lhs = self.as_slice();
+        let rhs = &other.as_slice()[..lhs.len()];
+        for i in 0..lhs.len() {
+            if lhs[i] != rhs[i] {
+                return false;
+            }
+        }
+
+        true
+    }
+}
+
+impl<T: PartialEq> Eq for IVec<T> {}
+
+/*
+ * constructors
+ */
+impl<T> IVec<T> {
+    pub fn new() -> Self {
+        Self {
+            ptr: std::ptr::null_mut(),
+            length: 0,
+            capacity: 0,
+            element_size: size_of::<T>(),
+        }
+    }
+
+    /// uses calloc to create the IVec, it's unsafe because
+    /// zeroed memory may not be a valid default value
+    pub unsafe fn zero(capacity: usize) -> Self {
+        Self {
+            ptr: calloc(capacity, size_of::<T>()) as *mut T,
+            length: capacity,
+            capacity,
+            element_size: size_of::<T>(),
+        }
+    }
+
+    pub fn with_capacity(capacity: usize) -> Self {
+        let mut vec = Self::new();
+        vec._set_capacity(capacity);
+        vec
+    }
+
+    pub fn with_capacity_and_default(capacity: usize, default_value: T) -> Self
+    where
+        T: Copy,
+    {
+        let mut vec = Self::new();
+        vec._set_capacity(capacity);
+        vec._buffer_mut().fill(default_value);
+        vec
+    }
+
+    pub unsafe fn from_raw_mut<'a>(raw: *mut Self) -> &'a mut Self {
+        let vec = raw.as_mut().expect("null pointer");
+        #[cfg(debug_assertions)]
+        vec.test_invariants();
+        vec
+    }
+
+    pub unsafe fn from_raw<'a>(raw: *const Self) -> &'a Self {
+        let vec = &*raw.as_ref().expect("null pointer");
+        #[cfg(debug_assertions)]
+        vec.test_invariants();
+        vec
+    }
+}
+
+/*
+ * private methods
+ */
+impl<T> IVec<T> {
+    pub fn test_invariants(&self) {
+        if !self.ptr.is_null() && (self.ptr as usize) % align_of::<T>() != 0 {
+            panic!(
+                "misaligned pointer: expected {:x}, got {:x}",
+                align_of::<T>(),
+                self.ptr as usize
+            );
+        }
+        if self.ptr.is_null() && (self.length > 0 || self.capacity > 0) {
+            panic!("ptr is null, but length or capacity is > 0");
+        }
+        if !self.ptr.is_null() && self.capacity == 0 {
+            panic!("ptr ISN'T null, but capacity == 0");
+        }
+        if self.element_size != size_of::<T>() {
+            panic!(
+                "incorrect element size, should be: {}, but was: {}",
+                size_of::<T>(),
+                self.element_size
+            );
+        }
+        if self.length > self.capacity {
+            panic!("length: {} > capacity: {}", self.length, self.capacity);
+        }
+        if self.capacity > usize::MAX / size_of::<T>() {
+            panic!(
+                "Capacity {} is too large, potential overflow detected",
+                self.capacity
+            );
+        }
+    }
+
+    fn _zero(&mut self) {
+        self.ptr = std::ptr::null_mut();
+        self.length = 0;
+        self.capacity = 0;
+        // DO NOT MODIFY element_size!!!
+    }
+
+    unsafe fn _free(&mut self) {
+        free(self.ptr as *mut std::ffi::c_void);
+        self._zero();
+    }
+
+    fn _set_capacity(&mut self, new_capacity: usize) {
+        unsafe {
+            if new_capacity == self.capacity {
+                return;
+            }
+            if new_capacity == 0 {
+                self._free();
+            } else {
+                let t = realloc(
+                    self.ptr as *mut std::ffi::c_void,
+                    new_capacity * size_of::<T>(),
+                );
+                if t.is_null() {
+                    panic!("out of memory");
+                }
+                self.ptr = t as *mut T;
+            }
+            self.capacity = new_capacity;
+        }
+    }
+
+    fn _resize(&mut self, new_length: usize, default_value: T, exact: bool)
+    where
+        T: Copy,
+    {
+        if exact {
+            self._set_capacity(new_length);
+        } else if new_length > self.capacity {
+            self.reserve(new_length - self.capacity);
+        } else {
+            /* capacity does not need to be changed */
+        }
+
+        if new_length > self.length {
+            let range = self.length..new_length;
+            self._buffer_mut()[range].fill(default_value);
+        }
+
+        self.length = new_length;
+    }
+
+    fn _buffer_mut(&mut self) -> &mut [T] {
+        if self.ptr.is_null() {
+            &mut []
+        } else {
+            unsafe { std::slice::from_raw_parts_mut(self.ptr, self.capacity) }
+        }
+    }
+
+    fn _buffer(&self) -> &[T] {
+        if self.ptr.is_null() {
+            &[]
+        } else {
+            unsafe { std::slice::from_raw_parts(self.ptr, self.capacity) }
+        }
+    }
+}
+
+/*
+ * methods
+ */
+impl<T> IVec<T> {
+    pub fn len(&self) -> usize {
+        self.length
+    }
+
+    pub unsafe fn set_len(&mut self, new_length: usize) {
+        self.length = new_length;
+    }
+
+    pub fn capacity(&self) -> usize {
+        self.capacity
+    }
+
+    pub fn reserve_exact(&mut self, additional: usize) {
+        self._set_capacity(self.capacity + additional);
+    }
+
+    pub fn reserve(&mut self, additional: usize) {
+        let growby = std::cmp::max(128, self.capacity);
+        self.reserve_exact(std::cmp::max(additional, growby));
+    }
+
+    pub fn shrink_to_fit(&mut self) {
+        self._set_capacity(self.length);
+    }
+
+    pub fn resize(&mut self, new_length: usize, default_value: T)
+    where
+        T: Copy,
+    {
+        self._resize(new_length, default_value, false);
+    }
+
+    pub fn resize_exact(&mut self, new_length: usize, default_value: T)
+    where
+        T: Copy,
+    {
+        self._resize(new_length, default_value, true);
+    }
+
+    pub fn insert(&mut self, index: usize, value: T) {
+        if self.length == self.capacity {
+            self.reserve(1);
+        }
+
+        unsafe {
+            let src = &self._buffer()[index] as *const T;
+            let dst = src.add(1) as *mut T;
+            let len = self.length - index;
+            std::ptr::copy(src, dst, len);
+            std::ptr::write(self.ptr.add(index), value);
+        }
+    }
+
+    pub fn push(&mut self, value: T) {
+        if self.length == self.capacity {
+            self.reserve(1);
+        }
+
+        let i = self.length;
+        unsafe {
+            std::ptr::write(self.ptr.add(i), value);
+        }
+        self.length += 1;
+    }
+
+    pub fn extend_from_slice(&mut self, slice: &[T])
+    where
+        T: Clone,
+    {
+        for v in slice {
+            self.push(v.clone());
+        }
+    }
+
+    pub fn clear(&mut self) {
+        self.length = 0;
+    }
+
+    pub fn as_ptr(&self) -> *const T {
+        self.ptr
+    }
+
+    pub fn as_mut_ptr(&self) -> *mut T {
+        self.ptr
+    }
+
+    pub fn as_slice(&self) -> &[T] {
+        &self._buffer()[0..self.length]
+    }
+
+    pub fn as_mut_slice(&mut self) -> &mut [T] {
+        let range = 0..self.length;
+        &mut self._buffer_mut()[range]
+    }
+}
+
+impl<T> Extend<T> for IVec<T> {
+    fn extend<IT: IntoIterator<Item = T>>(&mut self, iter: IT) {
+        for v in iter {
+            self.push(v);
+        }
+    }
+}
+
+impl<T> Index<usize> for IVec<T> {
+    type Output = T;
+
+    fn index(&self, index: usize) -> &Self::Output {
+        &self.as_slice()[index]
+    }
+}
+
+impl<T> IndexMut<usize> for IVec<T> {
+    fn index_mut(&mut self, index: usize) -> &mut Self::Output {
+        &mut self.as_mut_slice()[index]
+    }
+}
+
+impl<T: Debug> Debug for IVec<T> {
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        writeln!(
+            f,
+            "ptr: {}, capacity: {}, len: {}, element_size: {}, content: {:?}",
+            self.ptr as usize,
+            self.capacity,
+            self.length,
+            self.element_size,
+            self.as_slice()
+        )
+    }
+}
+
+impl std::fmt::Write for IVec<u8> {
+    fn write_str(&mut self, s: &str) -> std::fmt::Result {
+        Ok(self.extend_from_slice(s.as_bytes()))
+    }
+}
+
+impl std::io::Write for IVec<u8> {
+    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
+        self.extend_from_slice(buf);
+        Ok(buf.len())
+    }
+
+    fn flush(&mut self) -> std::io::Result<()> {
+        Ok(())
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use crate::ivec::IVec;
+    use std::panic;
+
+    #[test]
+    fn test_panic_on_out_of_bounds() {
+        type TestType = i16;
+        let result = panic::catch_unwind(|| {
+            let mut v = IVec::<TestType>::with_capacity(1_000_000);
+            v[0] = 55;
+        });
+
+        match result {
+            Ok(_) => assert!(false, "index was out of bounds, but no panic was triggered"),
+            Err(_) => assert!(true),
+        }
+    }
+
+    #[test]
+    fn test_push_clear_resize_then_shrink_to_fit() {
+        let mut vec = IVec::<u64>::new();
+        let mut monotonic = 1;
+
+        vec.reserve_exact(1);
+        assert_eq!(1, vec.capacity);
+
+        // test push
+        for _ in 0..10 {
+            vec.push(monotonic);
+            assert_eq!(monotonic as usize, vec.length);
+            assert_eq!(monotonic, vec[(monotonic - 1) as usize]);
+            assert!(vec.capacity >= vec.length);
+            monotonic += 1;
+        }
+
+        // test clear
+        let expected = vec.capacity;
+        vec.clear();
+        assert_eq!(0, vec.length);
+        assert_eq!(expected, vec.capacity);
+
+        // test resize
+        let expected = vec.capacity + 10;
+        let default_value = 19;
+        vec.resize(expected, default_value);
+        // assert_eq!(vec.capacity, vec.slice.len());
+        assert_eq!(expected, vec.length);
+        assert!(vec.capacity >= expected);
+        for i in 0..vec.length {
+            assert_eq!(default_value, vec[i]);
+        }
+
+        vec.reserve(10);
+        // assert_eq!(vec.capacity, vec.slice.len());
+        assert!(vec.capacity > vec.length);
+        let length_before = vec.length;
+        vec.shrink_to_fit();
+        assert_eq!(length_before, vec.length);
+        assert_eq!(vec.length, vec.capacity);
+        // assert_eq!(vec.capacity, vec.slice.len());
+    }
+
+    #[test]
+    fn test_struct_size() {
+        let vec = IVec::<i16>::new();
+
+        assert_eq!(2, vec.element_size);
+        assert_eq!(size_of::<usize>() * 4, size_of::<IVec<i16>>());
+
+        drop(vec);
+
+        let vec = IVec::<u128>::new();
+        assert_eq!(16, vec.element_size);
+        assert_eq!(size_of::<usize>() * 4, size_of::<IVec<u128>>());
+    }
+
+    #[test]
+    fn test_manual_free() {
+        type TestType = i16;
+        let mut vec = IVec::<TestType>::new();
+
+        unsafe { vec._free() };
+        assert!(vec.ptr.is_null());
+        assert_eq!(0, vec.length);
+        assert_eq!(0, vec.capacity);
+        assert_eq!(size_of::<TestType>(), vec.element_size);
+    }
+}
diff --git a/rust/interop/src/lib.rs b/rust/interop/src/lib.rs
index e69de29bb2d1..4850f66e5bd9 100644
--- a/rust/interop/src/lib.rs
+++ b/rust/interop/src/lib.rs
@@ -0,0 +1,10 @@
+pub mod ivec;
+
+use std::ffi::c_void;
+
+extern "C" {
+    pub fn malloc(size: usize) -> *mut c_void;
+    pub fn calloc(nmemb: usize, size: usize) -> *mut c_void;
+    pub fn realloc(ptr: *mut c_void, size: usize) -> *mut c_void;
+    pub fn free(ptr: *mut c_void);
+}
diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index e69de29bb2d1..8b137891791f 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -0,0 +1 @@
+
-- 
gitgitgadget



* [PATCH v3 07/15] xdiff/xprepare: remove superfluous forward declarations
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (5 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 08/15] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
                       ` (7 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Move xdl_prepare_env() later in the file to avoid the need
for forward declarations.
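
As a minimal illustration of the reordering (hypothetical names, not
taken from xprepare.c): when static helpers are defined above their
only caller, C needs no separate prototypes for them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, defined before their caller: no prototypes needed. */
static int init_state(int *state, int value)
{
	if (!state)
		return -1;
	*state = value;
	return 0;
}

static void free_state(int *state)
{
	assert(state != NULL);
	*state = 0;
}

/* The entry point comes last, so every helper above is already visible. */
int prepare_env_sketch(int value)
{
	int state;
	int result;

	if (init_state(&state, value) < 0)
		return -1;
	result = state * 2;
	free_state(&state);
	return result;
}
```

Calling prepare_env_sketch(21) walks through both helpers without any
forward declaration appearing in the file.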

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 116 ++++++++++++++++++++---------------------------
 1 file changed, 50 insertions(+), 66 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e1d4017b2dde..a45c5ee208c8 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -53,21 +53,6 @@ typedef struct s_xdlclassifier {
 
 
 
-static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags);
-static void xdl_free_classifier(xdlclassifier_t *cf);
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec);
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
-			   xdlclassifier_t *cf, xdfile_t *xdf);
-static void xdl_free_ctx(xdfile_t *xdf);
-static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
-static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
-static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2);
-
-
-
-
 static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
 	cf->flags = flags;
 
@@ -242,57 +227,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
 }
 
 
-int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
-		    xdfenv_t *xe) {
-	long enl1, enl2, sample;
-	xdlclassifier_t cf;
-
-	memset(&cf, 0, sizeof(cf));
-
-	/*
-	 * For histogram diff, we can afford a smaller sample size and
-	 * thus a poorer estimate of the number of lines, as the hash
-	 * table (rhash) won't be filled up/grown. The number of lines
-	 * (nrecs) will be updated correctly anyway by
-	 * xdl_prepare_ctx().
-	 */
-	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
-		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
-
-	enl1 = xdl_guess_lines(mf1, sample) + 1;
-	enl2 = xdl_guess_lines(mf2, sample) + 1;
-
-	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
-		return -1;
-
-	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
-
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
-	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
-	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
-
-		xdl_free_ctx(&xe->xdf2);
-		xdl_free_ctx(&xe->xdf1);
-		xdl_free_classifier(&cf);
-		return -1;
-	}
-
-	xdl_free_classifier(&cf);
-
-	return 0;
-}
-
-
 void xdl_free_env(xdfenv_t *xe) {
 
 	xdl_free_ctx(&xe->xdf2);
@@ -460,3 +394,53 @@ static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2
 
 	return 0;
 }
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		    xdfenv_t *xe) {
+	long enl1, enl2, sample;
+	xdlclassifier_t cf;
+
+	memset(&cf, 0, sizeof(cf));
+
+	/*
+	 * For histogram diff, we can afford a smaller sample size and
+	 * thus a poorer estimate of the number of lines, as the hash
+	 * table (rhash) won't be filled up/grown. The number of lines
+	 * (nrecs) will be updated correctly anyway by
+	 * xdl_prepare_ctx().
+	 */
+	sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
+		  ? XDL_GUESS_NLINES2 : XDL_GUESS_NLINES1);
+
+	enl1 = xdl_guess_lines(mf1, sample) + 1;
+	enl2 = xdl_guess_lines(mf2, sample) + 1;
+
+	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
+		return -1;
+
+	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+
+	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
+	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
+	    xdl_optimize_ctxs(&cf, &xe->xdf1, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf2);
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	    }
+
+	xdl_free_classifier(&cf);
+
+	return 0;
+}
-- 
gitgitgadget



* [PATCH v3 08/15] xdiff: delete unnecessary fields from xrecord_t and xdfile_t
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (6 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 07/15] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 09/15] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
                       ` (6 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

xrecord_t.next, xdfile_t.hbits, and xdfile_t.rhash are initialized
but never read anywhere in the code. Remove them.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 24 +++---------------------
 xdiff/xtypes.h   |  3 ---
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index a45c5ee208c8..ad356281f939 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -91,8 +91,7 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
 }
 
 
-static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t **rhash,
-			       unsigned int hbits, xrecord_t *rec) {
+static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
 	long hi;
 	char const *line;
 	xdlclass_t *rcrec;
@@ -126,23 +125,17 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 	rec->ha = (unsigned long) rcrec->idx;
 
-	hi = (long) XDL_HASHLONG(rec->ha, hbits);
-	rec->next = rhash[hi];
-	rhash[hi] = rec;
-
 	return 0;
 }
 
 
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	unsigned int hbits;
-	long nrec, hsize, bsize;
+	long nrec, bsize;
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
 	xrecord_t **recs;
-	xrecord_t **rhash;
 	unsigned long *ha;
 	char *rchg;
 	long *rindex;
@@ -150,7 +143,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	ha = NULL;
 	rindex = NULL;
 	rchg = NULL;
-	rhash = NULL;
 	recs = NULL;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
@@ -158,11 +150,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	if (!XDL_ALLOC_ARRAY(recs, narec))
 		goto abort;
 
-	hbits = xdl_hashbits((unsigned int) narec);
-	hsize = 1 << hbits;
-	if (!XDL_CALLOC_ARRAY(rhash, hsize))
-		goto abort;
-
 	nrec = 0;
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
 		for (top = blk + bsize; cur < top; ) {
@@ -176,7 +163,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
-			if (xdl_classify_record(pass, cf, rhash, hbits, crec) < 0)
+			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
 		}
 	}
@@ -194,8 +181,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	xdf->nrec = nrec;
 	xdf->recs = recs;
-	xdf->hbits = hbits;
-	xdf->rhash = rhash;
 	xdf->rchg = rchg + 1;
 	xdf->rindex = rindex;
 	xdf->nreff = 0;
@@ -209,7 +194,6 @@ abort:
 	xdl_free(ha);
 	xdl_free(rindex);
 	xdl_free(rchg);
-	xdl_free(rhash);
 	xdl_free(recs);
 	xdl_cha_free(&xdf->rcha);
 	return -1;
@@ -217,8 +201,6 @@ abort:
 
 
 static void xdl_free_ctx(xdfile_t *xdf) {
-
-	xdl_free(xdf->rhash);
 	xdl_free(xdf->rindex);
 	xdl_free(xdf->rchg - 1);
 	xdl_free(xdf->ha);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8442bd436efe..8b8467360ecf 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,6 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	struct s_xrecord *next;
 	char const *ptr;
 	long size;
 	unsigned long ha;
@@ -48,8 +47,6 @@ typedef struct s_xrecord {
 typedef struct s_xdfile {
 	chastore_t rcha;
 	long nrec;
-	unsigned int hbits;
-	xrecord_t **rhash;
 	long dstart, dend;
 	xrecord_t **recs;
 	char *rchg;
-- 
gitgitgadget



* [PATCH v3 09/15] xdiff: make fields of xrecord_t Rust friendly
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (7 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 08/15] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 10/15] xdiff: use one definition for freeing xdfile_t Ezekiel Newren via GitGitGadget
                       ` (5 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

A few commits ago, we added definitions for Rust primitive types
to facilitate interoperability between C and Rust. Switch a few
variables to use these types, which, for now, requires adding
some casts.

Also change xdlclass_t::ha to be u64 to match xrecord_t::ha, as
pointed out by Johannes.
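
To show why the casts become necessary, here is a small sketch. The
typedefs below are assumptions made for illustration (the real
definitions come from the earlier commit and may differ), and
record_sketch, count_spaces and record_spaces are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the Rust-style primitive types. */
typedef uint8_t  u8;
typedef uint64_t u64;
typedef size_t   usize;

/* Hypothetical record mirroring the new field types of xrecord_t. */
struct record_sketch {
	u8 const *ptr;
	usize size;
	u64 ha;
};

/* Legacy-style helper that still takes char const * and long. */
static long count_spaces(char const *s, long len)
{
	long n = 0;

	for (long i = 0; i < len; i++)
		if (s[i] == ' ')
			n++;
	return n;
}

long record_spaces(struct record_sketch const *rec)
{
	/* u8 const * does not convert implicitly to char const *. */
	return count_spaces((char const *) rec->ptr, (long) rec->size);
}

long demo_record_spaces(void)
{
	static char const line[] = "a b c\n";
	struct record_sketch rec = { (u8 const *) line, sizeof(line) - 1, 0 };

	assert(rec.size == 6);
	return record_spaces(&rec);
}
```

The casts sit at the boundary between the new field types and callers
that still expect char const * and long.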

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xdiffi.c    |  8 ++++----
 xdiff/xemit.c     |  2 +-
 xdiff/xmerge.c    | 14 +++++++-------
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  |  8 ++++----
 xdiff/xtypes.h    |  6 +++---
 xdiff/xutils.c    |  4 ++--
 7 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 5a96e36dfbea..3b364c61f671 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -418,7 +418,7 @@ static int get_indent(xrecord_t *rec)
 	long i;
 	int ret = 0;
 
-	for (i = 0; i < rec->size; i++) {
+	for (i = 0; i < (long) rec->size; i++) {
 		char c = rec->ptr[i];
 
 		if (!XDL_ISSPACE(c))
@@ -1005,11 +1005,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 		rec = &xe->xdf1.recs[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*) rec[i]->ptr, rec[i]->size, flags);
 
 		rec = &xe->xdf2.recs[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline(rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*)rec[i]->ptr, rec[i]->size, flags);
 
 		xch->ignore = ignore;
 	}
@@ -1020,7 +1020,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
 	size_t i;
 
 	for (i = 0; i < xpp->ignore_regex_nr; i++)
-		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+		if (!regexec_buf(xpp->ignore_regex[i], (const char*) rec->ptr, rec->size, 1,
 				 &regmatch, 0))
 			return 1;
 
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 1d40c9cb4076..bbf7b7f8c862 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,7 +24,7 @@
 
 static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
 
-	*rec = xdf->recs[ri]->ptr;
+	*rec = (char const*) xdf->recs[ri]->ptr;
 
 	return xdf->recs[ri]->size;
 }
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index af40c88a5b36..6fa6ea61a208 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t **rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch(rec1[i]->ptr, rec1[i]->size,
-			rec2[i]->ptr, rec2[i]->size, flags);
+		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
+			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
 		if (!result)
 			return -1;
 	}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch(rec1->ptr, rec1->size,
-			    rec2->ptr, rec2->size, flags);
+	return xdl_recmatch((char const*) rec1->ptr, rec1->size,
+			    (char const*) rec2->ptr, rec2->size, flags);
 }
 
 /*
@@ -383,10 +383,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		 */
 		t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
 		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
-			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - t1.ptr;
+			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - (u8 const*) t1.ptr;
 		t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
 		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
-			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - t2.ptr;
+			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - (u8 const*) t2.ptr;
 		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
 			return -1;
 		if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
 static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
-		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
+		if (line_contains_alnum((char const*) xe->xdf2.recs[i]->ptr,
 				xe->xdf2.recs[i]->size))
 			return 1;
 	return 0;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d1937..986a3a3f749a 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 		return;
 	map->entries[index].line1 = line;
 	map->entries[index].hash = record->ha;
-	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1]->ptr);
+	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.recs[line - 1]->ptr);
 	if (!map->first)
 		map->first = map->entries + index;
 	if (map->last) {
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index ad356281f939..00cdf7d8a038 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -32,7 +32,7 @@
 
 typedef struct s_xdlclass {
 	struct s_xdlclass *next;
-	unsigned long ha;
+	u64 ha;
 	char const *line;
 	long size;
 	long idx;
@@ -96,12 +96,12 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	char const *line;
 	xdlclass_t *rcrec;
 
-	line = rec->ptr;
+	line = (char const*) rec->ptr;
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->ha == rec->ha &&
 				xdl_recmatch(rcrec->line, rcrec->size,
-					rec->ptr, rec->size, cf->flags))
+					(const char*) rec->ptr, rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -159,7 +159,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 				goto abort;
 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
 				goto abort;
-			crec->ptr = prev;
+			crec->ptr = (u8 const*) prev;
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			recs[nrec++] = crec;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 8b8467360ecf..6e5f67ebf380 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,9 +39,9 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	char const *ptr;
-	long size;
-	unsigned long ha;
+	u8 const* ptr;
+	usize size;
+	u64 ha;
 } xrecord_t;
 
 typedef struct s_xdfile {
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 444a108f87c0..10e4f20b7c31 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -418,10 +418,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 
 	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
 	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
-		diff_env->xdf1.recs[line1 + count1 - 2]->size - subfile1.ptr;
+		diff_env->xdf1.recs[line1 + count1 - 2]->size - (u8 const*) subfile1.ptr;
 	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
 	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
-		diff_env->xdf2.recs[line2 + count2 - 2]->size - subfile2.ptr;
+		diff_env->xdf2.recs[line2 + count2 - 2]->size - (u8 const*) subfile2.ptr;
 	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
 		return -1;
 
-- 
gitgitgadget



* [PATCH v3 10/15] xdiff: use one definition for freeing xdfile_t
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (8 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 09/15] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 11/15] xdiff: replace chastore with an ivec in xdfile_t Ezekiel Newren via GitGitGadget
                       ` (4 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Simplify xdl_prepare_ctx() by using xdl_free_ctx() instead of using
local variables with hand-rolled memory management.
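
The general pattern can be sketched as follows, with hypothetical
names (ctx_sketch, ctx_prepare, ctx_free) rather than the real xdiff
ones: set every owned pointer to NULL up front, allocate directly into
the struct, and let one free function serve both the error path and
normal teardown. As a simplification, this sketch stores unshifted
base pointers so that the free function is safe to call at any point.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context owning two heap arrays. */
struct ctx_sketch {
	long *rindex;
	char *rchg;
	long nrec;
};

/* Single cleanup routine; free(NULL) is a no-op, so partial state is fine. */
void ctx_free(struct ctx_sketch *c)
{
	free(c->rindex);
	free(c->rchg);
	c->rindex = NULL;
	c->rchg = NULL;
}

/* fail_late simulates a failure after the first allocation succeeded. */
int ctx_prepare(struct ctx_sketch *c, long nrec, int fail_late)
{
	c->rindex = NULL;
	c->rchg = NULL;
	c->nrec = nrec;

	c->rchg = calloc((size_t) nrec + 2, sizeof(*c->rchg));
	if (!c->rchg)
		goto abort;
	if (fail_late)
		goto abort;
	c->rindex = malloc(((size_t) nrec + 1) * sizeof(*c->rindex));
	if (!c->rindex)
		goto abort;
	return 0;

abort:
	ctx_free(c);
	return -1;
}
```

Because the fields are NULL before the first allocation, the abort
label does not need to know how far initialization got.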

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 60 +++++++++++++++++++-----------------------------
 1 file changed, 24 insertions(+), 36 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 00cdf7d8a038..55e1cc308756 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -129,86 +129,74 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 }
 
 
+static void xdl_free_ctx(xdfile_t *xdf) {
+	xdl_free(xdf->rindex);
+	xdl_free(xdf->rchg - 1);
+	xdl_free(xdf->ha);
+	xdl_free(xdf->recs);
+	xdl_cha_free(&xdf->rcha);
+}
+
+
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
-	long nrec, bsize;
+	long bsize;
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
-	xrecord_t **recs;
-	unsigned long *ha;
-	char *rchg;
-	long *rindex;
 
-	ha = NULL;
-	rindex = NULL;
-	rchg = NULL;
-	recs = NULL;
+	xdf->ha = NULL;
+	xdf->rindex = NULL;
+	xdf->rchg = NULL;
+	xdf->recs = NULL;
+	xdf->nrec = 0;
 
 	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
 		goto abort;
-	if (!XDL_ALLOC_ARRAY(recs, narec))
+	if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
 		goto abort;
 
-	nrec = 0;
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
 		for (top = blk + bsize; cur < top; ) {
 			prev = cur;
 			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
 				goto abort;
 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
 				goto abort;
 			crec->ptr = (u8 const*) prev;
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
-			recs[nrec++] = crec;
+			xdf->recs[xdf->nrec++] = crec;
 			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
 		}
 	}
 
-	if (!XDL_CALLOC_ARRAY(rchg, nrec + 2))
+	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
 		goto abort;
 
 	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
 	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
-		if (!XDL_ALLOC_ARRAY(rindex, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
 			goto abort;
-		if (!XDL_ALLOC_ARRAY(ha, nrec + 1))
+		if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
 			goto abort;
 	}
 
-	xdf->nrec = nrec;
-	xdf->recs = recs;
-	xdf->rchg = rchg + 1;
-	xdf->rindex = rindex;
+	xdf->rchg += 1;
 	xdf->nreff = 0;
-	xdf->ha = ha;
 	xdf->dstart = 0;
-	xdf->dend = nrec - 1;
+	xdf->dend = xdf->nrec - 1;
 
 	return 0;
 
 abort:
-	xdl_free(ha);
-	xdl_free(rindex);
-	xdl_free(rchg);
-	xdl_free(recs);
-	xdl_cha_free(&xdf->rcha);
+	xdl_free_ctx(xdf);
 	return -1;
 }
 
 
-static void xdl_free_ctx(xdfile_t *xdf) {
-	xdl_free(xdf->rindex);
-	xdl_free(xdf->rchg - 1);
-	xdl_free(xdf->ha);
-	xdl_free(xdf->recs);
-	xdl_cha_free(&xdf->rcha);
-}
-
-
 void xdl_free_env(xdfenv_t *xe) {
 
 	xdl_free_ctx(&xe->xdf2);
-- 
gitgitgadget



* [PATCH v3 11/15] xdiff: replace chastore with an ivec in xdfile_t
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (9 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 10/15] xdiff: use one definition for freeing xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 12/15] xdiff: delete nrec field from xdfile_t Ezekiel Newren via GitGitGadget
                       ` (3 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

xdfile_t currently uses a chastore, which functions as a memory pool,
and a vector which maps to the allocations created by the chastore. It
seems like xrecord_t used to be a linked list until the recs and nrec
fields were added. I think that xrecord_t.next was meant to be removed,
but was overlooked. This dual data structure setup makes the code
somewhat confusing.

Additionally, the C type chastore_t isn't FFI friendly. While it could
be implemented in Rust, since the data structure is confusing anyway,
replace it with an ivec, whose purpose is to be interoperable. This
makes the fields nrec and recs in xdfile_t redundant; they will be
removed in the next 2 commits.
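
A rough sketch of the idea, using a hypothetical rec_vec type instead
of the real ivec API: records are stored by value in one growable
array, and the array of record pointers is built only after the vector
has stopped growing, because realloc() may move the storage and
invalidate earlier element addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record and vector types; not the series' ivec. */
struct rec {
	const unsigned char *ptr;
	size_t size;
};

struct rec_vec {
	struct rec *ptr;
	size_t length;
	size_t capacity;
};

int rec_vec_push(struct rec_vec *v, struct rec value)
{
	if (v->length == v->capacity) {
		size_t cap = v->capacity ? v->capacity * 2 : 8;
		struct rec *p = realloc(v->ptr, cap * sizeof(*p));

		if (!p)
			return -1;
		v->ptr = p; /* storage may have moved */
		v->capacity = cap;
	}
	v->ptr[v->length++] = value;
	return 0;
}

/* Take element addresses only once no more pushes will happen. */
struct rec **rec_vec_index(struct rec_vec *v)
{
	struct rec **recs = malloc((v->length ? v->length : 1) * sizeof(*recs));

	if (!recs)
		return NULL;
	for (size_t i = 0; i < v->length; i++)
		recs[i] = &v->ptr[i];
	return recs;
}

/* Split a buffer into lines, then index them; returns the line count. */
int demo_rec_vec(void)
{
	static const unsigned char text[] = "one\ntwo\nthree\n";
	const unsigned char *cur = text, *top = text + sizeof(text) - 1;
	struct rec_vec v = { NULL, 0, 0 };
	struct rec **recs;
	int ok;

	while (cur < top) {
		const unsigned char *prev = cur;
		struct rec r;

		while (cur < top && *cur++ != '\n')
			; /* advance just past the newline */
		r.ptr = prev;
		r.size = (size_t) (cur - prev);
		if (rec_vec_push(&v, r) < 0) {
			free(v.ptr);
			return -1;
		}
	}
	recs = rec_vec_index(&v);
	ok = recs && v.length == 3 && recs[2]->size == 6 && recs[0]->ptr == text;
	free(recs);
	free(v.ptr);
	return ok ? (int) v.length : -1;
}
```

This mirrors the two-phase shape of the patch: push every record
first, then walk the finished vector to fill the pointer array.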

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xprepare.c | 34 +++++++++++++++++-----------------
 xdiff/xtypes.h   |  6 ++++--
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 55e1cc308756..3b33186c15a3 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -130,11 +130,11 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 
 static void xdl_free_ctx(xdfile_t *xdf) {
+	ivec_free(&xdf->record);
 	xdl_free(xdf->rindex);
 	xdl_free(xdf->rchg - 1);
 	xdl_free(xdf->ha);
 	xdl_free(xdf->recs);
-	xdl_cha_free(&xdf->rcha);
 }
 
 
@@ -143,35 +143,35 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	long bsize;
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
-	xrecord_t *crec;
 
 	xdf->ha = NULL;
 	xdf->rindex = NULL;
 	xdf->rchg = NULL;
 	xdf->recs = NULL;
 	xdf->nrec = 0;
-
-	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0)
-		goto abort;
-	if (!XDL_ALLOC_ARRAY(xdf->recs, narec))
-		goto abort;
+	IVEC_INIT(xdf->record);
 
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
 		for (top = blk + bsize; cur < top; ) {
+			xrecord_t crec;
 			prev = cur;
 			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
-				goto abort;
-			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
-				goto abort;
-			crec->ptr = (u8 const*) prev;
-			crec->size = (long) (cur - prev);
-			crec->ha = hav;
-			xdf->recs[xdf->nrec++] = crec;
-			if (xdl_classify_record(pass, cf, crec) < 0)
-				goto abort;
+			crec.ptr = (u8 const*) prev;
+			crec.size = cur - prev;
+			crec.ha = hav;
+			ivec_push(&xdf->record, &crec);
 		}
 	}
+	ivec_shrink_to_fit(&xdf->record);
+
+	xdf->nrec = (long) xdf->record.length;
+	if (!XDL_ALLOC_ARRAY(xdf->recs, xdf->record.length))
+		goto abort;
+	for (usize i = 0; i < xdf->record.length; i++) {
+		if (xdl_classify_record(pass, cf, &xdf->record.ptr[i]) < 0)
+			goto abort;
+		xdf->recs[i] = &xdf->record.ptr[i];
+	}
 
 	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
 		goto abort;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 6e5f67ebf380..5028a70b2675 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -23,7 +23,7 @@
 #if !defined(XTYPES_H)
 #define XTYPES_H
 
-
+#include "../interop/ivec.h"
 
 typedef struct s_chanode {
 	struct s_chanode *next;
@@ -44,8 +44,10 @@ typedef struct s_xrecord {
 	u64 ha;
 } xrecord_t;
 
+DEFINE_IVEC_TYPE(xrecord_t, xrecord);
+
 typedef struct s_xdfile {
-	chastore_t rcha;
+	struct ivec_xrecord record;
 	long nrec;
 	long dstart, dend;
 	xrecord_t **recs;
-- 
gitgitgadget



* [PATCH v3 12/15] xdiff: delete nrec field from xdfile_t
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (10 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 11/15] xdiff: replace chastore with an ivec in xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 13/15] xdiff: delete recs " Ezekiel Newren via GitGitGadget
                       ` (2 subsequent siblings)
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Because of the data structure cleanup in the previous commit, the nrec
field is no longer necessary. Use record.length in place of nrec.
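
The (long) and (isize) casts in the diff below matter because
record.length appears to be an unsigned usize, whereas nrec was a
signed long. A minimal sketch of the difference, assuming usize behaves
like size_t; the helper names are hypothetical and not part of xdiff:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mimics what happens when a signed index is compared with an unsigned
 * length without casting the length: on common platforms the index is
 * converted to the unsigned type, so -1 becomes a huge value. The
 * conversion is written out explicitly here to keep the result the
 * same everywhere.
 */
static int below_length_unsigned(long i, size_t length)
{
	return (size_t)i < length;
}

/* The form used in the patch: convert the length, compare as signed. */
static int below_length_signed(long i, size_t length)
{
	return i < (long)length;
}

/* Index of the last element, or -1 for an empty vector. */
static long last_index(size_t length)
{
	return (long)length - 1;
}
```

With an index of -1 and a length of 3, the unsigned comparison reports
that the index is not below the length, while the signed comparison
reports that it is. The cast therefore preserves the behaviour the code
had when nrec was a long, as long as the length fits in a long.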

This commit is best viewed with --color-words.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xdiffi.c    | 10 +++++-----
 xdiff/xemit.c     | 20 ++++++++++----------
 xdiff/xmerge.c    | 10 +++++-----
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  | 34 ++++++++++++++++------------------
 xdiff/xtypes.h    |  1 -
 6 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 3b364c61f671..bcab2d7ae516 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -496,7 +496,7 @@ static void measure_split(const xdfile_t *xdf, long split,
 {
 	long i;
 
-	if (split >= xdf->nrec) {
+	if (split >= (long) xdf->record.length) {
 		m->end_of_file = 1;
 		m->indent = -1;
 	} else {
@@ -519,7 +519,7 @@ static void measure_split(const xdfile_t *xdf, long split,
 
 	m->post_blank = 0;
 	m->post_indent = -1;
-	for (i = split + 1; i < xdf->nrec; i++) {
+	for (i = split + 1; i < (long) xdf->record.length; i++) {
 		m->post_indent = get_indent(xdf->recs[i]);
 		if (m->post_indent != -1)
 			break;
@@ -730,7 +730,7 @@ static void group_init(xdfile_t *xdf, struct xdlgroup *g)
  */
 static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
 {
-	if (g->end == xdf->nrec)
+	if (g->end == (long) xdf->record.length)
 		return -1;
 
 	g->start = g->end + 1;
@@ -763,7 +763,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
  */
 static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 {
-	if (g->end < xdf->nrec &&
+	if (g->end < (long) xdf->record.length &&
 	    recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
 		xdf->rchg[g->start++] = 0;
 		xdf->rchg[g->end++] = 1;
@@ -950,7 +950,7 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
 	/*
 	 * Trivial. Collects "groups" of changes and creates an edit script.
 	 */
-	for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
+	for (i1 = xe->xdf1.record.length, i2 = xe->xdf2.record.length; i1 >= 0 || i2 >= 0; i1--, i2--)
 		if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
 			for (l1 = i1; rchg1[i1 - 1]; i1--);
 			for (l2 = i2; rchg2[i2 - 1]; i2--);
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index bbf7b7f8c862..11c2823ecab5 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -147,7 +147,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
 	buf = func_line ? func_line->buf : dummy;
 	size = func_line ? sizeof(func_line->buf) : sizeof(dummy);
 
-	for (l = start; l != limit && 0 <= l && l < xe->xdf1.nrec; l += step) {
+	for (l = start; l != limit && 0 <= l && l < (long) xe->xdf1.record.length; l += step) {
 		long len = match_func_rec(&xe->xdf1, xecfg, l, buf, size);
 		if (len >= 0) {
 			if (func_line)
@@ -191,14 +191,14 @@ pre_context_calculation:
 			long fs1, i1 = xch->i1;
 
 			/* Appended chunk? */
-			if (i1 >= xe->xdf1.nrec) {
+			if (i1 >= (long) xe->xdf1.record.length) {
 				long i2 = xch->i2;
 
 				/*
 				 * We don't need additional context if
 				 * a whole function was added.
 				 */
-				while (i2 < xe->xdf2.nrec) {
+				while (i2 < (long) xe->xdf2.record.length) {
 					if (is_func_rec(&xe->xdf2, xecfg, i2))
 						goto post_context_calculation;
 					i2++;
@@ -208,7 +208,7 @@ pre_context_calculation:
 				 * Otherwise get more context from the
 				 * pre-image.
 				 */
-				i1 = xe->xdf1.nrec - 1;
+				i1 = xe->xdf1.record.length - 1;
 			}
 
 			fs1 = get_func_line(xe, xecfg, NULL, i1, -1);
@@ -240,8 +240,8 @@ pre_context_calculation:
 
  post_context_calculation:
 		lctx = xecfg->ctxlen;
-		lctx = XDL_MIN(lctx, xe->xdf1.nrec - (xche->i1 + xche->chg1));
-		lctx = XDL_MIN(lctx, xe->xdf2.nrec - (xche->i2 + xche->chg2));
+		lctx = XDL_MIN(lctx, (long) xe->xdf1.record.length - (xche->i1 + xche->chg1));
+		lctx = XDL_MIN(lctx, (long) xe->xdf2.record.length - (xche->i2 + xche->chg2));
 
 		e1 = xche->i1 + xche->chg1 + lctx;
 		e2 = xche->i2 + xche->chg2 + lctx;
@@ -249,13 +249,13 @@ pre_context_calculation:
 		if (xecfg->flags & XDL_EMIT_FUNCCONTEXT) {
 			long fe1 = get_func_line(xe, xecfg, NULL,
 						 xche->i1 + xche->chg1,
-						 xe->xdf1.nrec);
+						 xe->xdf1.record.length);
 			while (fe1 > 0 && is_empty_rec(&xe->xdf1, fe1 - 1))
 				fe1--;
 			if (fe1 < 0)
-				fe1 = xe->xdf1.nrec;
+				fe1 = xe->xdf1.record.length;
 			if (fe1 > e1) {
-				e2 = XDL_MIN(e2 + (fe1 - e1), xe->xdf2.nrec);
+				e2 = XDL_MIN(e2 + (fe1 - e1), (long) xe->xdf2.record.length);
 				e1 = fe1;
 			}
 
@@ -266,7 +266,7 @@ pre_context_calculation:
 			 */
 			if (xche->next) {
 				long l = XDL_MIN(xche->next->i1,
-						 xe->xdf1.nrec - 1);
+						 (long) xe->xdf1.record.length - 1);
 				if (l - xecfg->ctxlen <= e1 ||
 				    get_func_line(xe, xecfg, NULL, l, e1) < 0) {
 					xche = xche->next;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 6fa6ea61a208..f48549605d09 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -158,11 +158,11 @@ static int is_eol_crlf(xdfile_t *file, int i)
 {
 	long size;
 
-	if (i < file->nrec - 1)
+	if (i < (isize) file->record.length - 1)
 		/* All lines before the last *must* end in LF */
 		return (size = file->recs[i]->size) > 1 &&
 			file->recs[i]->ptr[size - 2] == '\r';
-	if (!file->nrec)
+	if (!file->record.length)
 		/* Cannot determine eol style from empty file */
 		return -1;
 	if ((size = file->recs[i]->size) &&
@@ -317,7 +317,7 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 			continue;
 		i = m->i1 + m->chg1;
 	}
-	size += xdl_recs_copy(xe1, i, xe1->xdf2.nrec - i, 0, 0,
+	size += xdl_recs_copy(xe1, i, (int) xe1->xdf2.record.length - i, 0, 0,
 			      dest ? dest + size : NULL);
 	return size;
 }
@@ -622,7 +622,7 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 			changes = c;
 		i0 = xscr1->i1;
 		i1 = xscr1->i2;
-		i2 = xscr1->i1 + xe2->xdf2.nrec - xe2->xdf1.nrec;
+		i2 = xscr1->i1 + xe2->xdf2.record.length - xe2->xdf1.record.length;
 		chg0 = xscr1->chg1;
 		chg1 = xscr1->chg2;
 		chg2 = xscr1->chg1;
@@ -637,7 +637,7 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 		if (!changes)
 			changes = c;
 		i0 = xscr2->i1;
-		i1 = xscr2->i1 + xe1->xdf2.nrec - xe1->xdf1.nrec;
+		i1 = xscr2->i1 + xe1->xdf2.record.length - xe1->xdf1.record.length;
 		i2 = xscr2->i2;
 		chg0 = xscr2->chg1;
 		chg1 = xscr2->chg1;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 986a3a3f749a..e1ce9a399fbf 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -370,5 +370,5 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
 
 int xdl_do_patience_diff(xpparam_t const *xpp, xdfenv_t *env)
 {
-	return patience_diff(xpp, env, 1, env->xdf1.nrec, 1, env->xdf2.nrec);
+	return patience_diff(xpp, env, 1, env->xdf1.record.length, 1, env->xdf2.record.length);
 }
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 3b33186c15a3..9b46523afe97 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -138,7 +138,7 @@ static void xdl_free_ctx(xdfile_t *xdf) {
 }
 
 
-static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
+static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
 	long bsize;
 	unsigned long hav;
@@ -148,7 +148,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	xdf->rindex = NULL;
 	xdf->rchg = NULL;
 	xdf->recs = NULL;
-	xdf->nrec = 0;
 	IVEC_INIT(xdf->record);
 
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
@@ -164,7 +163,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	}
 	ivec_shrink_to_fit(&xdf->record);
 
-	xdf->nrec = (long) xdf->record.length;
 	if (!XDL_ALLOC_ARRAY(xdf->recs, xdf->record.length))
 		goto abort;
 	for (usize i = 0; i < xdf->record.length; i++) {
@@ -173,21 +171,21 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 		xdf->recs[i] = &xdf->record.ptr[i];
 	}
 
-	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->nrec + 2))
+	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->record.length + 2))
 		goto abort;
 
 	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
 	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
-		if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
+		if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->record.length + 1))
 			goto abort;
-		if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->nrec + 1))
+		if (!XDL_ALLOC_ARRAY(xdf->ha, xdf->record.length + 1))
 			goto abort;
 	}
 
 	xdf->rchg += 1;
 	xdf->nreff = 0;
 	xdf->dstart = 0;
-	xdf->dend = xdf->nrec - 1;
+	xdf->dend = xdf->record.length - 1;
 
 	return 0;
 
@@ -274,12 +272,12 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	char *dis, *dis1, *dis2;
 	int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
 
-	if (!XDL_CALLOC_ARRAY(dis, xdf1->nrec + xdf2->nrec + 2))
+	if (!XDL_CALLOC_ARRAY(dis, xdf1->record.length + xdf2->record.length + 2))
 		return -1;
 	dis1 = dis;
-	dis2 = dis1 + xdf1->nrec + 1;
+	dis2 = dis1 + xdf1->record.length + 1;
 
-	if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
+	if ((mlim = xdl_bogosqrt(xdf1->record.length)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
 		rcrec = cf->rcrecs[(*recs)->ha];
@@ -287,7 +285,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 		dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
 	}
 
-	if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
+	if ((mlim = xdl_bogosqrt(xdf2->record.length)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
 		rcrec = cf->rcrecs[(*recs)->ha];
@@ -334,21 +332,21 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 
 	recs1 = xdf1->recs;
 	recs2 = xdf2->recs;
-	for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
+	for (i = 0, lim = XDL_MIN(xdf1->record.length, xdf2->record.length); i < lim;
 	     i++, recs1++, recs2++)
 		if ((*recs1)->ha != (*recs2)->ha)
 			break;
 
 	xdf1->dstart = xdf2->dstart = i;
 
-	recs1 = xdf1->recs + xdf1->nrec - 1;
-	recs2 = xdf2->recs + xdf2->nrec - 1;
+	recs1 = xdf1->recs + xdf1->record.length - 1;
+	recs2 = xdf2->recs + xdf2->record.length - 1;
 	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
 		if ((*recs1)->ha != (*recs2)->ha)
 			break;
 
-	xdf1->dend = xdf1->nrec - i - 1;
-	xdf2->dend = xdf2->nrec - i - 1;
+	xdf1->dend = xdf1->record.length - i - 1;
+	xdf2->dend = xdf2->record.length - i - 1;
 
 	return 0;
 }
@@ -388,12 +386,12 @@ int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
 	if (xdl_init_classifier(&cf, enl1 + enl2 + 1, xpp->flags) < 0)
 		return -1;
 
-	if (xdl_prepare_ctx(1, mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+	if (xdl_prepare_ctx(1, mf1, xpp, &cf, &xe->xdf1) < 0) {
 
 		xdl_free_classifier(&cf);
 		return -1;
 	}
-	if (xdl_prepare_ctx(2, mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+	if (xdl_prepare_ctx(2, mf2, xpp, &cf, &xe->xdf2) < 0) {
 
 		xdl_free_ctx(&xe->xdf1);
 		xdl_free_classifier(&cf);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 5028a70b2675..c322e62fbf06 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,7 +48,6 @@ DEFINE_IVEC_TYPE(xrecord_t, xrecord);
 
 typedef struct s_xdfile {
 	struct ivec_xrecord record;
-	long nrec;
 	long dstart, dend;
 	xrecord_t **recs;
 	char *rchg;
-- 
gitgitgadget



* [PATCH v3 13/15] xdiff: delete recs field from xdfile_t
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (11 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 12/15] xdiff: delete nrec field from xdfile_t Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 14/15] xdiff: make xdfile_t more rust friendly Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 15/15] xdiff: implement xdl_trim_ends() in Rust Ezekiel Newren via GitGitGadget
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Because of the change from chastore to ivec a few commits ago,
recs now points to record's elements in a 1:1 mapping.
Since both recs and record are vectors, this additional mapping is
superfluous. Remove recs.
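
To make the mechanical nature of the change concrete, the sketch below
contrasts access through an array of pointers (how recs was used) with
direct indexing of a by-value array (how record.ptr is used). The
struct and function names are hypothetical stand-ins, not the xdiff
definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for xrecord_t. */
struct rec {
	const char *ptr;
	long size;
};

/* Before: an array of pointers, one extra dereference per record. */
static long total_size_indirect(struct rec **recs, size_t n)
{
	long total = 0;

	for (size_t i = 0; i < n; i++)
		total += recs[i]->size;
	return total;
}

/* After: records stored by value and indexed directly. */
static long total_size_direct(const struct rec *recs, size_t n)
{
	long total = 0;

	for (size_t i = 0; i < n; i++)
		total += recs[i].size;
	return total;
}
```

When every pointer in the first array simply points at the element with
the same index in the second, both functions return the same result.
Call sites that passed recs[i] to a function taking an xrecord_t * now
pass &record.ptr[i] instead.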

This commit is best viewed with --color-words.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xdiffi.c     | 30 ++++++++++++------------
 xdiff/xemit.c      |  4 ++--
 xdiff/xhistogram.c |  2 +-
 xdiff/xmerge.c     | 58 +++++++++++++++++++++++-----------------------
 xdiff/xpatience.c  | 14 +++++------
 xdiff/xprepare.c   | 37 +++++++++++++----------------
 xdiff/xtypes.h     |  1 -
 xdiff/xutils.c     | 12 +++++-----
 8 files changed, 76 insertions(+), 82 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index bcab2d7ae516..ebdb72432261 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -501,13 +501,13 @@ static void measure_split(const xdfile_t *xdf, long split,
 		m->indent = -1;
 	} else {
 		m->end_of_file = 0;
-		m->indent = get_indent(xdf->recs[split]);
+		m->indent = get_indent(&xdf->record.ptr[split]);
 	}
 
 	m->pre_blank = 0;
 	m->pre_indent = -1;
 	for (i = split - 1; i >= 0; i--) {
-		m->pre_indent = get_indent(xdf->recs[i]);
+		m->pre_indent = get_indent(&xdf->record.ptr[i]);
 		if (m->pre_indent != -1)
 			break;
 		m->pre_blank += 1;
@@ -520,7 +520,7 @@ static void measure_split(const xdfile_t *xdf, long split,
 	m->post_blank = 0;
 	m->post_indent = -1;
 	for (i = split + 1; i < (long) xdf->record.length; i++) {
-		m->post_indent = get_indent(xdf->recs[i]);
+		m->post_indent = get_indent(&xdf->record.ptr[i]);
 		if (m->post_indent != -1)
 			break;
 		m->post_blank += 1;
@@ -764,7 +764,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
 static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 {
 	if (g->end < (long) xdf->record.length &&
-	    recs_match(xdf->recs[g->start], xdf->recs[g->end])) {
+	    recs_match(&xdf->record.ptr[g->start], &xdf->record.ptr[g->end])) {
 		xdf->rchg[g->start++] = 0;
 		xdf->rchg[g->end++] = 1;
 
@@ -785,7 +785,7 @@ static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 static int group_slide_up(xdfile_t *xdf, struct xdlgroup *g)
 {
 	if (g->start > 0 &&
-	    recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1])) {
+	    recs_match(&xdf->record.ptr[g->start - 1], &xdf->record.ptr[g->end - 1])) {
 		xdf->rchg[--g->start] = 1;
 		xdf->rchg[--g->end] = 0;
 
@@ -1000,16 +1000,16 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 	for (xch = xscr; xch; xch = xch->next) {
 		int ignore = 1;
-		xrecord_t **rec;
+		xrecord_t *rec;
 		long i;
 
-		rec = &xe->xdf1.recs[xch->i1];
+		rec = &xe->xdf1.record.ptr[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline((const char*) rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*) rec[i].ptr, rec[i].size, flags);
 
-		rec = &xe->xdf2.recs[xch->i2];
+		rec = &xe->xdf2.record.ptr[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline((const char*)rec[i]->ptr, rec[i]->size, flags);
+			ignore = xdl_blankline((const char*)rec[i].ptr, rec[i].size, flags);
 
 		xch->ignore = ignore;
 	}
@@ -1033,7 +1033,7 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
 	xdchange_t *xch;
 
 	for (xch = xscr; xch; xch = xch->next) {
-		xrecord_t **rec;
+		xrecord_t *rec;
 		int ignore = 1;
 		long i;
 
@@ -1043,13 +1043,13 @@ static void xdl_mark_ignorable_regex(xdchange_t *xscr, const xdfenv_t *xe,
 		if (xch->ignore)
 			continue;
 
-		rec = &xe->xdf1.recs[xch->i1];
+		rec = &xe->xdf1.record.ptr[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = record_matches_regex(rec[i], xpp);
+			ignore = record_matches_regex(&rec[i], xpp);
 
-		rec = &xe->xdf2.recs[xch->i2];
+		rec = &xe->xdf2.record.ptr[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = record_matches_regex(rec[i], xpp);
+			ignore = record_matches_regex(&rec[i], xpp);
 
 		xch->ignore = ignore;
 	}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 11c2823ecab5..0c9a12a5e828 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -24,9 +24,9 @@
 
 static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
 
-	*rec = (char const*) xdf->recs[ri]->ptr;
+	*rec = (char const*) xdf->record.ptr[ri].ptr;
 
-	return xdf->recs[ri]->size;
+	return xdf->record.ptr[ri].size;
 }
 
 
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 040d81e0bc9f..643d1c8b7071 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -86,7 +86,7 @@ struct region {
 	((LINE_MAP(index, ptr))->cnt)
 
 #define REC(env, s, l) \
-	(env->xdf##s.recs[l - 1])
+	(&env->xdf##s.record.ptr[l - 1])
 
 static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
 {
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index f48549605d09..0a3e0f28ab84 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -97,12 +97,12 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 		int line_count, long flags)
 {
 	int i;
-	xrecord_t **rec1 = xe1->xdf2.recs + i1;
-	xrecord_t **rec2 = xe2->xdf2.recs + i2;
+	xrecord_t *rec1 = xe1->xdf2.record.ptr + i1;
+	xrecord_t *rec2 = xe2->xdf2.record.ptr + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch((const char*) rec1[i]->ptr, rec1[i]->size,
-			(const char*) rec2[i]->ptr, rec2[i]->size, flags);
+		int result = xdl_recmatch((const char*) rec1[i].ptr, rec1[i].size,
+			(const char*) rec2[i].ptr, rec2[i].size, flags);
 		if (!result)
 			return -1;
 	}
@@ -111,20 +111,20 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 
 static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int needs_cr, int add_nl, char *dest)
 {
-	xrecord_t **recs;
+	xrecord_t *recs;
 	int size = 0;
 
-	recs = (use_orig ? xe->xdf1.recs : xe->xdf2.recs) + i;
+	recs = (use_orig ? xe->xdf1.record.ptr : xe->xdf2.record.ptr) + i;
 
 	if (count < 1)
 		return 0;
 
-	for (i = 0; i < count; size += recs[i++]->size)
+	for (i = 0; i < count; size += recs[i++].size)
 		if (dest)
-			memcpy(dest + size, recs[i]->ptr, recs[i]->size);
+			memcpy(dest + size, recs[i].ptr, recs[i].size);
 	if (add_nl) {
-		i = recs[count - 1]->size;
-		if (i == 0 || recs[count - 1]->ptr[i - 1] != '\n') {
+		i = recs[count - 1].size;
+		if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
 			if (needs_cr) {
 				if (dest)
 					dest[size] = '\r';
@@ -160,22 +160,22 @@ static int is_eol_crlf(xdfile_t *file, int i)
 
 	if (i < (isize) file->record.length - 1)
 		/* All lines before the last *must* end in LF */
-		return (size = file->recs[i]->size) > 1 &&
-			file->recs[i]->ptr[size - 2] == '\r';
+		return (size = file->record.ptr[i].size) > 1 &&
+			file->record.ptr[i].ptr[size - 2] == '\r';
 	if (!file->record.length)
 		/* Cannot determine eol style from empty file */
 		return -1;
-	if ((size = file->recs[i]->size) &&
-			file->recs[i]->ptr[size - 1] == '\n')
+	if ((size = file->record.ptr[i].size) &&
+			file->record.ptr[i].ptr[size - 1] == '\n')
 		/* Last line; ends in LF; Is it CR/LF? */
 		return size > 1 &&
-			file->recs[i]->ptr[size - 2] == '\r';
+			file->record.ptr[i].ptr[size - 2] == '\r';
 	if (!i)
 		/* The only line has no eol */
 		return -1;
 	/* Determine eol from second-to-last line */
-	return (size = file->recs[i - 1]->size) > 1 &&
-		file->recs[i - 1]->ptr[size - 2] == '\r';
+	return (size = file->record.ptr[i - 1].size) > 1 &&
+		file->record.ptr[i - 1].ptr[size - 2] == '\r';
 }
 
 static int is_cr_needed(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m)
@@ -334,22 +334,22 @@ static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 static void xdl_refine_zdiff3_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		xpparam_t const *xpp)
 {
-	xrecord_t **rec1 = xe1->xdf2.recs, **rec2 = xe2->xdf2.recs;
+	xrecord_t *rec1 = xe1->xdf2.record.ptr, *rec2 = xe2->xdf2.record.ptr;
 	for (; m; m = m->next) {
 		/* let's handle just the conflicts */
 		if (m->mode)
 			continue;
 
 		while(m->chg1 && m->chg2 &&
-		      recmatch(rec1[m->i1], rec2[m->i2], xpp->flags)) {
+		      recmatch(&rec1[m->i1], &rec2[m->i2], xpp->flags)) {
 			m->chg1--;
 			m->chg2--;
 			m->i1++;
 			m->i2++;
 		}
 		while (m->chg1 && m->chg2 &&
-		       recmatch(rec1[m->i1 + m->chg1 - 1],
-				rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
+		       recmatch(&rec1[m->i1 + m->chg1 - 1],
+				&rec2[m->i2 + m->chg2 - 1], xpp->flags)) {
 			m->chg1--;
 			m->chg2--;
 		}
@@ -381,12 +381,12 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		 * This probably does not work outside git, since
 		 * we have a very simple mmfile structure.
 		 */
-		t1.ptr = (char *)xe1->xdf2.recs[m->i1]->ptr;
-		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1]->ptr
-			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1]->size - (u8 const*) t1.ptr;
-		t2.ptr = (char *)xe2->xdf2.recs[m->i2]->ptr;
-		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1]->ptr
-			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1]->size - (u8 const*) t2.ptr;
+		t1.ptr = (char *)xe1->xdf2.record.ptr[m->i1].ptr;
+		t1.size = xe1->xdf2.record.ptr[m->i1 + m->chg1 - 1].ptr
+			+ xe1->xdf2.record.ptr[m->i1 + m->chg1 - 1].size - (u8 const*) t1.ptr;
+		t2.ptr = (char *)xe2->xdf2.record.ptr[m->i2].ptr;
+		t2.size = xe2->xdf2.record.ptr[m->i2 + m->chg2 - 1].ptr
+			+ xe2->xdf2.record.ptr[m->i2 + m->chg2 - 1].size - (u8 const*) t2.ptr;
 		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
 			return -1;
 		if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
@@ -440,8 +440,8 @@ static int line_contains_alnum(const char *ptr, long size)
 static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
-		if (line_contains_alnum((char const*) xe->xdf2.recs[i]->ptr,
-				xe->xdf2.recs[i]->size))
+		if (line_contains_alnum((char const*) xe->xdf2.record.ptr[i].ptr,
+				xe->xdf2.record.ptr[i].size))
 			return 1;
 	return 0;
 }
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index e1ce9a399fbf..31b819ec58f0 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -88,9 +88,9 @@ static int is_anchor(xpparam_t const *xpp, const char *line)
 static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 			  int pass)
 {
-	xrecord_t **records = pass == 1 ?
-		map->env->xdf1.recs : map->env->xdf2.recs;
-	xrecord_t *record = records[line - 1];
+	xrecord_t *records = pass == 1 ?
+		map->env->xdf1.record.ptr : map->env->xdf2.record.ptr;
+	xrecord_t *record = &records[line - 1];
 	/*
 	 * After xdl_prepare_env() (or more precisely, due to
 	 * xdl_classify_record()), the "ha" member of the records (AKA lines)
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 		return;
 	map->entries[index].line1 = line;
 	map->entries[index].hash = record->ha;
-	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.recs[line - 1]->ptr);
+	map->entries[index].anchor = is_anchor(xpp, (const char*) map->env->xdf1.record.ptr[line - 1].ptr);
 	if (!map->first)
 		map->first = map->entries + index;
 	if (map->last) {
@@ -246,9 +246,9 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
 
 static int match(struct hashmap *map, int line1, int line2)
 {
-	xrecord_t *record1 = map->env->xdf1.recs[line1 - 1];
-	xrecord_t *record2 = map->env->xdf2.recs[line2 - 1];
-	return record1->ha == record2->ha;
+	u64 mph1 = map->env->xdf1.record.ptr[line1 - 1].ha;
+	u64 mph2 = map->env->xdf2.record.ptr[line2 - 1].ha;
+	return mph1 == mph2;
 }
 
 static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 9b46523afe97..93370f1c6db4 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -134,7 +134,6 @@ static void xdl_free_ctx(xdfile_t *xdf) {
 	xdl_free(xdf->rindex);
 	xdl_free(xdf->rchg - 1);
 	xdl_free(xdf->ha);
-	xdl_free(xdf->recs);
 }
 
 
@@ -147,7 +146,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, xpparam_t const *xpp
 	xdf->ha = NULL;
 	xdf->rindex = NULL;
 	xdf->rchg = NULL;
-	xdf->recs = NULL;
 	IVEC_INIT(xdf->record);
 
 	if ((cur = blk = xdl_mmfile_first(mf, &bsize))) {
@@ -163,12 +161,9 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, xpparam_t const *xpp
 	}
 	ivec_shrink_to_fit(&xdf->record);
 
-	if (!XDL_ALLOC_ARRAY(xdf->recs, xdf->record.length))
-		goto abort;
 	for (usize i = 0; i < xdf->record.length; i++) {
 		if (xdl_classify_record(pass, cf, &xdf->record.ptr[i]) < 0)
 			goto abort;
-		xdf->recs[i] = &xdf->record.ptr[i];
 	}
 
 	if (!XDL_CALLOC_ARRAY(xdf->rchg, xdf->record.length + 2))
@@ -267,7 +262,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
  */
 static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
 	long i, nm, nreff, mlim;
-	xrecord_t **recs;
+	xrecord_t *recs;
 	xdlclass_t *rcrec;
 	char *dis, *dis1, *dis2;
 	int need_min = !!(cf->flags & XDF_NEED_MINIMAL);
@@ -279,38 +274,38 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 
 	if ((mlim = xdl_bogosqrt(xdf1->record.length)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
-	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
-		rcrec = cf->rcrecs[(*recs)->ha];
+	for (i = xdf1->dstart, recs = &xdf1->record.ptr[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
+		rcrec = cf->rcrecs[recs->ha];
 		nm = rcrec ? rcrec->len2 : 0;
 		dis1[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
 	}
 
 	if ((mlim = xdl_bogosqrt(xdf2->record.length)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
-	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
-		rcrec = cf->rcrecs[(*recs)->ha];
+	for (i = xdf2->dstart, recs = &xdf2->record.ptr[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
+		rcrec = cf->rcrecs[recs->ha];
 		nm = rcrec ? rcrec->len1 : 0;
 		dis2[i] = (nm == 0) ? 0: (nm >= mlim && !need_min) ? 2: 1;
 	}
 
-	for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
+	for (nreff = 0, i = xdf1->dstart, recs = &xdf1->record.ptr[xdf1->dstart];
 	     i <= xdf1->dend; i++, recs++) {
 		if (dis1[i] == 1 ||
 		    (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
 			xdf1->rindex[nreff] = i;
-			xdf1->ha[nreff] = (*recs)->ha;
+			xdf1->ha[nreff] = recs->ha;
 			nreff++;
 		} else
 			xdf1->rchg[i] = 1;
 	}
 	xdf1->nreff = nreff;
 
-	for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
+	for (nreff = 0, i = xdf2->dstart, recs = &xdf2->record.ptr[xdf2->dstart];
 	     i <= xdf2->dend; i++, recs++) {
 		if (dis2[i] == 1 ||
 		    (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
 			xdf2->rindex[nreff] = i;
-			xdf2->ha[nreff] = (*recs)->ha;
+			xdf2->ha[nreff] = recs->ha;
 			nreff++;
 		} else
 			xdf2->rchg[i] = 1;
@@ -328,21 +323,21 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
  */
 static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 	long i, lim;
-	xrecord_t **recs1, **recs2;
+	xrecord_t *recs1, *recs2;
 
-	recs1 = xdf1->recs;
-	recs2 = xdf2->recs;
+	recs1 = xdf1->record.ptr;
+	recs2 = xdf2->record.ptr;
 	for (i = 0, lim = XDL_MIN(xdf1->record.length, xdf2->record.length); i < lim;
 	     i++, recs1++, recs2++)
-		if ((*recs1)->ha != (*recs2)->ha)
+		if (recs1->ha != recs2->ha)
 			break;
 
 	xdf1->dstart = xdf2->dstart = i;
 
-	recs1 = xdf1->recs + xdf1->record.length - 1;
-	recs2 = xdf2->recs + xdf2->record.length - 1;
+	recs1 = xdf1->record.ptr + xdf1->record.length - 1;
+	recs2 = xdf2->record.ptr + xdf2->record.length - 1;
 	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
-		if ((*recs1)->ha != (*recs2)->ha)
+		if (recs1->ha != recs2->ha)
 			break;
 
 	xdf1->dend = xdf1->record.length - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index c322e62fbf06..849f218b3277 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -49,7 +49,6 @@ DEFINE_IVEC_TYPE(xrecord_t, xrecord);
 typedef struct s_xdfile {
 	struct ivec_xrecord record;
 	long dstart, dend;
-	xrecord_t **recs;
 	char *rchg;
 	long *rindex;
 	long nreff;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 10e4f20b7c31..eed88ee6cbe2 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -416,12 +416,12 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 	mmfile_t subfile1, subfile2;
 	xdfenv_t env;
 
-	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1]->ptr;
-	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2]->ptr +
-		diff_env->xdf1.recs[line1 + count1 - 2]->size - (u8 const*) subfile1.ptr;
-	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1]->ptr;
-	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2]->ptr +
-		diff_env->xdf2.recs[line2 + count2 - 2]->size - (u8 const*) subfile2.ptr;
+	subfile1.ptr = (char *)diff_env->xdf1.record.ptr[line1 - 1].ptr;
+	subfile1.size = diff_env->xdf1.record.ptr[line1 + count1 - 2].ptr +
+		diff_env->xdf1.record.ptr[line1 + count1 - 2].size - (u8 const*) subfile1.ptr;
+	subfile2.ptr = (char *)diff_env->xdf2.record.ptr[line2 - 1].ptr;
+	subfile2.size = diff_env->xdf2.record.ptr[line2 + count2 - 2].ptr +
+		diff_env->xdf2.record.ptr[line2 + count2 - 2].size - (u8 const*) subfile2.ptr;
 	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
 		return -1;
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v3 14/15] xdiff: make xdfile_t more rust friendly
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (12 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 13/15] xdiff: delete recs " Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  2025-08-23  3:55     ` [PATCH v3 15/15] xdiff: implement xdl_trim_ends() in Rust Ezekiel Newren via GitGitGadget
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Convert the remaining ambiguous fields in xdfile_t from C types to Rust
types for interoperability.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 xdiff/xdiffi.c | 16 ++++++++--------
 xdiff/xdiffi.h |  8 ++++----
 xdiff/xtypes.h | 10 +++++-----
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index ebdb72432261..0509b4875996 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -42,8 +42,8 @@ typedef struct s_xdpsplit {
  * using this algorithm, so a little bit of heuristic is needed to cut the
  * search and to return a suboptimal point.
  */
-static long xdl_split(unsigned long const *ha1, long off1, long lim1,
-		      unsigned long const *ha2, long off2, long lim2,
+static long xdl_split(u64 const *ha1, long off1, long lim1,
+		      u64 const *ha2, long off2, long lim2,
 		      long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
 		      xdalgoenv_t *xenv) {
 	long dmin = off1 - lim2, dmax = lim1 - off2;
@@ -260,7 +260,7 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
 int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
 		 diffdata_t *dd2, long off2, long lim2,
 		 long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
-	unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+	u64 const *ha1 = dd1->ha, *ha2 = dd2->ha;
 
 	/*
 	 * Shrink the box by walking through each diagonal snake (SW and NE).
@@ -273,14 +273,14 @@ int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
 	 * be obviously changed.
 	 */
 	if (off1 == lim1) {
-		char *rchg2 = dd2->rchg;
-		long *rindex2 = dd2->rindex;
+		u8 *rchg2 = dd2->rchg;
+		usize *rindex2 = dd2->rindex;
 
 		for (; off2 < lim2; off2++)
 			rchg2[rindex2[off2]] = 1;
 	} else if (off2 == lim2) {
-		char *rchg1 = dd1->rchg;
-		long *rindex1 = dd1->rindex;
+		u8 *rchg1 = dd1->rchg;
+		usize *rindex1 = dd1->rindex;
 
 		for (; off1 < lim1; off1++)
 			rchg1[rindex1[off1]] = 1;
@@ -944,7 +944,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 
 int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
 	xdchange_t *cscr = NULL, *xch;
-	char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
+	u8 *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
 	long i1, i2, l1, l2;
 
 	/*
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
index 126c9d8ff4e4..c766ee115c99 100644
--- a/xdiff/xdiffi.h
+++ b/xdiff/xdiffi.h
@@ -25,10 +25,10 @@
 
 
 typedef struct s_diffdata {
-	long nrec;
-	unsigned long const *ha;
-	long *rindex;
-	char *rchg;
+	usize nrec;
+	u64 const *ha;
+	usize *rindex;
+	u8 *rchg;
 } diffdata_t;
 
 typedef struct s_xdalgoenv {
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 849f218b3277..66b3dfae8bdf 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -48,11 +48,11 @@ DEFINE_IVEC_TYPE(xrecord_t, xrecord);
 
 typedef struct s_xdfile {
 	struct ivec_xrecord record;
-	long dstart, dend;
-	char *rchg;
-	long *rindex;
-	long nreff;
-	unsigned long *ha;
+	isize dstart, dend;
+	u8 *rchg;
+	usize *rindex;
+	usize nreff;
+	u64 *ha;
 } xdfile_t;
 
 typedef struct s_xdfenv {
-- 
gitgitgadget
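
A note on what "ambiguous" means in this patch: the width of C `long` and `unsigned long` depends on the platform ABI, whereas `u64`, `usize` and `isize` have the same meaning on both sides of the FFI boundary. A minimal sketch of how one could check this on a given target follows; the helper name `long_matches_fixed_width` is hypothetical and not part of the series.

```rust
use std::mem::size_of;
use std::os::raw::{c_long, c_ulong};

/// Hypothetical helper: reports whether C `unsigned long` / `long` happen to
/// have the same size as the Rust types the patch switches to (`u64`/`isize`).
/// The answer is platform-dependent, which is why the fields are ambiguous.
fn long_matches_fixed_width() -> bool {
    size_of::<c_ulong>() == size_of::<u64>() && size_of::<c_long>() == size_of::<isize>()
}

/// Sizes that do not depend on the C ABI: `u64` is always 8 bytes and
/// `usize` always matches the pointer width of the target.
fn fixed_sizes() -> (usize, usize) {
    (size_of::<u64>(), size_of::<usize>())
}
```

On targets where `long` is 32 bits wide the first function returns false, so a `#[repr(C)]` struct declared with `u64` would not line up with a C struct declared with `unsigned long`.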


^ permalink raw reply related	[flat|nested] 198+ messages in thread

* [PATCH v3 15/15] xdiff: implement xdl_trim_ends() in Rust
  2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
                       ` (13 preceding siblings ...)
  2025-08-23  3:55     ` [PATCH v3 14/15] xdiff: make xdfile_t more rust friendly Ezekiel Newren via GitGitGadget
@ 2025-08-23  3:55     ` Ezekiel Newren via GitGitGadget
  14 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren via GitGitGadget @ 2025-08-23  3:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	Ben Knoble, Ramsay Jones, Ezekiel Newren, Ezekiel Newren

From: Ezekiel Newren <ezekielnewren@gmail.com>

Replace the C implementation of xdl_trim_ends() with a Rust
implementation.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
---
 rust/xdiff/src/lib.rs      | 14 ++++++++++++++
 rust/xdiff/src/xprepare.rs | 27 +++++++++++++++++++++++++++
 rust/xdiff/src/xtypes.rs   | 19 +++++++++++++++++++
 xdiff/xprepare.c           | 28 +---------------------------
 4 files changed, 61 insertions(+), 27 deletions(-)
 create mode 100644 rust/xdiff/src/xprepare.rs
 create mode 100644 rust/xdiff/src/xtypes.rs

diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
index 8b137891791f..4cc05a7e6b4b 100644
--- a/rust/xdiff/src/lib.rs
+++ b/rust/xdiff/src/lib.rs
@@ -1 +1,15 @@
+pub mod xprepare;
+pub mod xtypes;
 
+use crate::xprepare::trim_ends;
+use crate::xtypes::xdfile;
+
+#[no_mangle]
+unsafe extern "C" fn xdl_trim_ends(xdf1: *mut xdfile, xdf2: *mut xdfile) -> i32 {
+    let xdf1 = xdf1.as_mut().expect("null pointer");
+    let xdf2 = xdf2.as_mut().expect("null pointer");
+
+    trim_ends(xdf1, xdf2);
+
+    0
+}
diff --git a/rust/xdiff/src/xprepare.rs b/rust/xdiff/src/xprepare.rs
new file mode 100644
index 000000000000..f64f60c09965
--- /dev/null
+++ b/rust/xdiff/src/xprepare.rs
@@ -0,0 +1,27 @@
+use crate::xtypes::xdfile;
+
+///
+/// Early trim initial and terminal matching records.
+///
+pub(crate) fn trim_ends(xdf1: &mut xdfile, xdf2: &mut xdfile) {
+    let mut lim = std::cmp::min(xdf1.record.len(), xdf2.record.len());
+
+    for i in 0..lim {
+        if xdf1.record[i].ha != xdf2.record[i].ha {
+            xdf1.dstart = i as isize;
+            xdf2.dstart = i as isize;
+            lim -= i;
+            break;
+        }
+    }
+
+    for i in 0..lim {
+        let f1i = xdf1.record.len() - 1 - i;
+        let f2i = xdf2.record.len() - 1 - i;
+        if xdf1.record[f1i].ha != xdf2.record[f2i].ha {
+            xdf1.dend = f1i as isize;
+            xdf2.dend = f2i as isize;
+            break;
+        }
+    }
+}
diff --git a/rust/xdiff/src/xtypes.rs b/rust/xdiff/src/xtypes.rs
new file mode 100644
index 000000000000..3d1ce9742f28
--- /dev/null
+++ b/rust/xdiff/src/xtypes.rs
@@ -0,0 +1,19 @@
+use interop::ivec::IVec;
+
+#[repr(C)]
+pub(crate) struct xrecord {
+    pub(crate) ptr: *const u8,
+    pub(crate) size: usize,
+    pub(crate) ha: u64,
+}
+
+#[repr(C)]
+pub(crate) struct xdfile {
+    pub(crate) record: IVec<xrecord>,
+    pub(crate) dstart: isize,
+    pub(crate) dend: isize,
+    pub(crate) rchg: *mut u8,
+    pub(crate) rindex: *mut usize,
+    pub(crate) nreff: usize,
+    pub(crate) ha: *mut u64,
+}
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 93370f1c6db4..2c7480875f9f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -318,33 +318,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 }
 
 
-/*
- * Early trim initial and terminal matching records.
- */
-static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
-	long i, lim;
-	xrecord_t *recs1, *recs2;
-
-	recs1 = xdf1->record.ptr;
-	recs2 = xdf2->record.ptr;
-	for (i = 0, lim = XDL_MIN(xdf1->record.length, xdf2->record.length); i < lim;
-	     i++, recs1++, recs2++)
-		if (recs1->ha != recs2->ha)
-			break;
-
-	xdf1->dstart = xdf2->dstart = i;
-
-	recs1 = xdf1->record.ptr + xdf1->record.length - 1;
-	recs2 = xdf2->record.ptr + xdf2->record.length - 1;
-	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
-		if (recs1->ha != recs2->ha)
-			break;
-
-	xdf1->dend = xdf1->record.length - i - 1;
-	xdf2->dend = xdf2->record.length - i - 1;
-
-	return 0;
-}
+extern i32 xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
 
 
 static int xdl_optimize_ctxs(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
-- 
gitgitgadget
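
For comparison with the removed C function: the C code assigns dstart and dend unconditionally after each loop, while the Rust loops in this patch assign them only when a mismatch is found. The sketch below keeps the C behaviour on plain slices of hashes. It is an illustration under that assumption, not the author's implementation, and `trim_bounds` is a hypothetical name.

```rust
/// Hypothetical helper mirroring the removed C xdl_trim_ends() on hash slices.
/// Returns (dstart, dend1, dend2); dstart is shared by both files.
fn trim_bounds(ha1: &[u64], ha2: &[u64]) -> (isize, isize, isize) {
    let lim = ha1.len().min(ha2.len());

    // Count matching records at the start; zip stops at the shorter slice.
    let prefix = ha1.iter().zip(ha2).take_while(|(a, b)| a == b).count();

    // Only the records after the common prefix may count towards the suffix.
    let rest = lim - prefix;
    let suffix = ha1
        .iter()
        .rev()
        .zip(ha2.iter().rev())
        .take(rest)
        .take_while(|(a, b)| a == b)
        .count();

    // As in C, the results are always produced, even with no mismatch at all.
    (
        prefix as isize,
        ha1.len() as isize - suffix as isize - 1,
        ha2.len() as isize - suffix as isize - 1,
    )
}
```

With identical inputs of length 2 this yields dstart = 2 and dend = 1, which is what the C version computes for that case.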

^ permalink raw reply related	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
@ 2025-08-23  8:12       ` Kristoffer Haugsbakk
  2025-08-23  9:29         ` Ezekiel Newren
  2025-08-23 16:14       ` Junio C Hamano
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 198+ messages in thread
From: Kristoffer Haugsbakk @ 2025-08-23  8:12 UTC (permalink / raw)
  To: Josh Soref, git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	D. Ben Knoble, Ramsay Jones, Ezekiel Newren

On Sat, Aug 23, 2025, at 05:55, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
> wrapper functions) in Rust is painful because:

nit: s/git's/Git's/

> [snip]
> diff --git a/rust/interop/src/ivec.rs b/rust/interop/src/ivec.rs
> [snip]
> +        // assert_eq!(vec.capacity, vec.slice.len());

Why are there three commented-out assertions? (all capacity/length)

> +        assert_eq!(expected, vec.length);
> +        assert!(vec.capacity >= expected);
> +        for i in 0..vec.length {
> +            assert_eq!(default_value, vec[i]);
> +        }
> [snip]

-- 
Kristoffer Haugsbakk

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  8:12       ` Kristoffer Haugsbakk
@ 2025-08-23  9:29         ` Ezekiel Newren
  0 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-23  9:29 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: Josh Soref, git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones

On Sat, Aug 23, 2025 at 2:13 AM Kristoffer Haugsbakk
<kristofferhaugsbakk@fastmail.com> wrote:
>
> On Sat, Aug 23, 2025, at 05:55, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <ezekielnewren@gmail.com>
> >
> > Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
> > wrapper functions) in Rust is painful because:
>
> nit: s/git's/Git's/
>
> > [snip]
> > diff --git a/rust/interop/src/ivec.rs b/rust/interop/src/ivec.rs
> > [snip]
> > +        // assert_eq!(vec.capacity, vec.slice.len());
>
> Why are there three commented-out assertions? (all capacity/length)
>
> > +        assert_eq!(expected, vec.length);
> > +        assert!(vec.capacity >= expected);
> > +        for i in 0..vec.length {
> > +            assert_eq!(default_value, vec[i]);
> > +        }
> > [snip]
>
> --
> Kristoffer Haugsbakk

Good catch, I should have removed those commented out lines. Looking
back through the code I also missed calling std::ptr::drop_in_place()
if the IVec shrinks. I'll apply those changes in the next version.
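
To illustrate the drop_in_place() point, here is a minimal sketch of shrinking a raw buffer while running destructors for the removed tail. `Buf` is a hypothetical stand-in with pointer/length/capacity fields, not the IVec from this series, and `Counted` exists only to observe the drops.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ptr;
use std::rc::Rc;

/// Hypothetical stand-in for a C-compatible vector: raw pointer plus lengths.
struct Buf<T> {
    ptr: *mut T,
    length: usize,
    capacity: usize,
}

impl<T> Buf<T> {
    /// Take over the allocation of a Vec without running its destructor.
    fn from_vec(v: Vec<T>) -> Self {
        let mut v = ManuallyDrop::new(v);
        Buf { ptr: v.as_mut_ptr(), length: v.len(), capacity: v.capacity() }
    }

    /// Shrink to `new_len`, dropping every element in the removed tail.
    fn truncate(&mut self, new_len: usize) {
        if new_len >= self.length {
            return;
        }
        let tail_len = self.length - new_len;
        // Update the length first so a panicking destructor cannot lead to a
        // second drop of the same elements later.
        self.length = new_len;
        unsafe {
            let tail = ptr::slice_from_raw_parts_mut(self.ptr.add(new_len), tail_len);
            ptr::drop_in_place(tail);
        }
    }
}

impl<T> Drop for Buf<T> {
    fn drop(&mut self) {
        // Rebuild the Vec so the remaining elements and the allocation are freed.
        unsafe { drop(Vec::from_raw_parts(self.ptr, self.length, self.capacity)) }
    }
}

/// Test helper: increments a shared counter each time a value is dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}
```

Without the drop_in_place() call, the tail elements would never have their destructors run once the length is reduced, which leaks whatever they own.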

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23  3:55     ` [PATCH v3 02/15] xdiff: introduce rust Ezekiel Newren via GitGitGadget
@ 2025-08-23 13:43       ` rsbecker
  2025-08-23 14:26         ` Kristoffer Haugsbakk
  2025-08-23 14:29         ` Ezekiel Newren
  0 siblings, 2 replies; 198+ messages in thread
From: rsbecker @ 2025-08-23 13:43 UTC (permalink / raw)
  To: 'Ezekiel Newren via GitGitGadget', git
  Cc: 'Elijah Newren', 'brian m. carlson',
	'Taylor Blau', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Ezekiel Newren'

On August 22, 2025 11:56 PM, Ezekiel Newren wrote:
>From: Ezekiel Newren <ezekielnewren@gmail.com>
>
>Upcoming patches will simplify xdiff, while also porting parts of it to Rust. In
>preparation, add some stubs and setup the Rust build. For now, it is easier to let
>cargo build rust and have make or meson merely link against the static library that
>cargo builds. In line with ongoing libification efforts, use multiple crates to allow
>more modularity on the Rust side. xdiff is the crate that this series will focus on, but
>we also introduce the interop crate for future patch series.
>
>In order to facilitate interoperability between C and Rust, introduce C definitions for
>Rust primitive types in git-compat-util.h.
>
>Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
>---
> .gitignore              |  3 +++
> Makefile                | 53 ++++++++++++++++++++++++++++----------
> build_rust.sh           | 57 +++++++++++++++++++++++++++++++++++++++++
> git-compat-util.h       | 17 ++++++++++++
> meson.build             | 52 +++++++++++++++++++++++++++++++------
> rust/Cargo.toml         |  6 +++++
> rust/interop/Cargo.toml | 14 ++++++++++
> rust/interop/src/lib.rs |  0
> rust/xdiff/Cargo.toml   | 15 +++++++++++
> rust/xdiff/src/lib.rs   |  0
> 10 files changed, 196 insertions(+), 21 deletions(-)
> create mode 100755 build_rust.sh
> create mode 100644 rust/Cargo.toml
> create mode 100644 rust/interop/Cargo.toml
> create mode 100644 rust/interop/src/lib.rs
> create mode 100644 rust/xdiff/Cargo.toml
> create mode 100644 rust/xdiff/src/lib.rs
>
>diff --git a/.gitignore b/.gitignore
>index 04c444404e4b..ff81e3580c4e 100644
>--- a/.gitignore
>+++ b/.gitignore
>@@ -254,3 +254,6 @@ Release/
> /contrib/buildsystems/out
> /contrib/libgit-rs/target
> /contrib/libgit-sys/target
>+/.idea/
>+/rust/target/
>+/rust/Cargo.lock
>diff --git a/Makefile b/Makefile
>index 70d1543b6b86..1ec0c1ee6603 100644
>--- a/Makefile
>+++ b/Makefile
>@@ -919,6 +919,29 @@ TEST_SHELL_PATH = $(SHELL_PATH)
>
> LIB_FILE = libgit.a
> XDIFF_LIB = xdiff/lib.a
>+
>+EXTLIBS =
>+
>+ifeq ($(DEBUG), 1)
>+  RUST_BUILD_MODE = debug
>+else
>+  RUST_BUILD_MODE = release
>+endif
>+
>+RUST_TARGET_DIR = rust/target/$(RUST_BUILD_MODE)
>+RUST_FLAGS_FOR_C = -L$(RUST_TARGET_DIR)
>+
>+.PHONY: compile_rust
>+compile_rust:
>+	./build_rust.sh . $(RUST_BUILD_MODE) xdiff
>+
>+EXTLIBS += ./$(RUST_TARGET_DIR)/libxdiff.a
>+
>+UNAME_S := $(shell uname -s)
>+ifeq ($(UNAME_S),Linux)
>+  EXTLIBS += -ldl
>+endif
>+
> REFTABLE_LIB = reftable/libreftable.a
>
> GENERATED_H += command-list.h
>@@ -1390,7 +1413,7 @@ UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/lib-reftable.o
>
> # xdiff and reftable libs may in turn depend on what is in libgit.a
> GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
>-EXTLIBS =
>+
>
> GIT_USER_AGENT = git/$(GIT_VERSION)
>
>@@ -2541,7 +2564,7 @@ git.sp git.s git.o: EXTRA_CPPFLAGS = \
> 	'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
> 	'-DGIT_INFO_PATH="$(infodir_relative_SQ)"'
>
>-git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
>+git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
> 		$(filter %.o,$^) $(LIBS)
>
>@@ -2891,17 +2914,17 @@ headless-git.o: compat/win32/headless.c GIT-CFLAGS
> headless-git$X: headless-git.o git.res GIT-LDFLAGS
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) $(ALL_LDFLAGS) -mwindows -o $@ $< git.res
>
>-git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
>+git-%$X: %.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
>
>-git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS)
>+git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> 		$(IMAP_SEND_LDFLAGS) $(LIBS)
>
>-git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
>+git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> 		$(CURL_LIBCURL) $(LIBS)
>-git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
>+git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
>
>@@ -2911,11 +2934,11 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
> 	ln -s $< $@ 2>/dev/null || \
> 	cp $< $@
>
>-$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS)
>+$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
>
>-scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS)
>+scalar$X: scalar.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
> 		$(filter %.o,$^) $(LIBS)
>
>@@ -2925,6 +2948,7 @@ $(LIB_FILE): $(LIB_OBJS)
> $(XDIFF_LIB): $(XDIFF_OBJS)
> 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
>
>+
> $(REFTABLE_LIB): $(REFTABLE_OBJS)
> 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
>
>@@ -3294,7 +3318,7 @@ perf: all
>
> t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) $(UNIT_TEST_DIR)/test-lib.o
>
>-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
>+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) compile_rust
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
>
> check-sha1:: t/helper/test-tool$X
>@@ -3756,7 +3780,10 @@ cocciclean:
> 	$(RM) -r .build/contrib/coccinelle
> 	$(RM) contrib/coccinelle/*.cocci.patch
>
>-clean: profile-clean coverage-clean cocciclean
>+rustclean:
>+	cd rust && cargo clean
>+
>+clean: profile-clean coverage-clean cocciclean rustclean
> 	$(RM) -r .build $(UNIT_TEST_BIN)
> 	$(RM) GIT-TEST-SUITES
> 	$(RM) po/git.pot po/git-core.pot
>@@ -3911,13 +3938,13 @@ FUZZ_CXXFLAGS ?= $(ALL_CFLAGS)
> .PHONY: fuzz-all
> fuzz-all: $(FUZZ_PROGRAMS)
>
>-$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS
>+$(FUZZ_PROGRAMS): %: %.o oss-fuzz/dummy-cmd-main.o $(GITLIBS) GIT-LDFLAGS compile_rust
> 	$(QUIET_LINK)$(FUZZ_CXX) $(FUZZ_CXXFLAGS) -o $@ $(ALL_LDFLAGS) \
> 		-Wl,--allow-multiple-definition \
> 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS) $(LIB_FUZZING_ENGINE)
>
> $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_OBJS) \
>-	$(GITLIBS) GIT-LDFLAGS
>+	$(GITLIBS) GIT-LDFLAGS compile_rust
> 	$(call mkdir_p_parent_template)
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
> 		$(filter %.o,$^) $(filter %.a,$^) $(LIBS)
>@@ -3936,7 +3963,7 @@ $(UNIT_TEST_DIR)/clar.suite: $(UNIT_TEST_DIR)/clar-decls.h $(UNIT_TEST_DIR)/gene
> $(UNIT_TEST_DIR)/clar/clar.o: $(UNIT_TEST_DIR)/clar.suite
> $(CLAR_TEST_OBJS): $(UNIT_TEST_DIR)/clar-decls.h
> $(CLAR_TEST_OBJS): EXTRA_CPPFLAGS = -I$(UNIT_TEST_DIR)
>-$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS
>+$(CLAR_TEST_PROG): $(UNIT_TEST_DIR)/clar.suite $(CLAR_TEST_OBJS) $(GITLIBS) GIT-LDFLAGS compile_rust
> 	$(call mkdir_p_parent_template)
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
>
>diff --git a/build_rust.sh b/build_rust.sh
>new file mode 100755
>index 000000000000..192385a1d961
>--- /dev/null
>+++ b/build_rust.sh
>@@ -0,0 +1,57 @@
>+#!/bin/sh
>+
>+
>+rustc -vV || exit $?
>+cargo --version || exit $?
>+
>+dir_git_root=${0%/*}
>+dir_build=$1
>+rust_build_profile=$2
>+crate=$3
>+
>+dir_rust=$dir_git_root/rust
>+
>+if [ "$dir_git_root" = "" ]; then
>+  echo "did not specify the directory for the root of git"
>+  exit 1
>+fi
>+
>+if [ "$dir_build" = "" ]; then
>+  echo "did not specify the build directory"
>+  exit 1
>+fi
>+
>+if [ "$rust_build_profile" = "" ]; then
>+  echo "did not specify the rust_build_profile"
>+  exit 1
>+fi
>+
>+if [ "$rust_build_profile" = "release" ]; then
>+  rust_args="--release"
>+  export RUSTFLAGS=''
>+elif [ "$rust_build_profile" = "debug" ]; then
>+  rust_args=""
>+  export RUSTFLAGS='-C debuginfo=2 -C opt-level=1 -C force-frame-pointers=yes'
>+else
>+  echo "illegal rust_build_profile value $rust_build_profile"
>+  exit 1
>+fi
>+
>+cd $dir_rust && cargo clean && pwd && cargo build -p $crate $rust_args;
>+cd $dir_git_root
>+
>+libfile="lib${crate}.a"
>+if rustup show active-toolchain | grep windows-msvc; then
>+  libfile="${crate}.lib"
>+fi
>+dst=$dir_build/$libfile
>+
>+if [ "$dir_git_root" != "$dir_build" ]; then
>+  src=$dir_rust/target/$rust_build_profile/$libfile
>+  if [ ! -f $src ]; then
>+    echo >&2 "::error:: cannot find path of static library $src is not a file or does not exist"
>+    exit 5
>+  fi
>+
>+  rm $dst 2>/dev/null
>+  mv $src $dst
>+fi
>diff --git a/git-compat-util.h b/git-compat-util.h
>index 4678e21c4cb8..82dc99764ac0 100644
>--- a/git-compat-util.h
>+++ b/git-compat-util.h
>@@ -196,6 +196,23 @@ static inline int is_xplatform_dir_sep(int c)
> #include "compat/msvc.h"
> #endif
>
>+/* rust types */
>+typedef uint8_t   u8;
>+typedef uint16_t  u16;
>+typedef uint32_t  u32;
>+typedef uint64_t  u64;
>+
>+typedef int8_t    i8;
>+typedef int16_t   i16;
>+typedef int32_t   i32;
>+typedef int64_t   i64;
>+
>+typedef float     f32;
>+typedef double    f64;
>+
>+typedef size_t    usize;
>+typedef ptrdiff_t isize;
>+
> /* used on Mac OS X */
> #ifdef PRECOMPOSE_UNICODE
> #include "compat/precompose_utf8.h"
>diff --git a/meson.build b/meson.build
>index 596f5ac7110e..324f968338b9 100644
>--- a/meson.build
>+++ b/meson.build
>@@ -267,6 +267,40 @@ version_gen_environment.set('GIT_DATE', get_option('build_date'))
> version_gen_environment.set('GIT_USER_AGENT', get_option('user_agent'))
> version_gen_environment.set('GIT_VERSION', get_option('version'))
>
>+if get_option('optimization') in ['2', '3', 's', 'z']
>+  rust_build_profile = 'release'
>+else
>+  rust_build_profile = 'debug'
>+endif
>+
>+# Run `rustup show active-toolchain` and capture output
>+rustup_out = run_command('rustup', 'show', 'active-toolchain',
>+                         check: true).stdout().strip()
>+
>+rust_crates = ['xdiff']
>+rust_builds = []
>+
>+foreach crate : rust_crates
>+  if rustup_out.contains('windows-msvc')
>+    libfile = crate + '.lib'
>+  else
>+    libfile = 'lib' + crate + '.a'
>+  endif
>+
>+  rust_builds += custom_target(
>+    'rust_build_'+crate,
>+    output: libfile,
>+    build_by_default: true,
>+    build_always_stale: true,
>+    command: [
>+      meson.project_source_root() / 'build_rust.sh',
>+      meson.current_build_dir(), rust_build_profile, crate,
>+    ],
>+    install: false,
>+  )
>+endforeach
>+
>+
> compiler = meson.get_compiler('c')
>
> libgit_sources = [
>@@ -1678,14 +1712,16 @@ version_def_h = custom_target(
> libgit_sources += version_def_h
>
> libgit = declare_dependency(
>-  link_with: static_library('git',
>-    sources: libgit_sources,
>-    c_args: libgit_c_args + [
>-      '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
>-    ],
>-    dependencies: libgit_dependencies,
>-    include_directories: libgit_include_directories,
>-  ),
>+  link_with: [
>+    static_library('git',
>+      sources: libgit_sources,
>+      c_args: libgit_c_args + [
>+        '-DGIT_VERSION_H="' + version_def_h.full_path() + '"',
>+      ],
>+      dependencies: libgit_dependencies,
>+      include_directories: libgit_include_directories,
>+    ),
>+  ] + rust_builds,
>   compile_args: libgit_c_args,
>   dependencies: libgit_dependencies,
>   include_directories: libgit_include_directories,
>diff --git a/rust/Cargo.toml b/rust/Cargo.toml
>new file mode 100644
>index 000000000000..ed3d79d7f827
>--- /dev/null
>+++ b/rust/Cargo.toml
>@@ -0,0 +1,6 @@
>+[workspace]
>+members = [
>+    "xdiff",
>+    "interop",
>+]
>+resolver = "2"
>diff --git a/rust/interop/Cargo.toml b/rust/interop/Cargo.toml
>new file mode 100644
>index 000000000000..045e3b01cfad
>--- /dev/null
>+++ b/rust/interop/Cargo.toml
>@@ -0,0 +1,14 @@
>+[package]
>+name = "interop"
>+version = "0.1.0"
>+edition = "2021"
>+
>+[lib]
>+name = "interop"
>+path = "src/lib.rs"
>+## staticlib to generate xdiff.a for use by gcc
>+## cdylib (optional) to generate xdiff.so for use by gcc
>+## rlib is required by the rust unit tests
>+crate-type = ["staticlib", "rlib"]
>+
>+[dependencies]
>diff --git a/rust/interop/src/lib.rs b/rust/interop/src/lib.rs
>new file mode 100644
>index 000000000000..e69de29bb2d1
>diff --git a/rust/xdiff/Cargo.toml b/rust/xdiff/Cargo.toml
>new file mode 100644
>index 000000000000..eb7966aada64
>--- /dev/null
>+++ b/rust/xdiff/Cargo.toml
>@@ -0,0 +1,15 @@
>+[package]
>+name = "xdiff"
>+version = "0.1.0"
>+edition = "2021"
>+
>+[lib]
>+name = "xdiff"
>+path = "src/lib.rs"
>+## staticlib to generate xdiff.a for use by gcc
>+## cdylib (optional) to generate xdiff.so for use by gcc
>+## rlib is required by the rust unit tests
>+crate-type = ["staticlib", "rlib"]
>+
>+[dependencies]
>+interop = { path = "../interop" }
>diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
>new file mode 100644
>index 000000000000..e69de29bb2d1

Does this introduce Rust as a mandatory dependency for git? If so, it cuts out
numerous platforms.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 13:43       ` rsbecker
@ 2025-08-23 14:26         ` Kristoffer Haugsbakk
  2025-08-23 15:06           ` rsbecker
  2025-08-23 14:29         ` Ezekiel Newren
  1 sibling, 1 reply; 198+ messages in thread
From: Kristoffer Haugsbakk @ 2025-08-23 14:26 UTC (permalink / raw)
  To: rsbecker, Josh Soref, git
  Cc: Elijah Newren, brian m. carlson, Taylor Blau, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Patrick Steinhardt,
	Sam James, Collin Funk, Mike Hommey, Pierre-Emmanuel Patry,
	D. Ben Knoble, Ramsay Jones, Ezekiel Newren

On Sat, Aug 23, 2025, at 15:43, rsbecker@nexbridge.com wrote:
> On August 22, 2025 11:56 PM, Ezekiel Newren wrote:
>>From: Ezekiel Newren <ezekielnewren@gmail.com>
>>
>>Upcoming patches will simplify xdiff, while also porting parts of it to Rust. In
>>preparation, add some stubs and setup the Rust build. For now, it is easier to let
>>cargo build rust and have make or meson merely link against the static library that
>>cargo builds. In line with ongoing libification efforts, use multiple crates to allow
>>more modularity on the Rust side. xdiff is the crate that this series will focus on, but
>>we also introduce the interop crate for future patch series.
>>
>>In order to facilitate interoperability between C and Rust, introduce C definitions for
>>Rust primitive types in git-compat-util.h.
>>
>>Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
>>[snip snip]
>
> Does this introduce Rust as a mandatory dependency for git? If so, it cuts out
> numerous platforms.

The proposed platform support policy is in patch 1.

https://lore.kernel.org/git/6d065f550fe871cf010409f7bd2a63438cf52723.1755921357.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 13:43       ` rsbecker
  2025-08-23 14:26         ` Kristoffer Haugsbakk
@ 2025-08-23 14:29         ` Ezekiel Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-23 14:29 UTC (permalink / raw)
  To: rsbecker
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones

On Sat, Aug 23, 2025 at 7:44 AM <rsbecker@nexbridge.com> wrote:

> Does this introduce Rust as a mandatory dependency for git? If so, it cuts out
> numerous platforms.

Yes it does.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 14:26         ` Kristoffer Haugsbakk
@ 2025-08-23 15:06           ` rsbecker
  2025-08-23 18:30             ` Elijah Newren
  0 siblings, 1 reply; 198+ messages in thread
From: rsbecker @ 2025-08-23 15:06 UTC (permalink / raw)
  To: 'Kristoffer Haugsbakk', 'Josh Soref', git
  Cc: 'Elijah Newren', 'brian m. carlson',
	'Taylor Blau', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren'

On August 23, 2025 10:26 AM, Kristoffer Haugsbakk wrote:
>On Sat, Aug 23, 2025, at 15:43, rsbecker@nexbridge.com wrote:
>> On August 22, 2025 11:56 PM, Ezekiel Newren wrote:
>>>From: Ezekiel Newren <ezekielnewren@gmail.com>
>>>
>>>Upcoming patches will simplify xdiff, while also porting parts of it
>>>to Rust. In preparation, add some stubs and setup the Rust build. For
>>>now, it is easier to let cargo build rust and have make or meson
>>>merely link against the static library that cargo builds. In line with
>>>ongoing libification efforts, use multiple crates to allow more
>>>modularity on the Rust side. xdiff is the crate that this series will focus on, but
>>>we also introduce the interop crate for future patch series.
>>>
>>>In order to facilitate interoperability between C and Rust, introduce
>>>C definitions for Rust primitive types in git-compat-util.h.
>>>
>>>Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
>>>[snip snip]
>>
>> Does this introduce Rust as a mandatory dependency for git? If so, it
>> cuts out numerous platforms.
>
>The proposed platform support policy is in patch 1.
>
>https://lore.kernel.org/git/6d065f550fe871cf010409f7bd2a63438cf52723.1755921357.git.gitgitgadget@gmail.com/

It is a very disappointing policy to be honest. It kicks me off git because Rust is
not available on my platform, representing tens of thousands of users in North
America alone. Rust is not available, but may be in a few years, but there is no
guarantee that the hardware vendor (HPE) will provide support. I previously
commented about the problem with Rust and was not taken seriously. This is
disappointing and exclusionary.

The assertion in the policy that Rust is easily interoperable is incorrect.

Not Thanks,
Randall


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
  2025-08-23  8:12       ` Kristoffer Haugsbakk
@ 2025-08-23 16:14       ` Junio C Hamano
  2025-08-23 16:37         ` Ezekiel Newren
  2025-08-23 18:05       ` Junio C Hamano
  2025-08-24 13:31       ` Ben Knoble
  3 siblings, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-08-23 16:14 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones,
	Ezekiel Newren

"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> index e69de29bb2d1..8b137891791f 100644
> --- a/rust/xdiff/src/lib.rs
> +++ b/rust/xdiff/src/lib.rs
> @@ -0,0 +1 @@
> +

This triggers a "new blank line at EOF" whitespace error while
applying.  Intended?


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23 16:14       ` Junio C Hamano
@ 2025-08-23 16:37         ` Ezekiel Newren
  0 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-23 16:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones

On Sat, Aug 23, 2025 at 10:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> > index e69de29bb2d1..8b137891791f 100644
> > --- a/rust/xdiff/src/lib.rs
> > +++ b/rust/xdiff/src/lib.rs
> > @@ -0,0 +1 @@
> > +
>
> This triggers a "new blank line at EOF" whitespace error while
> applying.  Intended?

"new blank line at EOF" is intentional, but it is showing up in the
wrong place in this patch series. Cargo format automatically creates a
blank line for empty files. These warnings should have shown up on the
"xdiff: introduce rust" commit. I will fix this.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
  2025-08-23  8:12       ` Kristoffer Haugsbakk
  2025-08-23 16:14       ` Junio C Hamano
@ 2025-08-23 18:05       ` Junio C Hamano
  2025-08-23 20:29         ` Ezekiel Newren
  2025-08-25 19:16         ` Elijah Newren
  2025-08-24 13:31       ` Ben Knoble
  3 siblings, 2 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-23 18:05 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones,
	Ezekiel Newren

"Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Ezekiel Newren <ezekielnewren@gmail.com>
>
> Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
> wrapper functions) in Rust is painful because:
>
>   * C doing vector things the Rust way would require wrapper functions,
>     and Rust doing vector things the C way would require wrapper
>     ...
>   * Currently, Rust defines its own 'Vec' type that is generic, but its
>     memory allocator and struct layout weren't designed for
>     interoperability with C (or any language for that matter), meaning
>     ...
>   * Similarly, git defines ALLOC_GROW() and related macros in
>     git-compat-util.h. While we could add functions allowing Rust to
>     ...

All the good reasons any C (or any non-Rust language for that
matter) projects would want to have an interoperability Shim
for their dynamically allocated and grown array-like things.

> To address these issues, introduce a new type, ivec -- short for
> interoperable vector. (We refer to it as 'ivec' generally, though on
> the Rust side the struct is called IVec to match Rust style.) 

I however was hoping by now Rust getting used more widely, somebody
has already created a generic "this is how you make C-array and Rust
vectors interoperate" wrapper that latecomer projects like us can
use without inventing our own.

> +INTEROP_OBJS += interop/ivec.o
> +.PHONY: interop-objs
> +interop-objs: $(INTEROP_OBJS)

What is this phony target used for?  No other targets seem to depend
on this one (I am wondering if we need the latter two lines).

> diff --git a/interop/ivec.c b/interop/ivec.c
> new file mode 100644
> index 000000000000..9bc2258c04ad
> --- /dev/null
> +++ b/interop/ivec.c

I am wondering if this needs a new hierarchy "interop"; shouldn't
the existing "compat" be a good fit enough?  I dunno.

Even though this is a shim to somebody else's code, it still is a
part of our codebase, so our CodingGuidelines for C programs should
apply.  

> @@ -0,0 +1,151 @@
> +#include "ivec.h"
> +
> +static void ivec_set_capacity(void* self, usize new_capacity) {
> +	struct rawivec *this = self;

 - Asterisk sticks to the variable, not type.

 - The opening and closing {braces} for the function body are
   written at the leftmost column on its own line.

 - There should be a blank line between the declarations and the
   first statement.

> +	if (new_capacity == 0)
> +		FREE_AND_NULL(this->ptr);
> +	else
> +		this->ptr = xrealloc(this->ptr, new_capacity * this->element_size);
> +	this->capacity = new_capacity;
> +}
> +
> +void ivec_init(void* self, usize element_size) {
> +	struct rawivec *this = self;
> +	this->ptr = NULL;
> +	this->length = 0;
> +	this->capacity = 0;
> +	this->element_size = element_size;
> +}

I notice that this reintroduces a variable named "this", which was
eradicated in 585c0e2e (diff: rename 'this' variables, 2018-02-14).

I do not think those who want to use C++ compilers on our C code
would mind "self", so how about doing something like...


	void ivec_init(void *self_, usize element_size)
	{
		struct rawivec *self = self_;

		self->ptr = NULL;
		self->length = 0;
		self->capacity = 0;
		self->element_size = element_size;
	}

... perhaps?

> diff --git a/interop/ivec.h b/interop/ivec.h
> new file mode 100644
> index 000000000000..98be4bbeb54a
> --- /dev/null
> +++ b/interop/ivec.h
> @@ -0,0 +1,52 @@
> +#ifndef IVEC_H
> +#define IVEC_H
> +
> +#include "../git-compat-util.h"

As we use -I. on the command line, there is no need to add "../"
here; just writing

	#include <git-compat-util.h>

should be enough.  Also, if this file does not depend on the
services compat-util header provides (and I do not think it does
from a brief look at its contents), it is better not to include it.

Instead, the sources (like ivec.c next door we just saw) should
begin themselves with #include of git-compat-util.h header before
including ivec.h.

> diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> index e69de29bb2d1..8b137891791f 100644
> --- a/rust/xdiff/src/lib.rs
> +++ b/rust/xdiff/src/lib.rs
> @@ -0,0 +1 @@
> +

If this empty line in an otherwise empty file is absolutely
necessary to make Rust work, then please arrange .gitattributes to
tell git that this file is exempt from the usual blank-at-eof
whitespace rule we use.  If not, remove that unnecessary empty line.

Or perhaps remove the file altogether if nobody looks at it???

In any case, given that our top-level .gitattributes file starts
with

    * whitespace=!indent,trail,space
    *.[ch] whitespace=indent,trail,space diff=cpp
    *.sh whitespace=indent,trail,space text eol=lf
    ...
    *.bat text eol=crlf
    CODE_OF_CONDUCT.md -whitespace
    ...

I think a new rule to cover "*.rs" and perhaps *.toml files right
before rules for each specific file begin would be in order.

Thanks.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 15:06           ` rsbecker
@ 2025-08-23 18:30             ` Elijah Newren
  2025-08-23 19:24               ` brian m. carlson
  2025-08-27  1:57               ` Taylor Blau
  0 siblings, 2 replies; 198+ messages in thread
From: Elijah Newren @ 2025-08-23 18:30 UTC (permalink / raw)
  To: rsbecker
  Cc: Kristoffer Haugsbakk, Josh Soref, git, brian m. carlson,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

Hi Randall,

On Sat, Aug 23, 2025 at 8:06 AM <rsbecker@nexbridge.com> wrote:
>
> On August 23, 2025 10:26 AM, Kristoffer Haugsbakk wrote:
> >On Sat, Aug 23, 2025, at 15:43, rsbecker@nexbridge.com wrote:
[...]
> >> Does this introduce Rust as a mandatory dependency for git? If so, it
> >> cuts out numerous platforms.
> >
> >The proposed platform support policy is in patch 1.
> >
> >https://lore.kernel.org/git/6d065f550fe871cf010409f7bd2a63438cf52723.1755
> >921357.git.gitgitgadget@gmail.com/
>
> It is a very disappointing policy to be honest. It kicks me off git because Rust is
> not available on my platform, representing tens of thousands of users in North
> America alone. Rust is not available, but may be in a few years, but there is no
> guarantee that the hardware vendor (HPE) will provide support. I previously
> commented about the problem with Rust and was not taken seriously. This is
> disappointing and exclusionary.

I don't think that's fair.  A quick reminder on the history: There was
lots of excitement about potentially introducing Rust two years ago at
our virtual Git contributors conference.  Taylor formally proposed
adopting it on the mailing list a year and a half ago.  And at Git
Merge last year, among those in attendance, there was broad
significant interest in adopting Rust with unanimous support for
letting it move forward among those that were present (which, yes, we
know wasn't everyone).  And there's the three rounds so far of this
patch series.  At every discussion where you weren't present, someone
else would always bring up you and NonStop, and point out how you've
been a very positive long-term member of the Git community and how
Rust adoption would likely negatively affect you, which would be
regrettable.  We waited years to adopt Rust precisely (and I believe
solely) because of your objections.  Josh and Calvin even went the
route of making optional not-even-built-by-default Rust libraries
(libgit-rs and libgit-sys) when they wanted to add some Rust bindings.
If years of deference by other community members isn't considered
taking you seriously, I don't know what is.

I agree that it is disappointing that there isn't a clear way to both
gain the compelling advantages of Rust while also retaining the full
current extent of our widespread platform support.  It's doubly
unfortunate since you're such a positive contributing member of the
community.  But not allowing us to ever gain the advantages of Rust is
problematic too.  So, a decision has to be made, one way or the other.

If it helps, here's the statements I've seen from long term community
members on Ezekiel's proposal for a hard dependency so far, most of
which call out the reduced platform support (whether in favor of the
proposal or not):
  * Randall: https://lore.kernel.org/git/031601dc143f$7a9a25d0$6fce7170$@nexbridge.com/
  * brian: https://lore.kernel.org/git/aHlwZPbiKnakMN75@fruit.crustytoothpaste.net/
  * Taylor: https://lore.kernel.org/git/aHl4U98BBvpA5eKF@nand.local/
  * Patrick: https://lore.kernel.org/git/aH-CN0RYFmpm7fMt@pks.im/
  * Phillip: https://lore.kernel.org/git/f439958d-64ce-417f-8175-720f69387d48@gmail.com/

There's also been some emails that can be read as implicitly making a
position statement on the topic from long term community members:
  * Junio: https://lore.kernel.org/git/xmqqzfd12ujv.fsf@gitster.g/
  * Johannes: https://lore.kernel.org/git/ac871bc4-df93-31f4-55f2-d6fc538a422d@gmx.de/
  * Elijah: https://lore.kernel.org/git/pull.1980.git.git.1752784344.gitgitgadget@gmail.com/
(I figured my noted assistance of this series meant I didn't need to
explicitly call out my support for it.)

> The assertion in the policy that Rust is easily interoperable is incorrect.

Are you mixing up interoperability with portability?  Without further
context than your email provides, it appears so to me.  Rust code can
call C code and vice-versa within the same process without huge
amounts of serializing and deserializing of data structures, and
without what amounts to something close to an operating system context
switch in order to ensure call stacks are as expected for the language
in question.  To me, that means we can call the two languages easily
interoperable.  On the other hand, portability of those languages is
about whether those languages have compilers supported on various
hardware platforms.  The document explicitly calls out that fewer
systems have a Rust compiler than have a C compiler, and that Rust
adoption would thus reduce how portable Git is.  Are you referring to
this lower portability that the document itself also calls out, or are
you pointing out additional issues with interoperation between the
languages on a platform where compilers for both languages exist?  If
the latter, could you provide more details?
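
To make the distinction concrete, here is a minimal sketch of in-process
interoperability, assuming a platform where both a C library and a Rust
toolchain exist. It binds the C library's strlen() and also defines a Rust
function with the C calling convention. The names c_strlen and add_sat are
made up for this illustration and are not part of the series.

```rust
use std::ffi::{c_char, CStr};

// Declaration of a function that lives in the platform's C library.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Rust calling C: no serialization, just a pointer handed across.
pub fn c_strlen(s: &CStr) -> usize {
    // SAFETY: a CStr is always a valid NUL-terminated buffer.
    unsafe { strlen(s.as_ptr()) }
}

/// C calling Rust: a function that uses the C calling convention.
/// (Hypothetical; a real export would also need an unmangled symbol name.)
pub extern "C" fn add_sat(a: u32, b: u32) -> u32 {
    a.saturating_add(b)
}
```

Whether such calls can be built at all on a given platform is the separate
portability question.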


I know my email is probably disappointing to you, at a minimum.  I'm
sorry about that.  I hope it's helpful, at least in having links to
where various folks in the community stand if nothing else.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 18:30             ` Elijah Newren
@ 2025-08-23 19:24               ` brian m. carlson
  2025-08-23 20:04                 ` rsbecker
                                   ` (2 more replies)
  2025-08-27  1:57               ` Taylor Blau
  1 sibling, 3 replies; 198+ messages in thread
From: brian m. carlson @ 2025-08-23 19:24 UTC (permalink / raw)
  To: Elijah Newren
  Cc: rsbecker, Kristoffer Haugsbakk, Josh Soref, git, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

[-- Attachment #1: Type: text/plain, Size: 3926 bytes --]

On 2025-08-23 at 18:30:26, Elijah Newren wrote:
> I don't think that's fair.  A quick reminder on the history: There was
> lots of excitement about potentially introducing Rust two years ago at
> our virtual Git contributors conference.  Taylor formally proposed
> adopting it on the mailing list a year and a half ago.  And at Git
> Merge last year, among those in attendance, there was broad
> significant interest in adopting Rust with unanimous support for
> letting it move forward among those that were present (which, yes, we
> know wasn't everyone).  And there's the three rounds so far of this
> patch series.  At every discussion where you weren't present, someone
> else would always bring up you and NonStop, and point out how you've
> been a very positive long-term member of the Git community and how
> Rust adoption would likely negatively affect you, which would be
> regrettable.  We waited years to adopt Rust precisely (and I believe
> solely) because of your objections.  Josh and Calvin even went the
> route of making optional not-even-built-by-default Rust libraries
> (libgit-rs and libgit-sys) when they wanted to add some Rust bindings.
> If years of deference by other community members isn't considered
> taking you seriously, I don't know what is.
> 
> I agree that it is disappointing that there isn't a clear way to both
> gain the compelling advantages of Rust while also retaining the full
> current extent of our widespread platform support.  It's doubly
> unfortunate since you're such a positive contributing member of the
> community.  But not allowing us to ever gain the advantages of Rust is
> problematic too.  So, a decision has to be made, one way or the other.

I think it's worth saying that I do appreciate your (Randall's) positive
contributions as well and I would love some way to continue to support
NonStop as we adopt Rust.  To be clear, I care deeply about portability:
I have owned PowerPC, UltraSPARC, MIPS, and ARM hardware, and I test
many of my personal projects on at least Linux, FreeBSD, and NetBSD.

There is an alternative Rust compiler, mrustc[0], which is written in
C++ and that I have played around with to see if it could meet our
needs.  I've been very busy lately and haven't had the time to test it
out fully, and although it will likely require some upstream changes for
static libraries and a compatibility wrapper because its minicargo is
very limited in functionality, it might be an option that we could
leverage.  There will necessarily be work on Rust upstream as well, but
I'm hoping that mrustc will at least open doors for us.

I also think that Rust is becoming a more and more common language in
technology because of its interoperability with C and its memory safety.
The support policy I wrote up explains why there is an increasing push
from governments, security professionals, and the technology industry
for memory-safe languages.  If Git is to continue its success and broad
adoption, we don't want it to be labelled software that is using
security anti-patterns, and we also don't want it to be a CVE factory
like libxml2 or ImageMagick.  This is the reason I ultimately started
work on the SHA-256 project many years ago: I knew we'd need to do it
for security reasons and that without a more secure hash algorithm, Git
would eventually be dropped.

My hope is that NonStop can find some way to support Rust because I
think it's a compelling language and NonStop would greatly benefit from
the wider variety of software available.  My sense of previous
discussions was that we do very much want NonStop to continue to come
along as we support Rust in Git and that if there are ways we make it
easier for both, we'd want to do that.  That's certainly my view, at
least.

[0] https://github.com/thepowersgang/mrustc
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 19:24               ` brian m. carlson
@ 2025-08-23 20:04                 ` rsbecker
  2025-08-23 20:36                 ` Sam James
  2025-08-23 21:17                 ` Haelwenn (lanodan) Monnier
  2 siblings, 0 replies; 198+ messages in thread
From: rsbecker @ 2025-08-23 20:04 UTC (permalink / raw)
  To: 'brian m. carlson', 'Elijah Newren'
  Cc: 'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Taylor Blau', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On August 23, 2025 3:25 PM, brian m. carlson wrote:
>On 2025-08-23 at 18:30:26, Elijah Newren wrote:
>> I don't think that's fair.  A quick reminder on the history: There was
>> lots of excitement about potentially introducing Rust two years ago at
>> our virtual Git contributors conference.  Taylor formally proposed
>> adopting it on the mailing list a year and a half ago.  And at Git
>> Merge last year, among those in attendance, there was broad
>> significant interest in adopting Rust with unanimous support for
>> letting it move forward among those that were present (which, yes, we
>> know wasn't everyone).  And there's the three rounds so far of this
>> patch series.  At every discussion where you weren't present, someone
>> else would always bring up you and NonStop, and point out how you've
>> been a very positive long-term member of the Git community and how
>> Rust adoption would likely negatively affect you, which would be
>> regrettable.  We waited years to adopt Rust precisely (and I believe
>> solely) because of your objections.  Josh and Calvin even went the
>> route of making optional not-even-built-by-default Rust libraries
>> (libgit-rs and libgit-sys) when they wanted to add some Rust bindings.
>> If years of deference by other community members isn't considered
>> taking you seriously, I don't know what is.
>>
>> I agree that it is disappointing that there isn't a clear way to both
>> gain the compelling advantages of Rust while also retaining the full
>> current extent of our widespread platform support.  It's doubly
>> unfortunate since you're such a positive contributing member of the
>> community.  But not allowing us to ever gain the advantages of Rust is
>> problematic too.  So, a decision has to be made, one way or the other.
>
>I think it's worth saying that I do appreciate your (Randall's) positive contributions
>as well and I would love some way to continue to support NonStop as we adopt
>Rust.  To be clear, I care deeply about portability:
>I have owned PowerPC, UltraSPARC, MIPS, and ARM hardware, and I test many of
>my personal projects on at least Linux, FreeBSD, and NetBSD.
>
>There is an alternative Rust compiler, mrustc[0], which is written in
>C++ and that I have played around with to see if it could meet our
>needs.  I've been very busy lately and haven't had the time to test it out fully, and
>although it will likely require some upstream changes for static libraries and a
>compatibility wrapper because its minicargo is very limited in functionality, it might
>be an option that we could leverage.  There will necessarily be work on Rust
>upstream as well, but I'm hoping that mrustc will at least open doors for us.
>
>I also think that Rust is becoming a more and more common language in technology
>because of its interoperability with C and its memory safety.
>The support policy I wrote up explains why there is an increasing push from
>governments, security professionals, and the technology industry for memory-safe
>languages.  If Git is to continue its success and broad adoption, we don't want it to
>be labelled software that is using security anti-patterns, and we also don't want it to
>be a CVE factory like libxml2 or ImageMagick.  This is the reason I ultimately started
>work on the SHA-256 project many years ago: I knew we'd need to do it for security
>reasons and that without a more secure hash algorithm, Git would eventually be
>dropped.
>
>My hope is that NonStop can find some way to support Rust because I think it's a
>compelling language and NonStop would greatly benefit from the wider variety of
>software available.  My sense of previous discussions was that we do very much
>want NonStop to continue to come along as we support Rust in Git and that if there
>are ways we make it easier for both, we'd want to do that.  That's certainly my view,
>at least.
>
>[0] https://github.com/thepowersgang/mrustc

I appreciate the encouragement, Brian. I have been trying to port Rust (and Go)
for years, without success on the platform. It is only POSIX, but not Linux, which
seems to be the requirement to do almost anything anymore.

I gave mrustc a try before. It appears to require GCC, which does not port to NonStop.
If you have any build recipes that work with c11, that would be helpful. We are
expecting c17 soon. I can run c11 in c++ mode, but the makefiles seem to require
G++, which is part of GCC.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23 18:05       ` Junio C Hamano
@ 2025-08-23 20:29         ` Ezekiel Newren
  2025-08-25 19:16         ` Elijah Newren
  1 sibling, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-23 20:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones

On Sat, Aug 23, 2025 at 12:05 PM Junio C Hamano <gitster@pobox.com> wrote:
> > To address these issues, introduce a new type, ivec -- short for
> > interoperable vector. (We refer to it as 'ivec' generally, though on
> > the Rust side the struct is called IVec to match Rust style.)
>
> I however was hoping by now Rust getting used more widely, somebody
> has already created a generic "this is how you make C-array and Rust
> vectors interoperate" wrapper that latecomer projects like us can
> use without inventing our own.

I've looked for code that will do what Git needs, but as far as I know
nothing can do everything that my ivec can.

> > +INTEROP_OBJS += interop/ivec.o
> > +.PHONY: interop-objs
> > +interop-objs: $(INTEROP_OBJS)
>
> What is this phony target used for?  No other targets seem to depend
> on this one (I am wondering if we need the latter two lines).

You are correct. I will remove those lines.

> > diff --git a/interop/ivec.c b/interop/ivec.c
> > new file mode 100644
> > index 000000000000..9bc2258c04ad
> > --- /dev/null
> > +++ b/interop/ivec.c
>
> I am wondering if this needs a new hierarchy "interop"; shouldn't
> the existing "compat" be a good fit enough?  I dunno.

I had considered compat/, but I thought it didn’t fit.  I thought it
meant that the same API would exist everywhere, as opposed to “We
speak different languages, but we’ve agreed on a translator or common
protocol”.  In particular, an example from ivec, a function on both
the C and Rust sides:

    void ivec_extend_from_slice(void *_self, void const *ptr, usize size);
    pub fn extend_from_slice(&mut self, slice: &[T]) where T: Clone,

The Rust side uses a slice or “fat” pointer, where the C side uses two
arguments (a pointer and a size) in its place.  The API is different,
even if semantically they are the same and they are interoperable.
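
A minimal sketch of that difference, with made-up function names (sum_bytes
and sum_bytes_raw are not from ivec): the Rust-facing function takes one
slice, while the C-facing one takes the pointer and the size separately and
reassembles the slice before doing the same work.

```rust
use std::slice;

/// Idiomatic Rust: one slice argument carries pointer and length together.
pub fn sum_bytes(bytes: &[u8]) -> u64 {
    bytes.iter().map(|&b| u64::from(b)).sum()
}

/// Hypothetical C-ABI entry point: the slice arrives as two arguments.
///
/// # Safety
/// `ptr` must point to `len` readable bytes, unless `len` is 0.
pub unsafe extern "C" fn sum_bytes_raw(ptr: *const u8, len: usize) -> u64 {
    if len == 0 {
        // A C caller may legitimately pass NULL for an empty range.
        return 0;
    }
    // Reassemble the fat pointer from its two halves.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    sum_bytes(bytes)
}
```

Both entry points do the same thing; only the shape of the argument list
differs.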

Was I reading too much into the meaning of compat/?  Do folks object
to using interop/?

> Even though this is a shim to somebody else's code, it still is a
> part of our codebase, so our CodingGuidelines for C programs should
> apply.

Sorry I missed those. I will fix them up.

> > @@ -0,0 +1,151 @@
> > +#include "ivec.h"
> > +
> > +static void ivec_set_capacity(void* self, usize new_capacity) {
> > +     struct rawivec *this = self;
>
>  - Asterisk sticks to the variable, not type.
>
>  - The opening and closing {braces} for the function body are
>    written at the leftmost column on its own line.
>
>  - There should be a blank line between the declarations and the
>    first statement.

I will make those changes.

> > +     if (new_capacity == 0)
> > +             FREE_AND_NULL(this->ptr);
> > +     else
> > +             this->ptr = xrealloc(this->ptr, new_capacity * this->element_size);
> > +     this->capacity = new_capacity;
> > +}
> > +
> > +void ivec_init(void* self, usize element_size) {
> > +     struct rawivec *this = self;
> > +     this->ptr = NULL;
> > +     this->length = 0;
> > +     this->capacity = 0;
> > +     this->element_size = element_size;
> > +}
>
> I notice that this reintroduces a variable named "this", which was
> eradicated in 585c0e2e (diff: rename 'this' variables, 2018-02-14).
>
> I do not think those who want to use C++ compilers on our C code
> would mind "self", so how about doing something like...
>
>
>         void ivec_init(void *self_, usize element_size)
>         {
>                 struct rawivec *self = self_;
>
>                 self->ptr = NULL;
>                 self->length = 0;
>                 self->capacity = 0;
>                 self->element_size = element_size;
>         }
>
> ... perhaps?

That sounds good. I will make those changes.

> > diff --git a/interop/ivec.h b/interop/ivec.h
> > new file mode 100644
> > index 000000000000..98be4bbeb54a
> > --- /dev/null
> > +++ b/interop/ivec.h
> > @@ -0,0 +1,52 @@
> > +#ifndef IVEC_H
> > +#define IVEC_H
> > +
> > +#include "../git-compat-util.h"
>
> As we use -I. on the command line, there is no need to add "../"
> here; just writing
>
>         #include <git-compat-util.h>
>
> should be enough.  Also, if this file does not depend on the
> services compat-util header provides (and I do not think it does
> from a brief look at its contents), it is better not to include it.

This file actually does depend on git-compat-util.h, particularly the
Rust primitive definitions (e.g. usize, u64, etc...).
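
To illustrate why the header wants those names, here is a sketch of a
Rust-side mirror of the C struct. The field order, the #[repr(C)] layout, and
the name RawIVec are assumptions for this example, not a copy of the patch;
only the field names come from ivec_init() quoted above.

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::ptr;

/// Hypothetical mirror of the C `struct rawivec`; field order is assumed.
#[repr(C)]
pub struct RawIVec {
    pub ptr: *mut c_void,
    pub length: usize,
    pub capacity: usize,
    pub element_size: usize,
}

impl RawIVec {
    /// Same effect as ivec_init(): an empty vector with no allocation yet.
    pub fn new(element_size: usize) -> Self {
        RawIVec {
            ptr: ptr::null_mut(),
            length: 0,
            capacity: 0,
            element_size,
        }
    }
}

/// With #[repr(C)], a pointer plus three usize fields has no hidden padding
/// on the usual platforms where a pointer and usize have the same size.
pub fn layout_matches_c() -> bool {
    size_of::<RawIVec>() == size_of::<*mut c_void>() + 3 * size_of::<usize>()
}
```

If the C side spells the field types as usize as well, both languages can be
read side by side without translating type names.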

I'll use the include style you mentioned.

> > diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> > index e69de29bb2d1..8b137891791f 100644
> > --- a/rust/xdiff/src/lib.rs
> > +++ b/rust/xdiff/src/lib.rs
> > @@ -0,0 +1 @@
> > +
>
> If this empty line in an otherwise empty file is absolutely
> necessary to make Rust work, then please arrange .gitattributes to
> tell git that this file is exempt from the usual blank-at-eof
> whitespace rule we use.  If not, remove that unnecessary empty line.
>
> Or perhaps remove the file altogether if nobody looks at it???
>
> In any case, given that our top-level .gitattributes file starts
> with
>
>     * whitespace=!indent,trail,space
>     *.[ch] whitespace=indent,trail,space diff=cpp
>     *.sh whitespace=indent,trail,space text eol=lf
>     ...
>     *.bat text eol=crlf
>     CODE_OF_CONDUCT.md -whitespace
>     ...
>
> I think a new rule to cover "*.rs" and perhaps *.toml files right
> before rules for each specific file begin would be in order.

This is only a problem for empty files, and we only have empty .rs
files to set up the basic Rust build in this commit. It's not a problem
later in this series and shouldn't be a problem in the future. I will
remove those blank lines from those commits.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 19:24               ` brian m. carlson
  2025-08-23 20:04                 ` rsbecker
@ 2025-08-23 20:36                 ` Sam James
  2025-08-23 21:17                 ` Haelwenn (lanodan) Monnier
  2 siblings, 0 replies; 198+ messages in thread
From: Sam James @ 2025-08-23 20:36 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Elijah Newren, rsbecker, Kristoffer Haugsbakk, Josh Soref, git,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2025-08-23 at 18:30:26, Elijah Newren wrote:
>> I don't think that's fair.  A quick reminder on the history: There was
>> lots of excitement about potentially introducing Rust two years ago at
>> our virtual Git contributors conference.  Taylor formally proposed
>> adopting it on the mailing list a year and a half ago.  And at Git
>> Merge last year, among those in attendance, there was broad
>> significant interest in adopting Rust with unanimous support for
>> letting it move forward among those that were present (which, yes, we
>> know wasn't everyone).  And there's the three rounds so far of this
>> patch series.  At every discussion where you weren't present, someone
>> else would always bring up you and NonStop, and point out how you've
>> been a very positive long-term member of the Git community and how
>> Rust adoption would likely negatively affect you, which would be
>> regrettable.  We waited years to adopt Rust precisely (and I believe
>> solely) because of your objections.  Josh and Calvin even went the
>> route of making optional not-even-built-by-default Rust libraries
>> (libgit-rs and libgit-sys) when they wanted to add some Rust bindings.
>> If years of deference by other community members isn't considered
>> taking you seriously, I don't know what is.
>> 
>> I agree that it is disappointing that there isn't a clear way to both
>> gain the compelling advantages of Rust while also retaining the full
>> current extent of our widespread platform support.  It's doubly
>> unfortunate since you're such a positive contributing member of the
>> community.  But not allowing us to ever gain the advantages of Rust is
>> problematic too.  So, a decision has to be made, one way or the other.
>
> I think it's worth saying that I do appreciate your (Randall's) positive
> contributions as well and I would love some way to continue to support
> NonStop as we adopt Rust.  To be clear, I care deeply about portability:
> I have owned PowerPC, UltraSPARC, MIPS, and ARM hardware, and I test
> many of my personal projects on at least Linux, FreeBSD, and NetBSD.
>
> There is an alternative Rust compiler, mrustc[0], which is written in
> C++ and that I have played around with to see if it could meet our
> needs.

As far as I'm aware, mrustc is intended purely for having a bootstrap
path to rustc, not to be a full blown Rust implementation.

We discussed the other options in this area in
https://lore.kernel.org/git/874iv4gqxv.fsf@gentoo.org/ and Patrick's
reply.

I still think it's dubious to move to something where there's only one
implementation (and an implementation that moves very fast) when
currently we go to pains to support even incomplete C compilers! See the
"test balloon" for 'bool'.

> [...]

sam

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 19:24               ` brian m. carlson
  2025-08-23 20:04                 ` rsbecker
  2025-08-23 20:36                 ` Sam James
@ 2025-08-23 21:17                 ` Haelwenn (lanodan) Monnier
  2 siblings, 0 replies; 198+ messages in thread
From: Haelwenn (lanodan) Monnier @ 2025-08-23 21:17 UTC (permalink / raw)
  To: brian m. carlson, Elijah Newren, rsbecker, Kristoffer Haugsbakk,
	Josh Soref, git, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Johannes Schindelin, Matthias Aßhauer,
	Patrick Steinhardt, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

[2025-08-23 19:24:33+0000] brian m. carlson:
>On 2025-08-23 at 18:30:26, Elijah Newren wrote:
>> I don't think that's fair.  A quick reminder on the history: There was
>> lots of excitement about potentially introducing Rust two years ago at
>> our virtual Git contributors conference.  Taylor formally proposed
>> adopting it on the mailing list a year and a half ago.  And at Git
>> Merge last year, among those in attendance, there was broad
>> significant interest in adopting Rust with unanimous support for
>> letting it move forward among those that were present (which, yes, we
>> know wasn't everyone).  And there's the three rounds so far of this
>> patch series.  At every discussion where you weren't present, someone
>> else would always bring up you and NonStop, and point out how you've
>> been a very positive long-term member of the Git community and how
>> Rust adoption would likely negatively affect you, which would be
>> regrettable.  We waited years to adopt Rust precisely (and I believe
>> solely) because of your objections.  Josh and Calvin even went the
>> route of making optional not-even-built-by-default Rust libraries
>> (libgit-rs and libgit-sys) when they wanted to add some Rust bindings.
>> If years of deference by other community members isn't considered
>> taking you seriously, I don't know what is.
>>
>> I agree that it is disappointing that there isn't a clear way to both
>> gain the compelling advantages of Rust while also retaining the full
>> current extent of our widespread platform support.  It's doubly
>> unfortunate since you're such a positive contributing member of the
>> community.  But not allowing us to ever gain the advantages of Rust is
>> problematic too.  So, a decision has to be made, one way or the other.
>
>I think it's worth saying that I do appreciate your (Randall's) positive
>contributions as well and I would love some way to continue to support
>NonStop as we adopt Rust.  To be clear, I care deeply about portability:
>I have owned PowerPC, UltraSPARC, MIPS, and ARM hardware, and I test
>many of my personal projects on at least Linux, FreeBSD, and NetBSD.
>
>There is an alternative Rust compiler, mrustc[0], which is written in
>C++ and that I have played around with to see if it could meet our
>needs.  I've been very busy lately and haven't had the time to test it
>out fully, and although it will likely require some upstream changes for
>static libraries and a compatibility wrapper because its minicargo is
>very limited in functionality, it might be an option that we could
>leverage.  There will necessarily be work on Rust upstream as well, but
>I'm hoping that mrustc will at least open doors for us.
>
>I also think that Rust is becoming a more and more common language in
>technology because of its interoperability with C and its memory safety.
>The support policy I wrote up explains why there is an increasing push
>from governments, security professionals, and the technology industry
>for memory-safe languages.  If Git is to continue its success and broad
>adoption, we don't want it to be labelled software that is using
>security anti-patterns, and we also don't want it to be a CVE factory
>like libxml2 or ImageMagick.  This is the reason I ultimately started
>work on the SHA-256 project many years ago: I knew we'd need to do it
>for security reasons and that without a more secure hash algorithm, Git
>would eventually be dropped.
>
>My hope is that NonStop can find some way to support Rust because I
>think it's a compelling language and NonStop would greatly benefit from
>the wider variety of software available.  My sense of previous
>discussions was that we do very much want NonStop to continue to come
>along as we support Rust in Git and that if there are ways we make it
>easier for both, we'd want to do that.  That's certainly my view, at
>least.
>
>[0] https://github.com/thepowersgang/mrustc
>-- 
>brian m. carlson (they/them)
>Toronto, Ontario, CA

Hello,

mrustc isn't really an alternative compiler, it only serves
to bootstrap rustc+cargo from source code rather than binaries,
you can't really use it to compile arbitrary Rust code.

You'd still need to port LLVM and rustc.

gccrs would be more of an alternative compiler, but it still seems
to have a long road ahead of it: https://rust-gcc.github.io/

Best regards

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
                         ` (2 preceding siblings ...)
  2025-08-23 18:05       ` Junio C Hamano
@ 2025-08-24 13:31       ` Ben Knoble
  2025-08-25 20:40         ` Ezekiel Newren
  3 siblings, 1 reply; 198+ messages in thread
From: Ben Knoble @ 2025-08-24 13:31 UTC (permalink / raw)
  To: Ezekiel Newren via GitGitGadget
  Cc: git, Elijah Newren, brian m. carlson, Taylor Blau,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ramsay Jones, Ezekiel Newren


> On 22 Aug 2025, at 23:56, Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
> 
> From: Ezekiel Newren <ezekielnewren@gmail.com>
> 
> Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
> wrapper functions) in Rust is painful because:
> 
>  * C doing vector things the Rust way would require wrapper functions,
>    and Rust doing vector things the C way would require wrapper
>    functions, so ivec was created to ensure a consistent contract
>    between the 2 languages for how to manipulate a vector.
>  * Currently, Rust defines its own 'Vec' type that is generic, but its
>    memory allocator and struct layout weren't designed for
>    interoperability with C (or any language for that matter), meaning
>    that the C side cannot push to or expand a 'Vec' without defining
>    wrapper functions in Rust that C can call. Without special care,
>    the two languages might use different allocators (malloc/free on
>    the C side, and possibly something else in Rust), which would make
>    it difficult for a function in one language to free elements
>    allocated by a call from a function in the other language.
>  * Similarly, git defines ALLOC_GROW() and related macros in
>    git-compat-util.h. While we could add functions allowing Rust to
>    invoke something similar to those macros, passing three variables
>    (pointer, length, allocated_size) instead of a single variable
>    (vector) across the language boundary requires more cognitive
>    overhead for readers to keep track of and makes it easier to make
>    mistakes. Further, for low-level components that we want to
>    eventually convert to pure Rust, such triplets would feel very out
>    of place.

I’m mildly surprised Vec isn’t a good fit: isn’t it a pointer, length, capacity triple? But it sounds like the main issue is allocator interop… which I would also have thought was supported? At least the current version is documented as being generic against an Allocator, too.

> 
> To address these issues, introduce a new type, ivec -- short for
> interoperable vector. (We refer to it as 'ivec' generally, though on
> the Rust side the struct is called IVec to match Rust style.)  This new
> type is specifically designed for FFI purposes, so that both languages
> handle the vector in the same way, though it could be used on either
> side independently. This type is designed such that it can easily be
> replaced by a standard Rust 'Vec' once interoperability is no longer a
> concern.

Am I reading the patch correctly that the ivec implementation is primarily C? I’m not familiar with too many FFI projects in Rust, but I might have hoped we could write parts in Rust to gain any benefits from that, too. Is that a fool’s errand I’m thinking of?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification
  2025-08-19  2:00     ` Elijah Newren
@ 2025-08-24 16:52       ` Patrick Steinhardt
  0 siblings, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-08-24 16:52 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Ramsay Jones, Ezekiel Newren via GitGitGadget, git

On Mon, Aug 18, 2025 at 07:00:16PM -0700, Elijah Newren wrote:
> On Fri, Aug 15, 2025 at 8:10 AM Ramsay Jones
> <ramsay@ramsayjones.plus.com> wrote:
> >
> > On 15/08/2025 02:22, Ezekiel Newren via GitGitGadget wrote:
> > > Changes in this second round of this RFC:
> > >
> > >  * Now builds and passes tests on all platforms (example run:
> > >    https://github.com/ezekielnewren/git/actions/runs/16974821401). Special
> > >    thanks to Johannes Schindelin for patches to things for Windows and
> > >    linux32.
> >
> > Hmm, builds on *all* platforms may be a bit optimistic (it doesn't on
> > cygwin, for instance), so I'm guessing you mean all platforms which
> > have CI defined. Perhaps you could mention the platforms which you
> > have tested on. :)
> 
> Ezekiel says this email didn't show up in his inbox (no idea why), but
> yes what was meant was all platforms where gitgitgadget CI runs.  If
> you follow the github.com link in the text that you quoted, you can
> see all those platforms (various windows flavors, various osx builds,
> musl, sparse, static analysis, etc.).

I have had some patches sitting around for a long while that
implement CI via MSYS2 in different environments. They work with both
MSYS and MinGW, which I think are somewhat related to Cygwin? If it
would prove useful, I could polish this patch series and send it
upstream.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-23 18:05       ` Junio C Hamano
  2025-08-23 20:29         ` Ezekiel Newren
@ 2025-08-25 19:16         ` Elijah Newren
  2025-08-26  5:40           ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-08-25 19:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ezekiel Newren via GitGitGadget, git, brian m. carlson,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones,
	Ezekiel Newren

On Sat, Aug 23, 2025 at 11:05 AM Junio C Hamano <gitster@pobox.com> wrote:
> > diff --git a/interop/ivec.c b/interop/ivec.c
> > new file mode 100644
> > index 000000000000..9bc2258c04ad
> > --- /dev/null
> > +++ b/interop/ivec.c
>
> Even though this is a shim to somebody else's code, it still is a
> part of our codebase, so our CodingGuidelines for C programs should
> apply.

Sorry, I should have caught these in my preliminary review before he
sent this off to the list.  One question, though...

> > diff --git a/interop/ivec.h b/interop/ivec.h
> > new file mode 100644
> > index 000000000000..98be4bbeb54a
> > --- /dev/null
> > +++ b/interop/ivec.h
> > @@ -0,0 +1,52 @@
> > +#ifndef IVEC_H
> > +#define IVEC_H
> > +
> > +#include "../git-compat-util.h"
>
> As we use -I. on the command line, there is no need to add "../"
> here; just writing
>
>         #include <git-compat-util.h>
>
> should be enough.  Also, if this file does not depend on the
> services compat-util header provides (and I do not think it does
> from a brief look at its contents), it is better not to include it.

Should this rather be

   #include "git-compat-util.h"

with quotes rather than angle brackets?  In particular:

$ git grep include.*git-compat-util -- '*.[ch]' | wc -l
362
$ git grep include.*git-compat-util -- '*/*.[ch]' | wc -l
125

So, we have 362 includes of git-compat-util.h in our codebase, 125
from subdirectories.  Of those:

$ git grep include.*git-compat-util -- '*.[ch]' | grep '"' | wc -l
361
$ git grep include.*git-compat-util -- '*.[ch]' | grep '<' | wc -l
1

Only one of these include statements uses angle brackets -- the
compiler-tricks/not-constant.c file (which appears to be a temporary
hack that we'll eventually delete).  I had always assumed <> were for
system includes and "" for project includes, but a quick Google search
shows the actual situation is quite a bit murkier than I'd realized.
Still, our current project practice appears to be double quotes; is
that fine here or are you suggesting you'd like the current project
practice to be changed?

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-24 13:31       ` Ben Knoble
@ 2025-08-25 20:40         ` Ezekiel Newren
  2025-08-26 13:30           ` D. Ben Knoble
  0 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-25 20:40 UTC (permalink / raw)
  To: Ben Knoble
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ramsay Jones

On Sun, Aug 24, 2025 at 7:31 AM Ben Knoble <ben.knoble@gmail.com> wrote:
> I’m mildly surprised Vec isn’t a good fit: isn’t it a pointer, length, capacity triple? But it sounds like the main issue is allocator interop… which I would also have thought was supported? At least the current version is documented as being generic against an Allocator, too.

Conceptually yes, semantically and syntactically no. On top of Vec<T>
not being defined with #[repr(C)] (which ensures field order, C ABI
layout, padding, etc...) the struct definition for Vec isn't constant
between Rust versions. I'd be open to suggestions for an alternative
to my ivec type.

=== Rust version 1.61.0 ===
from: https://doc.rust-lang.org/1.61.0/src/alloc/vec/mod.rs.html#400
#[stable(feature = "rust1", since = "1.0.0")]
#[cfg_attr(not(test), rustc_diagnostic_item = "Vec")]
#[rustc_insignificant_dtor]
pub struct Vec<T, #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global> {
    buf: RawVec<T, A>,
    len: usize,
}

from: https://doc.rust-lang.org/1.61.0/src/alloc/raw_vec.rs.html#52
#[allow(missing_debug_implementations)]
pub(crate) struct RawVec<T, A: Allocator = Global> {
    ptr: Unique<T>,
    cap: usize,
    alloc: A,
}

=== Rust version 1.89.0 ===
from: https://doc.rust-lang.org/1.89.0/src/alloc/vec/mod.rs.html#414
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_diagnostic_item = "Vec"]
#[rustc_insignificant_dtor]
pub struct Vec<T, #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global> {
    buf: RawVec<T, A>,
    len: usize,
}

from: https://doc.rust-lang.org/1.89.0/src/alloc/raw_vec/mod.rs.html#74
#[allow(missing_debug_implementations)]
pub(crate) struct RawVec<T, A: Allocator = Global> {
    inner: RawVecInner<A>,
    _marker: PhantomData<T>,
}

from: https://doc.rust-lang.org/1.89.0/src/alloc/raw_vec/mod.rs.html#86
#[allow(missing_debug_implementations)]
struct RawVecInner<A: Allocator = Global> {
    ptr: Unique<u8>,
    /// Never used for ZSTs; it's `capacity()`'s responsibility to return usize::MAX in that case.
    ///
    /// # Safety
    ///
    /// `cap` must be in the `0..=isize::MAX` range.
    cap: Cap,
    alloc: A,
}
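
For contrast, here is a minimal sketch of what #[repr(C)] provides. The
struct name RawIVec and the helper field_offsets() are hypothetical and
only for illustration; the field names follow the "ptr, length,
capacity, element_size" description used in this thread, not
necessarily the exact definition in the patch.

```rust
use std::os::raw::c_void;
use std::ptr;

/// Hypothetical FFI-visible vector header (illustration only).
/// #[repr(C)] pins the field order and padding to what a C compiler
/// would produce for the equivalent struct, independent of the Rust
/// toolchain version.
#[repr(C)]
pub struct RawIVec {
    pub ptr: *mut c_void,
    pub length: usize,
    pub capacity: usize,
    pub element_size: usize,
}

/// Byte offset of each field from the start of the struct, computed by
/// comparing field addresses with the address of the struct itself.
pub fn field_offsets() -> [usize; 4] {
    let v = RawIVec {
        ptr: ptr::null_mut(),
        length: 0,
        capacity: 0,
        element_size: 0,
    };
    let base = &v as *const RawIVec as usize;
    [
        &v.ptr as *const _ as usize - base,
        &v.length as *const _ as usize - base,
        &v.capacity as *const _ as usize - base,
        &v.element_size as *const _ as usize - base,
    ]
}
```

With #[repr(C)] the four fields sit in declaration order, one
pointer-sized word apart. Without the attribute the compiler may
reorder them, so a C header could not safely mirror the struct.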

> Am I reading the patch correctly that the ivec implementation is primarily C? I’m not familiar with too many FFI projects in Rust, but I might have hoped we could write parts in Rust to gain any benefits from that, too. Is that a fool’s errand I’m thinking of?

The ivec type is defined and implemented in C (interop/ivec.[ch]) and
Rust (rust/interop/src/ivec.rs). When I started writing the ivec type
I didn't know if the Git community would accept a hard dependency on
Rust, so I made ivec usable in C without needing Rust.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-25 19:16         ` Elijah Newren
@ 2025-08-26  5:40           ` Junio C Hamano
  0 siblings, 0 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-26  5:40 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Ezekiel Newren via GitGitGadget, git, brian m. carlson,
	Taylor Blau, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ben Knoble, Ramsay Jones,
	Ezekiel Newren

Elijah Newren <newren@gmail.com> writes:

>> > +#include "../git-compat-util.h"
>>
>> As we use -I. on the command line, there is no need to add "../"
>> here; just writing
>>
>>         #include <git-compat-util.h>
>>
>> should be enough.  Also, if this file does not depend on the
>> services compat-util header provides (and I do not think it does
>> from a brief look at its contents), it is better not to include it.
>
> Should this rather be
>
>    #include "git-compat-util.h"

I meant <>; when "" included header is not found, it falls back as
if it were <> included, IIRC, so writing <> when you specify exactly
where your headers are with -I. avoids such unnecessary fallback in
theory, but as both <> and "" search for implementation-defined
places, the distinction does not make much practical difference.

> Still, our current project practice appears to be double quotes; is
> that fine here or are you suggesting you'd like the current project
> practice to be changed?

It would be nice if we could do so, but I do not think it is worth
the patch churn.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-25 20:40         ` Ezekiel Newren
@ 2025-08-26 13:30           ` D. Ben Knoble
  2025-08-26 18:47             ` Ezekiel Newren
  0 siblings, 1 reply; 198+ messages in thread
From: D. Ben Knoble @ 2025-08-26 13:30 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ramsay Jones

On Mon, Aug 25, 2025 at 4:40 PM Ezekiel Newren <ezekielnewren@gmail.com> wrote:
>
> On Sun, Aug 24, 2025 at 7:31 AM Ben Knoble <ben.knoble@gmail.com> wrote:
> > I’m mildly surprised Vec isn’t a good fit: isn’t it a pointer, length, capacity triple? But it sounds like the main issue is allocator interop… which I would also have thought was supported? At least the current version is documented as being generic against an Allocator, too.
>
> Conceptually yes, semantically and syntactically no. On top of Vec<T>
> not being defined with #[repr(C)] (which ensures field order, C ABI
> layout, padding, etc...) the struct definition for Vec isn't constant
> between Rust versions. I'd be open to suggestions for an alternative
> to my ivec type.

Ah, thanks—I had forgotten about the #[repr(C)] needs and changes. Makes sense.

> > Am I reading the patch correctly that the ivec implementation is primarily C? I’m not familiar with too many FFI projects in Rust, but I might have hoped we could write parts in Rust to gain any benefits from that, too. Is that a fool’s errand I’m thinking of?
>
> The ivec type is defined and implemented in C (interop/ivec.[ch]) and
> Rust (rust/interop/src/ivec.rs). When I started writing the ivec type
> I didn't know if the Git community would accept a hard dependency on
> Rust, so I made ivec usable in C without needing Rust.

Right—I saw both implementations, but it looked like C did most of the
work, which was my main question. Re-reading, it looks like Rust does
more work than I thought (with implementations of insert/push/etc.)

That said, I think it's sensible to leave the type useable from just C
unless/until Rust becomes required (and then we can move things over).

Thanks!

-- 
D. Ben Knoble

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-26 13:30           ` D. Ben Knoble
@ 2025-08-26 18:47             ` Ezekiel Newren
  2025-08-26 22:01               ` brian m. carlson
  0 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-08-26 18:47 UTC (permalink / raw)
  To: D. Ben Knoble
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	brian m. carlson, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ramsay Jones

On Tue, Aug 26, 2025 at 7:30 AM D. Ben Knoble <ben.knoble@gmail.com> wrote:
> > > Am I reading the patch correctly that the ivec implementation is primarily C? I’m not familiar with too many FFI projects in Rust, but I might have hoped we could write parts in Rust to gain any benefits from that, too. Is that a fool’s errand I’m thinking of?
> >
> > The ivec type is defined and implemented in C (interop/ivec.[ch]) and
> > Rust (rust/interop/src/ivec.rs). When I started writing the ivec type
> > I didn't know if the Git community would accept a hard dependency on
> > Rust, so I made ivec usable in C without needing Rust.
>
> Right—I saw both implementations, but it looked like C did most of the
> work, which was my main question. Re-reading, it looks like Rust does
> more work than I thought (with implementations of insert/push/etc.)
>
> That said, I think it's sensible to leave the type useable from just C
> unless/until Rust becomes required (and then we can move things over).

I like your idea of implementation consolidation. I just don't know
what that would look like yet.

It's not straightforward because C doesn't have generics. I'll use
IVec as an example, but this applies to any generic type in Rust. A
function like push() in IVec<T> will have N definitions if there are
N IVec types; e.g., if your code uses IVec<u64>, IVec<u8>, and
IVec<i32>, then pub fn push(&mut self) {} compiles to 3 functions. If
you don't use #[no_mangle], you'd have to figure out the Rust
compiler's exact name mangling to call those functions from C, which
isn't stable or easily predictable. If you do use #[no_mangle], the
symbol has a single fixed name, so the Rust compiler can't generate a
separate copy of the generic function for each type.
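
To make the generics problem concrete, here is a rough sketch of the
usual workaround: one non-generic, uniquely named wrapper per element
type, generated by a macro. It uses Rust's own Vec rather than IVec to
stay self-contained, and every name in it (push_generic, export_push,
demo_vec_push_*) is made up for illustration and is not part of the
patch.

```rust
use std::os::raw::c_void;

/// Generic helper: the compiler emits one copy per concrete T used.
fn push_generic<T>(v: &mut Vec<T>, value: T) -> usize {
    v.push(value);
    v.len()
}

// #[no_mangle] gives a function exactly one symbol name, so it is only
// useful on non-generic functions. Each element type therefore gets its
// own wrapper with its own name.
macro_rules! export_push {
    ($name:ident, $t:ty) => {
        /// Safety: `v` must point to a live Vec of the matching element
        /// type that was created on the Rust side.
        #[no_mangle]
        pub unsafe extern "C" fn $name(v: *mut c_void, value: $t) -> usize {
            let v = unsafe { &mut *(v as *mut Vec<$t>) };
            push_generic(v, value)
        }
    };
}

// Three element types means three exported symbols.
export_push!(demo_vec_push_u8, u8);
export_push!(demo_vec_push_i32, i32);
export_push!(demo_vec_push_u64, u64);
```

The C side would then declare and call demo_vec_push_i32() and friends
by name, which works but means the list of exported wrappers has to
grow with every new element type.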

Another problem is that the functions in ivec mostly deal with
resizing the memory rather than controlling access to memory for the C
side. Even if the C side used Rust defined functions, that wouldn't
solve memory access issues to the pointer on the C side. We could
enforce access to each element by requiring C to call a Rust defined
function for each element, but that sounds very painful and slow. ivec
is meant to be used as a scaffolding type to help transition C to
Rust.

Other projects that do use Rust's builtin Vec (or some other
collection type) often Box it and write wrapper functions. This means
that the C side sees an opaque void* instead of a transparent struct
like ivec with ptr, length, capacity, and element_size.
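
A minimal sketch of that boxing pattern, assuming a Vec<u8> payload;
the demo_bytes_* function names are hypothetical and not taken from any
particular project:

```rust
use std::os::raw::c_void;

/// Allocate a Vec<u8> on the Rust heap and hand C an opaque handle.
#[no_mangle]
pub extern "C" fn demo_bytes_new() -> *mut c_void {
    Box::into_raw(Box::new(Vec::<u8>::new())) as *mut c_void
}

/// Safety: `handle` must come from demo_bytes_new() and not be freed yet.
#[no_mangle]
pub unsafe extern "C" fn demo_bytes_push(handle: *mut c_void, byte: u8) {
    let v = unsafe { &mut *(handle as *mut Vec<u8>) };
    v.push(byte);
}

/// Safety: `handle` must come from demo_bytes_new() and not be freed yet.
#[no_mangle]
pub unsafe extern "C" fn demo_bytes_len(handle: *const c_void) -> usize {
    let v = unsafe { &*(handle as *const Vec<u8>) };
    v.len()
}

/// Give ownership back to Rust so the Vec is dropped by Rust's allocator.
/// Safety: `handle` must come from demo_bytes_new(); it is invalid afterwards.
#[no_mangle]
pub unsafe extern "C" fn demo_bytes_free(handle: *mut c_void) {
    if !handle.is_null() {
        drop(unsafe { Box::from_raw(handle as *mut Vec<u8>) });
    }
}
```

Every access from C goes through one of these calls, so the C side
never touches the buffer directly. That is the trade-off against a
transparent struct like ivec.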

I'm curious if the community has more design feedback, or suggestions
for an alternative to my ivec type.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust
  2025-08-26 18:47             ` Ezekiel Newren
@ 2025-08-26 22:01               ` brian m. carlson
  0 siblings, 0 replies; 198+ messages in thread
From: brian m. carlson @ 2025-08-26 22:01 UTC (permalink / raw)
  To: Ezekiel Newren
  Cc: D. Ben Knoble, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Taylor Blau, Christian Brabandt, Phillip Wood,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, Ramsay Jones

[-- Attachment #1: Type: text/plain, Size: 898 bytes --]

On 2025-08-26 at 18:47:06, Ezekiel Newren wrote:
> I'm curious if the community has more design feedback, or suggestions
> for an alternative to my ivec type.

I think this is a fine approach.  We used a similar Vec<u8>-like
structure when porting a service from C to Rust at $DAYJOB, but we've
also used the boxing approach with good success.

Ultimately, I think boxing is the right choice when we don't need access
from C.  For instance, if I were to convert the loose object mapping
code to Rust, I would store the data as a pair of Box<HashMap<_, _>> and
then attach them to struct repository as `void *`.

But it's less definitive when you need access from C.  Because of the
way we intimately mess with the internals of our data structures in this
codebase, the ivec is probably the right choice here, so I'd keep that.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-23 18:30             ` Elijah Newren
  2025-08-23 19:24               ` brian m. carlson
@ 2025-08-27  1:57               ` Taylor Blau
  2025-08-27 14:39                 ` rsbecker
  1 sibling, 1 reply; 198+ messages in thread
From: Taylor Blau @ 2025-08-27  1:57 UTC (permalink / raw)
  To: Elijah Newren
  Cc: rsbecker, Kristoffer Haugsbakk, Josh Soref, git, brian m. carlson,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Patrick Steinhardt, Sam James, Collin Funk,
	Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

On Sat, Aug 23, 2025 at 11:30:26AM -0700, Elijah Newren wrote:
> > The assertion in the policy that Rust is easily interoperable is incorrect.
>
> Are you mixing up interoperability with portability?  Without further
> context than your email provides, it appears so to me.  Rust code can
> call C code and vice-versa within the same process without huge
> amounts of serializing and deserializing of data structures, and
> without what amounts to something close to an operating system context
> switch in order to ensure call stacks are as expected for the language
> in question.  To me, that means we can call the two languages easily
> interoperable.  On the other hand, portability of those languages is
> about whether those languages have compilers supported on various
> hardware platforms.  The document explicitly calls out that fewer
> systems have a Rust compiler than have a C compiler, and that Rust
> adoption would thus reduce how portable Git is.  Are you referring to
> this lower portability that the document itself also calls out, or are
> you pointing out additional issues with interoperation between the
> languages on a platform where compilers for both languages exist?  If
> the latter, could you provide more details?

I think that this is the main point from my point of view. Yes, we are
strictly worsening the project's portability by adding Rust as a
non-optional build component. But it is *not* the case that two Git
clients (one hypothetical one built with Rust components, one existing
one without) can't work on the same Git repository, even including one
on the same machine.

Forgetting Rust for a moment, I don't think it is a realistic goal to
have support for all platforms that could possibly want to run Git. I
would imagine that there are platforms today that cannot run the latest
and greatest version of Git for just that reason. My hope is that
whatever version(s) *are* compatible with those platforms are good
enough to support the workflows that those users need.

So my personal feeling is that that (not having a 100% portable version
of Git across all possible platforms) is OK. But of course that does
raise the concern that security fixes will be more difficult to backport
across a hypothetical version boundary where Rust is introduced.

To that end, I would note a couple of things:

 - This assumes that the Rust code has the same security vulnerabilities
   as the C code that it replaces. I don't think that is a given
   whatsoever, and I would bet that empirically there are fewer such
   vulnerabilities on the Rust side than on the C one (in fact, that is
   one of the reasons that we are considering Rust in the first place;
   brian m. carlson explains this point quite well IMHO).

 - If there *is* a security vulnerability in the Rust code that also
   presents a vulnerability on the corresponding C side, I would hope
   that the project's track record of generously backporting security
   fixes would suggest that we would do so in this case as well, despite
   crossing a language boundary.

   On the other side of that coin, if there is a security vulnerability
   in an older version of Git that isn't present in a newer one
   (regardless of whether or not Rust is involved), I would imagine that
   we would write security patches against an even earlier maint-
   branch and forward-port them up to the most recent vulnerable
   version.

So my impression is that the main contention here is a concern that
worsening the portability will make it harder to push out security fixes
in either direction. But I don't think that's necessarily the case. Even
if it is, I would again hope that the track record of the folks on the
git-security list would suggest that we'd do the right thing and not
abandon users on older platforms the moment Rust is introduced into the
codebase.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27  1:57               ` Taylor Blau
@ 2025-08-27 14:39                 ` rsbecker
  2025-08-27 17:06                   ` Junio C Hamano
  0 siblings, 1 reply; 198+ messages in thread
From: rsbecker @ 2025-08-27 14:39 UTC (permalink / raw)
  To: 'Taylor Blau', 'Elijah Newren'
  Cc: 'Kristoffer Haugsbakk', 'Josh Soref', git,
	'brian m. carlson', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On August 26, 2025 9:58 PM, Taylor Blau wrote:
>On Sat, Aug 23, 2025 at 11:30:26AM -0700, Elijah Newren wrote:
>> > The assertion in the policy that Rust is easily interoperable is incorrect.
>>
>> Are you mixing up interoperability with portability?  Without further
>> context than your email provides, it appears so to me.  Rust code can
>> call C code and vice-versa within the same process without huge
>> amounts of serializing and deserializing of data structures, and
>> without what amounts to something close to an operating system context
>> switch in order to ensure call stacks are as expected for the language
>> in question.  To me, that means we can call the two languages easily
>> interoperable.  On the other hand, portability of those languages is
>> about whether those languages have compilers supported on various
>> hardware platforms.  The document explicitly calls out that fewer
>> systems have a Rust compiler than have a C compiler, and that Rust
>> adoption would thus reduce how portable Git is.  Are you referring to
>> this lower portability that the document itself also calls out, or are
>> you pointing out additional issues with interoperation between the
>> languages on a platform where compilers for both languages exist?  If
>> the latter, could you provide more details?
>
>I think that this is the main point from my point of view. Yes, we are strictly
>worsening the project's portability by adding Rust as a non-optional build
>component. But it is *not* the case that two Git clients (one hypothetical one built
>with Rust components, one existing one without) can't work on the same Git
>repository, even including one on the same machine.
>
>Forgetting Rust for a moment, I don't think it is a realistic goal to have support for all
>platforms that could possibly want to run Git. I would imagine that there are
>platforms today that cannot run the latest and greatest version of Git for just that
>reason. My hope is that whatever version(s) *are* compatible with those platforms
>are good enough to support the workflows that those users need.
>
>So my personal feeling is that that (not having a 100% portable version of Git across
>all possible platforms) is OK. But of course that does raise the concern that security
>fixes will be more difficult to backport across a hypothetical version boundary
>where Rust is introduced.
>
>To that end, I would note a couple of things:
>
> - This assumes that the Rust code has the same security vulnerabilities
>   as the C code that it replaces. I don't think that is a given
>   whatsoever, and I would bet that empirically there are fewer such
>   vulnerabilities on the Rust side than on the C one (in fact, that is
>   one of the reasons that we are considering Rust in the first place;
>   brian m. carlson explains this point quite well IMHO).
>
> - If there *is* a security vulnerability in the Rust code that also
>   presents a vulnerability on the corresponding C side, I would hope
>   that the project's track record of generously backporting security
>   fixes would suggest that we would do so in this case as well, despite
>   crossing a language boundary.
>
>   On the other side of that coin, if there is a security vulnerability
>   in an older version of Git that isn't present in a newer one
>   (regardless of whether or not Rust is involved), I would imagine that
>   we would write security patches against an even earlier maint-
>   branch and forward-port them up to the most recent vulnerable
>   version.
>
>So my impression is that the main contention here is a concern that worsening the
>portability will make it harder to push out security fixes in either direction. But I
>don't think that's necessarily the case. Even if it is, I would again hope that the track
>record of the folks on the git-security list would suggest that we'd do the right thing
>and not abandon users on older platforms the moment Rust is introduced into the
>codebase.

This is indeed my concern and hope, Taylor, as the maintainer for a platform that is
feeling abandoned. Please note that HPE NonStop is an actively maintained and
vendor supported commercial platform based on x86_64 POSIX, just not a
Linux/Windows machine.
Thank you.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27 14:39                 ` rsbecker
@ 2025-08-27 17:06                   ` Junio C Hamano
  2025-08-27 17:15                     ` rsbecker
  2025-08-27 20:12                     ` Taylor Blau
  0 siblings, 2 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-08-27 17:06 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Taylor Blau', 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'brian m. carlson', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

<rsbecker@nexbridge.com> writes:

>>So my impression is that the main contention here is a concern that worsening the
>>portability will make it harder to push out security fixes in either direction. But I
>>don't think that's necessarily the case. Even if it is, I would again hope that the track
>>record of the folks on the git-security list would suggest that we'd do the right thing
>>and not abandon users on older platforms the moment Rust is introduced into the
>>codebase.
>
> This is indeed my concern and hope, Taylor, as the maintainer for a platform that is
> feeling abandoned. Please note that HPE NonStop is an actively maintained and
> vendor supported commercial platform based on x86_64 POSIX, just not a
> Linux/Windows machine.

Thanks for a friendly conversation, but I would have to say that
Taylor's "we know we end up having to support both, and we will do
so" way underestimates the cost to do so.  And I hope that an
actively maintained and vendor supported commercial platform would
bear the burden of the major part of that cost themselves, when it
becomes necessary to do such a dual support.

Thanks.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27 17:06                   ` Junio C Hamano
@ 2025-08-27 17:15                     ` rsbecker
  2025-08-27 20:12                     ` Taylor Blau
  1 sibling, 0 replies; 198+ messages in thread
From: rsbecker @ 2025-08-27 17:15 UTC (permalink / raw)
  To: 'Junio C Hamano'
  Cc: 'Taylor Blau', 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'brian m. carlson', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On August 27, 2025 1:06 PM, Junio C Hamano wrote:
><rsbecker@nexbridge.com> writes:
>
>>>So my impression is that the main contention here is a concern that
>>>worsening the portability will make it harder to push out security
>>>fixes in either direction. But I don't think that's necessarily the
>>>case. Even if it is, I would again hope that the track record of the
>>>folks on the git-security list would suggest that we'd do the right
>>>thing and not abandon users on older platforms the moment Rust is
>>>introduced into the codebase.
>>
>> This is indeed my concern and hope, Taylor, as the maintainer for a
>> platform that is feeling abandoned. Please note that HPE NonStop is an
>> actively maintained and vendor supported commercial platform based on
>> x86_64 POSIX, just not a Linux/Windows machine.
>
>Thanks for a friendly conversation, but I would have to say that Taylor's "we know
>we end up having to support both, and we will do so" way underestimates the
>cost to do so.  And I hope that an actively maintained and vendor supported
>commercial platform would bear the burden of the major part of that cost
>themselves, when it becomes necessary to do such a dual support.

If the platform provider had taken on git, that might be possible, but it is not
the case. This will come down to my small team to try to cope with this
situation - generally without any help from anyone else. I can do much,
but this will likely come down to my doing all of the work after hours
and on weekends with no other support, as has happened for the past
decade.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27 17:06                   ` Junio C Hamano
  2025-08-27 17:15                     ` rsbecker
@ 2025-08-27 20:12                     ` Taylor Blau
  2025-08-27 20:22                       ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Taylor Blau @ 2025-08-27 20:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: rsbecker, 'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'brian m. carlson',
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Wed, Aug 27, 2025 at 10:06:17AM -0700, Junio C Hamano wrote:
> <rsbecker@nexbridge.com> writes:
>
> >>So my impression is that the main contention here is a concern that worsening the
> >>portability will make it harder to push out security fixes in either direction. But I
> >>don't think that's necessarily the case. Even if it is, I would again hope that the track
> >>record of the folks on the git-security list would suggest that we'd do the right thing
> >>and not abandon users on older platforms the moment Rust is introduced into the
> >>codebase.
> >
> > This is indeed my concern and hope, Taylor, as the maintainer for a platform that is
> > feeling abandoned. Please note that HPE NonStop is an actively maintained and
> > vendor supported commercial platform based on x86_64 POSIX, just not a
> > Linux/Windows machine.
>
> Thanks for a friendly conversation, but I would have to say that
> Taylor's "we know we end up having to support both, and we will do
> so" way underestimates the cost to do so.

I don't mean to imply that doing so would not be costly or require
additional effort. I was trying to highlight that I believe we on
the git-security list have demonstrated a track record of supporting
quite old release tracks when new security releases are cut.

I don't mean to suggest whatsoever that adding Rust into the mix would
somehow not have an effect on the costliness of maintaining support for
older versions, just that I believe we have shown ourselves to be up to
the challenge.

(As an aside, I mentioned in my earlier email to Randall that I have a
suspicion that Rust code will have fewer security issues than C code,
and so the likelihood of needing to backport a security fix from Rust to
C seems lower to me than having to simply patch old C code. Time will
tell, I guess.)

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27 20:12                     ` Taylor Blau
@ 2025-08-27 20:22                       ` Junio C Hamano
  2025-09-02 11:16                         ` Patrick Steinhardt
  0 siblings, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-08-27 20:22 UTC (permalink / raw)
  To: Taylor Blau
  Cc: rsbecker, 'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'brian m. carlson',
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Patrick Steinhardt', 'Sam James',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Taylor Blau <me@ttaylorr.com> writes:

> (As an aside, I mentioned in my earlier email to Randall that I have a
> suspicion that Rust code will have fewer security issues than C code,
> and so the likelihood of needing to backport a security fix from Rust to
> C seems lower to me than having to simply patch old C code. Time will
> tell, I guess.)

Just like back when scripted Porcelains were rewritten in C, in 5
years, when a lot of the existing C code is rewritten, who among us
would care to backport or "simply patch" old C code?

This of course assumes that these platforms that lack Rust still
lack Rust after 5 years, yet still matter to the users, and the
vendor does not care to support Git themselves.  Maybe one of these
three conditions would change and make the problem go away ;-)



^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-08-27 20:22                       ` Junio C Hamano
@ 2025-09-02 11:16                         ` Patrick Steinhardt
  2025-09-02 11:30                           ` Sam James
  2025-09-02 17:27                           ` brian m. carlson
  0 siblings, 2 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-02 11:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'brian m. carlson', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Wed, Aug 27, 2025 at 01:22:34PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> 
> > (As an aside, I mentioned in my earlier email to Randall that I have a
> > suspicion that Rust code will have fewer security issues than C code,
> > and so the likelihood of needing to backport a security fix from Rust to
> > C seems lower to me than having to simply patch old C code. Time will
> > tell, I guess.)
> 
> Just like back when scripted Porcelains were rewritten in C, in 5
> years, when a lot of the existing C code is rewritten, who among us
> would care to backport or "simply patch" old C code?
> 
> This of course assumes that these platforms that lack Rust still
> lack Rust after 5 years, yet still matter to the users, and the
> vendor does not care to support Git themselves.  Maybe one of these
> three conditions would change and make the problem go away ;-)

It will definitely require additional maintenance by us. I think it's
reasonable to say that platforms without Rust won't get new features
anymore. But when it comes to security fixes or significant bugs I think
it's less sensible to say that they're left on their own.

I proposed this in a separate branch of these threads, but we could
counteract this by declaring the last major version before we introduce
Rust as an LTS version that will receive both security and severe bug
fixes going forward. Ideally, that LTS release would continue to be
maintained until the gcc-rs backend is ready for prime time, which
should alleviate a lot of the portability concerns.

As Pierre-Emmanuel mentioned in [1], the backend is likely to stabilize
next year. One or two years of backports for that particular LTS version
doesn't feel too bad. And if it does become more involved we can maybe
also distribute the load and rely on maintainers of impacted platforms
without Rust to help out with the backporting.

Also, all of this feels like a significant shift. I'm strongly in favor
of adopting Rust in our codebase, but I think we should do so carefully.
So we might take it extra carefully and say that Rust will become a
mandatory dependency in Git 3.0, where the last release before Git 3.0
will become an LTS release.

Patrick

[1]: <7bf054a1-0196-4ad8-aaa4-a432cd2c93a5@embecosm.com>

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-02 11:16                         ` Patrick Steinhardt
@ 2025-09-02 11:30                           ` Sam James
  2025-09-02 17:27                           ` brian m. carlson
  1 sibling, 0 replies; 198+ messages in thread
From: Sam James @ 2025-09-02 11:30 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Junio C Hamano, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'brian m. carlson', 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Aug 27, 2025 at 01:22:34PM -0700, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>> 
>> > (As an aside, I mentioned in my earlier email to Randall that I have a
>> > suspicion that Rust code will have fewer security issues than C code,
>> > and so the likelihood of needing to backport a security fix from Rust to
>> > C seems lower to me than having to simply patch old C code. Time will
>> > tell, I guess.)
>> 
>> Just like back when scripted Porcelains were rewritten in C, in 5
>> years, when a lot of the existing C code is rewritten, who among us
>> would care to backport or "simply patch" old C code?
>> 
>> This of course assumes that these platforms that lack Rust still
>> lack Rust after 5 years, yet still matter to the users, and the
>> vendor does not care to support Git themselves.  Maybe one of these
>> three conditions would change and make the problem go away ;-)
>
> It will definitely require additional maintenance by us. I think it's
> reasonable to say that platforms without Rust won't get new features
> anymore. But when it comes to security fixes or significant bugs I think
> it's less sensible to say that they're left on their own.
>
> I proposed this in a separate branch of these threads, but we could
> counteract this by declaring the last major version before we introduce
> Rust as an LTS version that will receive both security and severe bug
> fixes going forward. Ideally, that LTS release would continue to be
> maintained until the gcc-rs backend is ready for prime time, which
> should alleviate a lot of the portability concerns.
>

That would be enormously appreciated and make me happy if it is possible.

> As Pierre-Emmanuel mentioned in [1], the backend is likely to stabilize
> next year. One or two years of backports for that particular LTS version
> doesn't feel too bad. And if it does become more involved we can maybe
> also distribute the load and rely on maintainers of impacted platforms
> without Rust to help out with the backporting.

I'd be open to that if we're still in the position of needing it by then.

>
> Also, all of this feels like a significant shift. I'm strongly in favor
> of adopting Rust in our codebase, but I think we should do so carefully.
> So we might take it extra carefully and say that Rust will become a
> mandatory dependency in Git 3.0, where the last release before Git 3.0
> will become an LTS release.

This is what I was hoping for :)

It should be considered a "breaking change" in a sense (the
"compatibility profile" of git changed) and 3.0 would be fitting.

It would perhaps liberate git developers in feeling free to make other
changes while adopting Rust as well if they see fit.

>
> Patrick
>
> [1]: <7bf054a1-0196-4ad8-aaa4-a432cd2c93a5@embecosm.com>

sam

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-02 11:16                         ` Patrick Steinhardt
  2025-09-02 11:30                           ` Sam James
@ 2025-09-02 17:27                           ` brian m. carlson
  2025-09-02 18:47                             ` Sam James
  2025-09-03  5:40                             ` Patrick Steinhardt
  1 sibling, 2 replies; 198+ messages in thread
From: brian m. carlson @ 2025-09-02 17:27 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Junio C Hamano, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

[-- Attachment #1: Type: text/plain, Size: 2217 bytes --]

On 2025-09-02 at 11:16:19, Patrick Steinhardt wrote:
> As Pierre-Emmanuel mentioned in [1], the backend is likely to stabilize
> next year. One or two years of backports for that particular LTS version
> doesn't feel too bad. And if it does become more involved we can maybe
> also distribute the load and rely on maintainers of impacted platforms
> without Rust to help out with the backporting.

I'm very much in favour of supporting gccrs when it's available, but I
also want to say that it currently is targeting 1.49, which is much
older than we want.  It's also not necessarily going to be fully usable
or bug free in that amount of time.

I also want to point out that it's important that the maintainers of
affected platforms build the tooling necessary for their platforms to be
supported.  I'm not seeing ports of LLVM to those architectures or
contributions to gccrs that would make those platforms easier to
support.

> Also, all of this feels like a significant shift. I'm strongly in favor
> of adopting Rust in our codebase, but I think we should do so carefully.
> So we might take it extra carefully and say that Rust will become a
> mandatory dependency in Git 3.0, where the last release before Git 3.0
> will become an LTS release.

I'd prefer we not wait that long.  I'm doing some work in building the
new loose object mapping using Rust and it's much more efficient than
writing it in C because we don't have to sort the data when we use a
BTreeMap.  The code is much simpler, shorter, and easier to write.
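
As a minimal sketch of the BTreeMap point, assuming a made-up 4-byte
key type and hypothetical function names (this is not the actual loose
object mapping code): entries can be inserted in any order, and the map
hands them back in key order without a separate sort step.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an object ID, shortened for illustration.
type Oid = [u8; 4];

/// Build a map from one (hypothetical) object ID to another.
/// Entries may arrive in any order; the BTreeMap keeps keys ordered.
fn build_mapping(pairs: &[(Oid, Oid)]) -> BTreeMap<Oid, Oid> {
    let mut map = BTreeMap::new();
    for (from, to) in pairs {
        map.insert(*from, *to);
    }
    map
}

/// Return the keys in iteration order, which is ascending for a BTreeMap.
fn ordered_keys(map: &BTreeMap<Oid, Oid>) -> Vec<Oid> {
    map.keys().copied().collect()
}
```

A comparable C version would typically append to an array and sort it
before doing lookups, which is the extra step being avoided here.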

Nobody else is currently working on the interoperability code and we
expressed that we ideally wanted it for Git 3.0.  Being able to use Rust
means I can write that code faster, with fewer errors (and hence less
debugging time), and better tests.  Otherwise, I'm afraid that it will
take longer and we might not have it fully upstream for Git 3.0.

We also have this series right now, which we'd have to abandon if we're
not going to support Rust right away.  I'd like to retain Ezekiel as a
contributor and incorporate Rust, and I think the best time to adopt
Rust is now, not at Git 3.0.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-02 17:27                           ` brian m. carlson
@ 2025-09-02 18:47                             ` Sam James
  2025-09-03 18:22                               ` Collin Funk
  2025-09-03  5:40                             ` Patrick Steinhardt
  1 sibling, 1 reply; 198+ messages in thread
From: Sam James @ 2025-09-02 18:47 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Patrick Steinhardt, Junio C Hamano, Taylor Blau, rsbecker,
	'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2025-09-02 at 11:16:19, Patrick Steinhardt wrote:
>> As Pierre-Emmanuel mentioned in [1], the backend is likely to stabilize
>> next year. One or two years of backports for that particular LTS version
>> doesn't feel too bad. And if it does become more involved we can maybe
>> also distribute the load and rely on maintainers of impacted platforms
>> without Rust to help out with the backporting.
>
> I'm very much in favour of supporting gccrs when it's available, but I
> also want to say that it currently is targeting 1.49, which is much
> older than we want.  It's also not necessarily going to be fully usable
> or bug free in that amount of time.
>
> I also want to point out that it's important that the maintainers of
> affected platforms build the tooling necessary for their platforms to be
> supported.  I'm not seeing ports of LLVM to those architectures or
> contributions to gccrs that would make those platforms easier to
> support.

This isn't accurate. gccrs doesn't need particular porting to arches: at
least not yet, and if it does, it'll be very minor; any changes of this
sort will be in crates themselves which would go upstream.

As for the libgccjit-based backend for rustc, see
https://github.com/rust-lang/rustc_codegen_gcc/issues/49,
https://github.com/rust-lang/rustc_codegen_gcc/issues/744, and
https://github.com/rust-lang/rustc_codegen_gcc/issues/742 for discussion
and complications. But to say that nobody is doing it or working towards
it is inaccurate.

>
>> Also, all of this feels like a significant shift. I'm strongly in favor
>> of adopting Rust in our codebase, but I think we should do so carefully.
>> So we might take it extra carefully and say that Rust will become a
>> mandatory dependency in Git 3.0, where the last release before Git 3.0
>> will become an LTS release.
>
> I'd prefer we not wait that long.  I'm doing some work in building the
> new loose object mapping using Rust and it's much more efficient than
> writing it in C because we don't have to sort the data when we use a
> BTreeMap.  The code is much simpler, shorter, and easier to write.
>

I still think adopting Rust is a compatibility break and a "breaking
change". Again, keeping in mind that for adopting C99 features (!), the
Git project used "test balloons" very very recently.

> Nobody else is currently working on the interoperability code and we
> expressed that we ideally wanted it for Git 3.0.  Being able to use Rust
> means I can write that code faster, with fewer errors (and hence less
> debugging time), and better tests.  Otherwise, I'm afraid that it will
> take longer and we might not have it fully upstream for Git 3.0.
>
> We also have this series right now, which we'd have to abandon if we're
> not going to support Rust right away.  I'd like to retain Ezekiel as a
> contributor and incorporate Rust, and I think the best time to adopt
> Rust is now, not at Git 3.0.

I think there's going to be various issues that arise even on platforms
that support Rust that would make it fitting for Git 3.0, at least for
the first few releases that incorporate Rust. I'll note that the series
isn't currently using Meson's Rust integration as QEMU is doing.

sam

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-02 17:27                           ` brian m. carlson
  2025-09-02 18:47                             ` Sam James
@ 2025-09-03  5:40                             ` Patrick Steinhardt
  2025-09-03 16:22                               ` Ramsay Jones
                                                 ` (2 more replies)
  1 sibling, 3 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-03  5:40 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Tue, Sep 02, 2025 at 05:27:10PM +0000, brian m. carlson wrote:
> On 2025-09-02 at 11:16:19, Patrick Steinhardt wrote:
> > As Pierre-Emmanuel mentioned in [1], the backend is likely to stabilize
> > next year. One or two years of backports for that particular LTS version
> > doesn't feel too bad. And if it does become more involved we can maybe
> > also distribute the load and rely on maintainers of impacted platforms
> > without Rust to help out with the backporting.
> 
> I'm very much in favour of supporting gccrs when it's available, but I
> also want to say that it currently is targeting 1.49, which is much
> older than we want.  It's also not necessarily going to be fully usable
> or bug free in that amount of time.

I cannot really say much about this. Overall I think that the rapid
release cycles and rapid adoption by projects that one typically sees in
Rust are an indicator to me that the whole ecosystem is not yet stable.

If I had the choice, I'd much rather adopt an ancient version of Rust if
it means that more platforms can support it.

> I also want to point out that it's important that the maintainers of
> affected platforms build the tooling necessary for their platforms to be
> supported.  I'm not seeing ports of LLVM to those architectures or
> contributions to gccrs that would make those platforms easier to
> support.

The gccrs maintainers are actively working on that backend, and as far
as I understand the main difference between LLVM and gccrs is that the
latter doesn't have to be ported over to every single platform
individually.

> > Also, all of this feels like a significant shift. I'm strongly in favor
> > of adopting Rust in our codebase, but I think we should do so carefully.
> > So we might take it extra carefully and say that Rust will become a
> > mandatory dependency in Git 3.0, where the last release before Git 3.0
> > will become an LTS release.
> 
> I'd prefer we not wait that long.  I'm doing some work in building the
> new loose object mapping using Rust and it's much more efficient than
> writing it in C because we don't have to sort the data when we use a
> BTreeMap.  The code is much simpler, shorter, and easier to write.
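
A minimal sketch of the BTreeMap point quoted above, not the actual loose
object mapping code: the 4-byte ObjectId type and both function names are
assumptions made up for illustration. It only shows that an ordered map
keeps its keys sorted as they are inserted, so no separate sort pass is
needed before ordered iteration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an object ID; a real one would be a
/// 20- or 32-byte hash.
type ObjectId = [u8; 4];

/// Collects (from, to) pairs into a map. BTreeMap orders keys on
/// insertion, so the input may arrive in any order.
fn build_mapping(pairs: &[(ObjectId, ObjectId)]) -> BTreeMap<ObjectId, ObjectId> {
    pairs.iter().copied().collect()
}

/// Returns the keys in ascending order, exactly as the map stores them.
fn sorted_keys(map: &BTreeMap<ObjectId, ObjectId>) -> Vec<ObjectId> {
    map.keys().copied().collect()
}
```

In C the equivalent would typically be an array that is filled and then
sorted (or kept sorted by hand) before binary search can be used on it.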

I still think we need to be mindful around the community though. I
understand that we want to have Rust in the codebase, and as I said I'm
in favor of adopting it. But we also have a certain responsibility with
Git given that it's used by almost every single developer out there.

A compromise could be to ease into Rust: we adopt Rust, but before Git
3.0 it is entirely optional. So Git will continue to work alright even
if there is no Rust compiler available. On the one hand this plays
nicely with platforms that do not have Rust. On the other hand it also
allows us to slowly iterate on the build infra for Rust, because I'm
very sure that there's going to be issues there initially.

With that we can:

  - Build confidence in our Rust tooling.

  - Figure out things as we go.

  - Give distributions and other platforms enough time to prepare for
    Rust becoming mandatory.

I think adopting Rust as a mandatory dependency out of nowhere would not
be playing nice. It may require significant effort from distros to adapt
to the new reality, so we should give them time to do so.

Note that I'm not saying that we need to have both a C and Rust
implementation for everything written in Rust. I don't think that's
sustainable in any way. But any feature written in Rust should be a
_new_ feature that can be disabled and that users can live without for
the time being.

> Nobody else is currently working on the interoperability code and we
> expressed that we ideally wanted it for Git 3.0.  Being able to use Rust
> means I can write that code faster, with fewer errors (and hence less
> debugging time), and better tests.  Otherwise, I'm afraid that it will
> take longer and we might not have it fully upstream for Git 3.0.
> 
> We also have this series right now, which we'd have to abandon if we're
> not going to support Rust right away.  I'd like to retain Ezekiel as a
> contributor and incorporate Rust, and I think the best time to adopt
> Rust is now, not at Git 3.0.

It would be a shame, but right now it's a risky bet to build anything on
top of Rust given that we don't officially accept it in Git yet. We need
to first make the decision whether or not we want to have it right now,
and if so what that's supposed to look like.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-03  5:40                             ` Patrick Steinhardt
@ 2025-09-03 16:22                               ` Ramsay Jones
  2025-09-03 22:10                               ` Junio C Hamano
  2025-09-04  0:57                               ` brian m. carlson
  2 siblings, 0 replies; 198+ messages in thread
From: Ramsay Jones @ 2025-09-03 16:22 UTC (permalink / raw)
  To: Patrick Steinhardt, brian m. carlson, Junio C Hamano, Taylor Blau,
	rsbecker, 'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ezekiel Newren', 'Josh Steadmon',
	'Calvin Wan'



On 03/09/2025 06:40, Patrick Steinhardt wrote:
> On Tue, Sep 02, 2025 at 05:27:10PM +0000, brian m. carlson wrote:
>> On 2025-09-02 at 11:16:19, Patrick Steinhardt wrote:
[snip]

>> I'd prefer we not wait that long.  I'm doing some work in building the
>> new loose object mapping using Rust and it's much more efficient than
>> writing it in C because we don't have to sort the data when we use a
>> BTreeMap.  The code is much simpler, shorter, and easier to write.
> 
> I still think we need to be mindful around the community though. I
> understand that we want to have Rust in the codebase, and as I said I'm
> in favor of adopting it. But we also have a certain responsibility with
> Git given that it's used by almost every single developer out there.
> 
> A compromise could be to ease into Rust: we adopt Rust, but before Git
> 3.0 it is entirely optional. So Git will continue to work alright even
> if there is no Rust compiler available. On the one hand this plays
> nicely with platforms that do not have Rust. On the other hand it also
> allows us to slowly iterate on the build infra for Rust, because I'm
> very sure that there's going to be issues there initially.
> 
> With that we can:
> 
>   - Build confidence in our Rust tooling.
> 
>   - Figure out things as we go.
> 
>   - Give distributions and other platforms enough time to prepare for
>     Rust becoming mandatory.
> 
> I think adopting Rust as a mandatory dependency out of nowhere would not
> be playing nice. It may require significant effort from distros to adapt
> to the new reality, so we should give them time to do so.

I agree with everything you say above. Thank you for saying it. ;)

It is somewhat unfortunate that 'xdiff' was chosen as an initial
project for this, since a git without the ability to produce a diff
is, well, practically useless!

[I don't have any objection to making the rust code optional in this
case - it should be easily possible to have both C and rust xdiff code].

I have already stopped building git on cygwin, since rust is not
currently available on cygwin (and I'm not aware of any effort to
port it there - although LLVM v20+ was recently made available).

ATB,
Ramsay Jones




^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-02 18:47                             ` Sam James
@ 2025-09-03 18:22                               ` Collin Funk
  0 siblings, 0 replies; 198+ messages in thread
From: Collin Funk @ 2025-09-03 18:22 UTC (permalink / raw)
  To: Sam James
  Cc: brian m. carlson, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
	rsbecker, 'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Mike Hommey', 'Pierre-Emmanuel Patry',
	'D. Ben Knoble', 'Ramsay Jones',
	'Ezekiel Newren', 'Josh Steadmon',
	'Calvin Wan'

Sam James <sam@gentoo.org> writes:

> I still think adopting Rust is a compatibility break and a "breaking
> change". Again, keeping in mind that for adopting C99 features (!), the
> Git project used "test balloons" very very recently.
>
>> Nobody else is currently working on the interoperability code and we
>> expressed that we ideally wanted it for Git 3.0.  Being able to use Rust
>> means I can write that code faster, with fewer errors (and hence less
>> debugging time), and better tests.  Otherwise, I'm afraid that it will
>> take longer and we might not have it fully upstream for Git 3.0.
>>
>> We also have this series right now, which we'd have to abandon if we're
>> not going to support Rust right away.  I'd like to retain Ezekiel as a
>> contributor and incorporate Rust, and I think the best time to adopt
>> Rust is now, not at Git 3.0.
>
> I think there's going to be various issues that arise even on platforms
> that support Rust that would make it fitting for Git 3.0, at least for
> the first few releases that incorporate Rust. I'll note that the series
> isn't currently using Meson's Rust integration as QEMU is doing.

Just want to voice my agreement with Sam here.

It seems strange that we have a test balloon for compound literals,
something that GCC has supported since before 2001 [1]. But at the same
time require a platform to support Rust. If a platform has Rust support,
it certainly has a compiler supporting compound literals.

Collin

[1] https://github.com/gcc-mirror/gcc/commit/cedd825f0f18088f7235f02136021bd63a2e12df

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-03  5:40                             ` Patrick Steinhardt
  2025-09-03 16:22                               ` Ramsay Jones
@ 2025-09-03 22:10                               ` Junio C Hamano
  2025-09-03 22:48                                 ` Josh Steadmon
  2025-09-04 11:10                                 ` Patrick Steinhardt
  2025-09-04  0:57                               ` brian m. carlson
  2 siblings, 2 replies; 198+ messages in thread
From: Junio C Hamano @ 2025-09-03 22:10 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Patrick Steinhardt <ps@pks.im> writes:

> I still think we need to be mindful around the community though. I
> understand that we want to have Rust in the codebase, and as I said I'm
> in favor of adopting it. But we also have a certain responsibility with
> Git given that it's used by almost every single developer out there.
>
> A compromise could be to ease into Rust: we adopt Rust, but before Git
> 3.0 it is entirely optional. So Git will continue to work alright even
> if there is no Rust compiler available. On the one hand this plays
> nicely with platforms that do not have Rust. On the other hand it also
> allows us to slowly iterate on the build infra for Rust, because I'm
> very sure that there's going to be issues there initially.
>
> With that we can:
>
>   - Build confidence in our Rust tooling.
>
>   - Figure out things as we go.
>
>   - Give distributions and other platforms enough time to prepare for
>     Rust becoming mandatory.
>
> I think adopting Rust as a mandatory dependency out of nowhere would not
> be playing nice. It may require significant effort from distros to adapt
> to the new reality, so we should give them time to do so.

Not just for distros on exotic platforms, but also for users who
cannot afford to see regressions.

> Note that I'm not saying that we need to have both a C and Rust
> implementation for everything written in Rust. I don't think that's
> sustainable in any way. But any feature written in Rust should be a
> _new_ feature that can be disabled and that users can live without for
> the time being.

Yes, if we can find such a modular niche, it would be ideal.  But how
many areas are there where we can cleanly plug in an optional thing
without disrupting the existing codebase?  Offhand, all I can think of
are a new merge backend, a new rebase backend, a transport helper,
or perhaps a new diff algorithm?

> It would be a shame, but right now it's a risky bet to build anything on
> top of Rust given that we don't officially accept it in Git yet. We need
> to first make the decision whether or not we want to have it right now,
> and if so what that's supposed to look like.

One more thing that I noticed.  What are our plans for the two
directories in contrib/libgit-{sys,rs}/?  IIRC, the new stuff from
Ezekiel did not interact with them at all, but it did not remove
them either, so I am a bit lost.

Thanks.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-03 22:10                               ` Junio C Hamano
@ 2025-09-03 22:48                                 ` Josh Steadmon
  2025-09-04 11:10                                 ` Patrick Steinhardt
  1 sibling, 0 replies; 198+ messages in thread
From: Josh Steadmon @ 2025-09-03 22:48 UTC (permalink / raw)
  To: Junio C Hamano, brian m. carlson
  Cc: Patrick Steinhardt, Taylor Blau, rsbecker,
	'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Calvin Wan'

On 2025.09.03 15:10, Junio C Hamano wrote:
> One more thing that I noticed.  What are our plans for the two
> directories in contrib/libgit-{sys,rs}/?  IIRC, the new stuff from
> Ezekiel did not interact with them at all, but it did not remove
> them either, so I am a bit lost.
> 
> Thanks.

I haven't followed this series closely, but I wouldn't expect it to
interact with contrib/libgit-*, since those libraries are intended for
use by external projects.

That said, we at Google don't currently have plans on expanding
libgit-*, since JJ has been able to meet its needs by shelling out to
the Git CLI for cases where gitoxide is not sufficient. It's possible
(probable??) that we might return to libgit-rs in the future, but
nothing is on the radar right now.

When libgit-* was still under review, brian said[0] they were interested
in building on it, but I don't know if that is still accurate. brian,
any update on that?

[0] https://lore.kernel.org/git/Z47kr0_fYYdaMWyA@tapette.crustytoothpaste.net/

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-03  5:40                             ` Patrick Steinhardt
  2025-09-03 16:22                               ` Ramsay Jones
  2025-09-03 22:10                               ` Junio C Hamano
@ 2025-09-04  0:57                               ` brian m. carlson
  2025-09-04 11:39                                 ` Patrick Steinhardt
  2 siblings, 1 reply; 198+ messages in thread
From: brian m. carlson @ 2025-09-04  0:57 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Junio C Hamano, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

[-- Attachment #1: Type: text/plain, Size: 8481 bytes --]

On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:
> If I had the choice, I'd much rather adopt an ancient version of Rust if
> it means that more platforms can support it.

I think you may be assuming that gccrs targeting Rust 1.49 will
magically make it work on more platforms than upstream Rust will.
That's not the case.

gccrs targeting Rust 1.49 will use libstd (the standard library) and
libcore (the library for freestanding implementations) from Rust 1.49
and that means it will only support those platforms that Rust 1.49 did.
For instance, Rust added support for Apache NuttX relatively recently.
Even if it has stellar support in GCC, it won't work with that version
of gccrs because the underlying libraries don't support any of those
platforms.  The only thing you can target that you couldn't before are
systems that use neither libstd nor libcore—which essentially means the
Linux kernel.  It's like using a glibc from 2009 and expecting it to
work on RISC-V—it simply won't[0].

If you need support for new platforms, that requires a much _newer_
version of Rust.  Thus, to be able to use gccrs, porters need to use the
existing gcc codegen backend and get that code in immediately so that
when gccrs is out and supports Rust 1.91, the standard library will work
with those platforms.  The fastest way to get platforms supported is
to port LLVM and then add them to upstream Rust that way.

I know there has been much complaint about the six-week lifespan of Rust
releases.  I myself dislike that.  But the situation is that LTS
releases require extensive amounts of work and nobody has stepped up to
do that or pay for it to be done.  Without dedicated staffing, it's not
going to happen.  That also means that individual projects decide what
versions of Rust they do and don't want to support.

We're already supporting the version in Debian stable for a year after
the new release comes out, so we're already far behind what everyone
else is doing.  For comparison, Rust 1.48 is in Debian 11, so we'd be
supporting an effectively five-year-old compiler instead of a
three-year-old compiler.

Requiring Rust 1.49 instead of Rust 1.63 makes it harder to use tools
like bindgen and cbindgen, which automatically generate declarations in
one language for types and functions defined in the other.  That, in
turn, will hinder our
ability to effectively write code that crosses the boundary and
introduce hard-to-find bugs, since we'll have to do that work manually.
My experience is that these kinds of bugs tend to actually show up more
frequently on less common platforms, like big-endian systems, so we'll
be worsening the platform experience for those systems.

For context, when we ported a core service from C to Rust at work, we
used bindgen to generate C struct definitions, which made the process
much easier and avoided random crashes.  As a result, nobody noticed the
fact that we ported it incrementally over a couple of years.  If we
hadn't used bindgen, we probably would have had lots of random segfaults
due to failing to maintain compatibility between Rust and C definitions
of the same structures, which users would not have appreciated and would
not have helped our goal of making our software more reliable and easier
to maintain.
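
To make the struct-compatibility concern concrete, here is a minimal
hand-written sketch of the kind of layout check that generated bindings
can perform automatically. It is not bindgen output: the Record struct,
the C definition in its comment, and the helper names are all assumptions
invented for this illustration.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::{c_int, c_long};

/// Hypothetical mirror of a C definition such as
///     struct record { unsigned char tag; long value; int count; };
/// #[repr(C)] asks rustc to lay the fields out by the C rules; without
/// it the compiler is free to reorder them.
#[repr(C)]
pub struct Record {
    pub tag: u8,
    pub value: c_long,
    pub count: c_int,
}

/// Rounds `offset` up to the next multiple of `align`.
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Size a C compiler would be expected to compute: each field starts at
/// the next multiple of its alignment, and the total is padded to the
/// alignment of the whole struct.
fn expected_c_size() -> usize {
    let mut end = size_of::<u8>(); // tag sits at offset 0
    end = round_up(end, align_of::<c_long>()) + size_of::<c_long>();
    end = round_up(end, align_of::<c_int>()) + size_of::<c_int>();
    round_up(end, align_of::<Record>())
}

/// Byte offset of `value` inside Record, measured on a live instance.
fn value_offset() -> usize {
    let r = Record { tag: 0, value: 0, count: 0 };
    let base = &r as *const Record as usize;
    let field = std::ptr::addr_of!(r.value) as usize;
    field - base
}
```

If the Rust and C definitions drift apart, for example because a field is
added on one side only, checks like these fail at test time instead of
surfacing later as memory corruption.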

> The gccrs maintainers are actively working on that backend, and as far
> as I understand the main difference between LLVM and gccrs is that the
> latter doesn't have to be ported over to every single platform
> individually.

I don't think that's the case.  gccrs has to be compiled for every
platform just like LLVM does.  LLVM is actually easier to support
because it can cross-compile from any platform to any platform without
recompilation.  For instance, I can target riscv64gc-unknown-openbsd on
my Debian amd64 laptop assuming I can provide the necessary libraries
for OpenBSD when compiling, but GCC requires me to specifically compile
a compiler for that platform.

In any event, any portability changes will also likely need to go into
libstd and libcore, which is used identically with both compilers.

It is, however, the case that GCC supports more architectures (and
possibly more architecture/OS combinations) than LLVM.  For instance,
DEC Alpha and IA64 are only supported by GCC at the moment.

> I think adopting Rust as a mandatory dependency out of nowhere would not
> be playing nice. It may require significant effort from distros to adapt
> to the new reality, so we should give them time to do so.

We've actually had this discussion on the list several times where we've
proposed the inclusion of Rust.  This is not the first time it's come
up, or the second.  It was explicitly mentioned a year ago on the list
that we wanted to adopt Rust in the notes from the Contributor Summit.

There has been plenty of notice that this is coming down the line.  It's
not accurate to claim it's "out of nowhere" nor to claim that people
have not had plenty of time to port their systems.

Distros and porters should not be insensible to the increasing use of
Rust or the need for them to get their systems working.  For instance,
you cannot run a GNOME or MATE desktop environment without librsvg2,
which is written in Rust.  Python's cryptography package adopted Rust
over four years ago and there was the same gnashing of teeth[1], yet
little progress has been made by porters on the same affected
architectures since that time.  In that time, Debian has bootstrapped
and released an entire RISC-V port, complete with Rust.

I want to be clear I'm not opposed to supporting less common operating
systems or architectures.  For many years, my laptop was a PowerPC Mac,
and I've owned UltraSPARC, MIPS, and ARM hardware.  For personal code, I
try to test it in CI on at least Linux, macOS, FreeBSD, and NetBSD.  But
also, when a Debian package has not worked properly on PowerPC or
UltraSPARC, I've stepped up and fixed it.  My requests to other projects
when porting have been things like asking to write valid C or C++ (by
avoiding unaligned accesses or endianness assumptions, for instance),
not asking them to refrain from adding new languages or features.

It should be stated that there is a very easy way to get Rust working,
and that's to port LLVM to the platform in question.  IA-64 was removed
in 2009, but it might be possible to resurrect that out of tree if
there's interest and maybe even get it re-accepted upstream.  I'll point
out that AIX, Solaris, and QNX have done the necessary porting work to
get LLVM and Rust working over the past couple years, so it's not out of
the question for other platforms to do so as well.  And, for the
avoidance of doubt, I would be absolutely delighted if we were able to
support additional platforms with Rust as well.

Also, the approach of making it an optional component directly
contradicts the proposed policy I wrote up.  That's a recipe for
additional burdensome work maintaining two implementations, when we
actually want to make it easier for people to contribute functionality.
It also doesn't provide any of the memory safety benefits or address any
of the concerns from governments, security professionals, and other
parties about the real and substantial risks of continuing to develop in
C.

For example, there is zero chance I will implement any of the
SHA-1/SHA-256 compatibility code twice.  I'm already doing that in my
free time without any compensation at all and it's unreasonable to
expect me to do it twice or even to #ifdef out all the places it would
need to go.  I am happy to let someone else take responsibility for the
project instead, however, if they would like to do those things.

> It would be a shame, but right now it's a risky bet to build anything on
> top of Rust given that we don't officially accept it in Git yet. We need
> to first make the decision whether or not we want to have it right now,
> and if so what that's supposed to look like.

I think we had made the decision at the 2024 Contributor's Summit that
we wanted to adopt Rust in Git, so it was more of a matter of sending
the patches than actually making that decision.  As I recall, the
decision was unanimous.

[0] RISC-V was developed in 2010.
[1] https://www.reddit.com/r/rust/comments/lfysy9/pythons_cryptography_package_introduced_build/
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-03 22:10                               ` Junio C Hamano
  2025-09-03 22:48                                 ` Josh Steadmon
@ 2025-09-04 11:10                                 ` Patrick Steinhardt
  2025-09-04 15:45                                   ` Junio C Hamano
  1 sibling, 1 reply; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-04 11:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: brian m. carlson, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Wed, Sep 03, 2025 at 03:10:44PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > Note that I'm not saying that we need to have both a C and Rust
> > implementation for everything written in Rust. I don't think that's
> > sustainable in any way. But any feature written in Rust should be a
> > _new_ feature that can be disabled and that users can live without for
> > the time being.
> 
> Yes, if we can find such a modular niche, it would be ideal.  But how
> many areas are there where we can cleanly plug in an optional thing
> without disrupting the existing codebase?  Offhand, all I can think of
> are a new merge backend, a new rebase backend, a transport helper,
> or perhaps a new diff algorithm?

Not too many, I guess.

If we cannot find anything, an alternative could also be to take a very
simple subsystem that doesn't see a lot of changes and convert that to
Rust. We'd retain both implementations in that case, which I mentioned
is painful because we now have to keep both in sync. But if we say that
this is a test balloon only, and that we don't continue to convert other
code until Git 3.0, then that might be fine.

"varint.c" could be a good match. It's trivial, only 30 lines of code,
and completely standalone.
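
For a sense of scale, a variable-length integer codec of this kind could
look roughly as follows in Rust. This is a sketch under assumptions, not
the patch announced here: it is modeled on an offset-style encoding
(big-endian 7-bit groups, high bit as continuation flag, each continued
group biased by one), the function names and the Vec/Option-based
signatures are made up, and byte-for-byte agreement with varint.c has not
been verified.

```rust
/// Encodes `value` most-significant group first. Every byte except the
/// last has its high bit set to signal that more bytes follow.
pub fn encode_varint(mut value: u64) -> Vec<u8> {
    // Build the bytes least-significant group first, then reverse.
    let mut out = vec![(value & 0x7f) as u8];
    value >>= 7;
    while value != 0 {
        value -= 1; // offset bias: avoids redundant encodings
        out.push(0x80 | (value & 0x7f) as u8);
        value >>= 7;
    }
    out.reverse();
    out
}

/// Decodes one value from the front of `buf`. Returns the value and the
/// number of bytes consumed, or None on truncated input or overflow.
pub fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut pos = 0;
    let mut c = *buf.get(pos)?;
    pos += 1;
    let mut val = u64::from(c & 0x7f);
    while c & 0x80 != 0 {
        val = val.checked_add(1)?;
        if val >> (64 - 7) != 0 {
            return None; // shifting by 7 would lose high bits
        }
        c = *buf.get(pos)?;
        pos += 1;
        val = (val << 7) + u64::from(c & 0x7f);
    }
    Some((val, pos))
}
```

A C-callable version would additionally need extern "C" wrappers working
on raw pointers, which is where the interoperability testing mentioned
above would come in.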

We could still build new and optional functionality via Rust, but I
guess it also doesn't hurt to have a test balloon that is part of
libgit.a to test interoperability.

I'll send patches later today.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04  0:57                               ` brian m. carlson
@ 2025-09-04 11:39                                 ` Patrick Steinhardt
  2025-09-04 13:53                                   ` Sam James
                                                     ` (2 more replies)
  0 siblings, 3 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-04 11:39 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Thu, Sep 04, 2025 at 12:57:25AM +0000, brian m. carlson wrote:
> On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:
> > If I had the choice, I'd much rather adopt an ancient version of Rust if
> > it means that more platforms can support it.
> 
> I think you may be assuming that gccrs targeting Rust 1.49 will
> magically make it work on more platforms than upstream Rust will.
> That's not the case.

I don't have enough context to be able to tell. I'm mostly going by what
the gccrs maintainers themselves are saying. But if I'm misunderstanding
what gccrs will bring to the table I'm happy to be corrected.

[snip]
> > I think adopting Rust as a mandatory dependency out of nowhere would not
> > be playing nice. It may require significant effort from distros to adapt
> > to the new reality, so we should give them time to do so.
> 
> We've actually had this discussion on the list several times where we've
> proposed the inclusion of Rust.  This is not the first time it's come
> up, or the second.  It was explicitly mentioned a year ago on the list
> that we wanted to adopt Rust in the notes from the Contributor Summit.
> 
> There has been plenty of notice that this is coming down the line.  It's
> not accurate to claim it's "out of nowhere" nor to claim that people
> have not had plenty of time to port their systems.
> 
> Distros and porters should not be insensible to the increasing use of
> Rust or the need for them to get their systems working.  For instance,
> you cannot run a GNOME or MATE desktop environment without librsvg2,
> which is written in Rust.  Python's cryptography package adopted Rust
> over four years ago and there was the same gnashing of teeth[1], yet
> little progress has been made by porters on the same affected
> architectures since that time.  In that time, Debian has bootstrapped
> and released an entire RISC-V port, complete with Rust.

Discussions of theoretical nature are one thing though. The transition
that is actually happening is a different thing, and distributions will
need to prepare for this. We already had multiple distro maintainers
coming into these discussions saying that this will require a bunch of
work, which should be an indicator to us that we need to take it slow.
We should accommodate that.

[snip]
> It should be stated that there is a very easy way to get Rust working,
> and that's to port LLVM to the platform in question.  IA-64 was removed
> in 2009, but it might be possible to resurrect that out of tree if
> there's interest and maybe even get it re-accepted upstream.  I'll point
> out that AIX, Solaris, and QNX have done the necessary porting work to
> get LLVM and Rust working over the past couple years, so it's not out of
> the question for other platforms to do so as well.  And, for the
> avoidance of doubt, I would be absolutely delighted if we were able to
> support additional platforms with Rust as well.

I cannot really say how hard or easy it is to port LLVM to a different
platform. I'd be surprised though if that work really was that easy.

> Also, the approach of making it an optional component directly
> contradicts the proposed policy I wrote up.  That's a recipe for
> additional burdensome work maintaining two implementations, when we
> actually want to make it easier for people to contribute functionality.
> It also doesn't provide any of the memory safety benefits or address any
> of the concerns from governments, security professionals, and other
> parties about the real and substantial risks of continuing to develop in
> C.

The only reason why we want to have it as an optional component is to
make the transitioning period easier for downstream distributors. And
the intent is not to convert major components -- it should be trivial
components that we can use as test balloons, similar to how we did it
for all of our C99 test balloons.

We cannot just pull the rug out from under their feet without advance
notice that this is going to happen.

> For example, there is zero chance I will implement any of the
> SHA-1/SHA-256 compatibility code twice.  I'm already doing that in my
> free time without any compensation at all and it's unreasonable to
> expect me to do it twice or even to #ifdef out all the places it would
> need to go.  I am happy to let someone else take responsibility for the
> project instead, however, if they would like to do those things.

And that's totally fair. From my point of view, this compatibility code
is a _new_ feature that we are adding to Git. And as I mentioned, I
think it is reasonable to say that new features may be implemented in
Rust now already, as platforms that aren't yet ready wouldn't lose any
existing functionality.

> > It would be a shame, but right now it's a risky bet to build anything on
> > top of Rust given that we don't officially accept it in Git yet. We need
> > to first make the decision whether or not we want to have it right now,
> > and if so what that's supposed to look like.
> 
> I think we had made the decision at the 2024 Contributor's Summit that
> we wanted to adopt Rust in Git, so it was more of a matter of sending
> the patches than actually making that decision.  As I recall, the
> decision was unanimous.

I think most or even all of the contributors are on board. But we never
really talked about timelines, or how we want to introduce Rust, so
that's a discussion we need to have now.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 11:39                                 ` Patrick Steinhardt
@ 2025-09-04 13:53                                   ` Sam James
  2025-09-05  3:55                                     ` Elijah Newren
  2025-09-04 23:17                                   ` Ezekiel Newren
  2025-09-05  3:54                                   ` Elijah Newren
  2 siblings, 1 reply; 198+ messages in thread
From: Sam James @ 2025-09-04 13:53 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	'Elijah Newren', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Sep 04, 2025 at 12:57:25AM +0000, brian m. carlson wrote:
>> On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:
>> > If I had the choice, I'd much rather adopt an ancient version of Rust if
>> > it means that more platforms can support it.
>> 
>> I think you may be assuming that gccrs targeting Rust 1.49 will
>> magically make it work on more platforms than upstream Rust will.
>> That's not the case.
>
> I don't have enough context to be able to tell. I'm mostly going by what
> the gccrs maintainers themselves are saying. But if I'm misunderstanding
> what gccrs will bring to the table I'm happy to be corrected.
>

(I also think it's obvious that once gccrs can handle 1.49, we will have
to put effort into making things build with it. Not sure who wanted or
claimed magic. I just think relying on a single implementation isn't a
good idea.)

> [snip]
>> > I think adopting Rust as a mandatory dependency out of nowhere would not
>> > be playing nice. It may require significant effort from distros to adapt
>> > to the new reality, so we should give them time to do so.
>> 
>> We've actually had this discussion on the list several times where we've
>> proposed the inclusion of Rust.  This is not the first time it's come
>> up, or the second.  It was explicitly mentioned a year ago on the list
>> that we wanted to adopt Rust in the notes from the Contributor Summit.
>> 
>> There has been plenty of notice that this is coming down the line.  It's
>> not accurate to claim it's "out of nowhere" nor to claim that people
>> have not had plenty of time to port their systems.
>> 
>> Distros and porters should not be insensible to the increasing use of
>> Rust or the need for them to get their systems working.  For instance,
>> you cannot run a GNOME or MATE desktop environment without librsvg2,
>> which is written in Rust.  Python's cryptography package adopted Rust
>> over four years ago and there was the same gnashing of teeth[1], yet
>> little progress has been made by porters on the same affected
>> architectures since that time.  In that time, Debian has bootstrapped
>> and released an entire RISC-V port, complete with Rust.
>
> Discussions of theoretical nature are one thing though. The transition
> that is actually happening is a different thing, and distributions will
> need to prepare for this. We already had multiple distro maintainers
> coming into these discussions saying that this will require a bunch of
> work, which should be an indicator to us that we need to take it slow.
> We should accommodate for that.

I imagine most distributions have absolutely zero awareness of this
thread or plans for git. See below.

>
> [snip]
>> It should be stated that there is a very easy way to get Rust working,
>> and that's to port LLVM to the platform in question.  IA-64 was removed
>> in 2009, but it might be possible to resurrect that out of tree if
>> there's interest and maybe even get it re-accepted upstream.  I'll point
>> out that AIX, Solaris, and QNX have done the necessary porting work to
>> get LLVM and Rust working over the past couple years, so it's not out of
>> the question for other platforms to do so as well.  And, for the
>> avoidance of doubt, I would be absolutely delighted if we were able to
>> support additional platforms with Rust as well.
>
> I cannot really say how hard or easy it is to port LLVM to a different
> platform. I'd be surprised though if that work really was that easy.

I think it's an interesting characterisation indeed.

>
>> Also, the approach of making it an optional component directly
>> contradicts the proposed policy I wrote up.  That's a recipe for
>> additional burdensome work maintaining two implementations, when we
>> actually want to make it easier for people to contribute functionality.
>> It also doesn't provide any of the memory safety benefits or address any
>> of the concerns from governments, security professionals, and other
>> parties about the real and substantial risks of continuing to develop in
>> C.
>
> The only reason why we want to have it as an optional component is to
> make the transitioning period easier for downstream distributors. And
> the intent is not to convert major components -- it should be trivial
> components that we can use as test balloons, similar to how we did it
> for all of our C99 test balloons.

Yes, even if it were just for one release, having it optional for
something would mean we can adjust packaging without some huge pressure
where git had 0 Rust in one release and then mandatory Rust in another.

(I would of course prefer far more than one release, but I've tried
throughout this thread to give options even if the one I'd prefer isn't
pursued, not "teeth gnash").

sam

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 11:10                                 ` Patrick Steinhardt
@ 2025-09-04 15:45                                   ` Junio C Hamano
  2025-09-05  8:23                                     ` Patrick Steinhardt
  0 siblings, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-09-04 15:45 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Patrick Steinhardt <ps@pks.im> writes:

> If we cannot find anything, an alternative could also be to take a very
> simple subsystem that doesn't see a lot of changes and convert that to
> Rust. We'd retain both implementations in that case, which I mentioned
> is painful because we now have to keep both in sync. But if we say that
> this is a testballoon, only, and that we don't continue to convert other
> code until Git 3.0, then that might be fine.
>
> "varint.c" could be a good match. It's trivial, only 30 lines of code,
> and completely standalone.

I am afraid that it is a bit too trivial.  I didn't mention this
possibility of maintaining parallel implementations, but the
quiescent area I had in mind was patch-delta.c (no, I am not that
ambitious to suggest diff-delta.c as the first example).

> We could still build new and optional functionality via Rust, but I
> guess it also doesn't hurt to have a test balloon that is part of
> libgit.a to test interoperability.

OK.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 11:39                                 ` Patrick Steinhardt
  2025-09-04 13:53                                   ` Sam James
@ 2025-09-04 23:17                                   ` Ezekiel Newren
  2025-09-05  3:54                                   ` Elijah Newren
  2 siblings, 0 replies; 198+ messages in thread
From: Ezekiel Newren @ 2025-09-04 23:17 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Elijah Newren, Kristoffer Haugsbakk, Josh Soref, git,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones, Josh Steadmon,
	Calvin Wan

On Thu, Sep 4, 2025 at 5:40 AM Patrick Steinhardt <ps@pks.im> wrote:
> The only reason why we want to have it as an optional component is to
> make the transitioning period easier for downstream distributors. And
> the intent is not to convert major components -- it should be trivial
> components that we can use as test balloons, similar to how we did it
> for all of our C99 test balloons.
>
> We cannot just pull the rug away under their feet without advance notice
> that this is going to happen.

I think making Rust optional for at least 1 version is a viable path.
I'm not opposed to that idea; it was just easier to develop and talk
about Rust as a hard dependency. I needed to know if making Rust
optional was in demand before spending any significant time on that.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 11:39                                 ` Patrick Steinhardt
  2025-09-04 13:53                                   ` Sam James
  2025-09-04 23:17                                   ` Ezekiel Newren
@ 2025-09-05  3:54                                   ` Elijah Newren
  2025-09-05  6:50                                     ` Patrick Steinhardt
  2025-09-05 10:31                                     ` Phillip Wood
  2 siblings, 2 replies; 198+ messages in thread
From: Elijah Newren @ 2025-09-05  3:54 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Thu, Sep 4, 2025 at 4:40 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Thu, Sep 04, 2025 at 12:57:25AM +0000, brian m. carlson wrote:
> > On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:

[snip]

> > Also, the approach of making it an optional component directly
> > contradicts the proposed policy I wrote up.  That's a recipe for
> > additional burdensome work maintaining two implementations, when we
> > actually want to make it easier for people to contribute functionality.
> > It also doesn't provide any of the memory safety benefits or address any
> > of the concerns from governments, security professionals, and other
> > parties about the real and substantial risks of continuing to develop in
> > C.
>
> The only reason why we want to have it as an optional component is to
> make the transitioning period easier for downstream distributors. And
> the intent is not to convert major components -- it should be trivial
> components that we can use as test balloons, similar to how we did it
> for all of our C99 test balloons.
>
> We cannot just pull the rug away under their feet without advance notice
> that this is going to happen.

I find this statement a bit problematic for four reasons:

(1) "without advance notice" was already pointed out to be inaccurate
in this thread, including in the exact email you are responding to;
you could argue that there hasn't been _sufficient_ advance notice,
but then there should be more details about what is and isn't
sufficient.  Merely repeating this claim which brian just barely
pointed out to you as false almost feels dishonest.

(2) "pull the rug away" seems hyperbolic.  I would have liked some
explanation as to how a transition period is expected to help, and how
the existing transition period has been insufficient.  You do hint a
little at the former, which I'll discuss more in point 4, but you
neglect the latter to the point of pretending it didn't exist.   In
short, why is a further transition period needed, and how will it
differ from the existing one we've already had?  It's not clear to me
why distributors must immediately update to the latest git version.
Taylor discussed this aspect in detail in this thread; you even
responded briefly (and tangentially?), but still as far as I can tell
presume the latest and greatest is mandatory for them to adopt without
stating why.  Maybe they do need to adopt the latest and greatest, but
I haven't seen folks state why that's the case.  Did I miss it?

It also feels like Rust support is being lumped in with "breaking
changes", which to me feels misleading.  Historically, we have talked
about breaking changes and deprecation periods and such so that users
could adjust scripts or their command lines such that they would work
across multiple versions of Git.  The Rust case is somewhat different
in that we're not discussing behavioral changes of git, merely
implementation differences.  If someone has both a C-only version of
git and a newer version of git that was built with both Rust and C,
any commands they run should behave the same as far as the C-vs-Rust
goes (unless we have our normal discussions about specific behavior
and any deprecations we want to do related to it, of course).

I do agree that reduced platform support is a negative change (though
Rust brings other advantages that may offset this downside depending
on your viewpoint), but I don't see why it's a breaking change and
especially not a "pull the rug away under their feet" change.

(3) the use of "cannot" presupposes the policy stance which we are
having a discussion about, which, whether intended or not, feels like
an unfair way to attempt to shut down the conversation.

(4) you suggest that adding Rust as an optional component should avoid
the problem, yet we've already had Rust as an optional component for
the last three releases, going back to 2.49.0.  (libgit-rs and
libgit-sys).  In this case, you helpfully provided some details
distinguishing the type of optional component you want -- the
reference to a test balloon suggests you want an optional component
that is turned on by default (but which users can easily turn off).
Am I correct that this is your intention?  If that's the case, then
that's a useful distinction, but I think that distinction needs to be
made a bit more clearly (and as a side effect, acknowledge that Rust
has already been optionally shipped in some form, and was even
specifically highlighted by GitHub's and GitLab's blog posts about the
v2.49.0 release, among other places)

> > For example, there is zero chance I will implement any of the
> > SHA-1/SHA-256 compatibility code twice.  I'm already doing that in my
> > free time without any compensation at all and it's unreasonable to
> > expect me to do it twice or even to #ifdef out all the places it would
> > need to go.  I am happy to let someone else take responsibility for the
> > project instead, however, if they would like to do those things.
>
> And that's totally fair. From my point of view, this compatibility code
> is a _new_ feature that we are adding to Git. And as I mentioned, I
> think it is reasonable to say that new features may be implemented in
> Rust now already, as platforms that aren't yet ready wouldn't lose any
> existing functionality.

Am I correct to understand that you're suggesting a policy where brian
cannot modify any existing code to be written in Rust, and can only
add new Rust code?  Perhaps the SHA-1/SHA-256 compatibility code is
just new code, or can be done with minimal changes to existing C code
while adding new code.  If so, maybe this is a workable solution for
him.

But if it can't be done with minimal changes to existing C code and
this policy would impair brian's ability to deliver the compatibility
code, then I think this policy would be unworkable.  I really don't
want to hamstring brian's ability to implement the compatibility code.
It has sat dormant for years with no one else stepping up to the
plate, it's a really important project, and brian has time and energy
now.  I don't want any chicken-and-egg problems introduced for him
with the 3.0 release.  Even though I've been working with Ezekiel on
xdiff, and I'm obviously a bit biased in that area, I find the
sha1-sha256 compatibility work to be more critical and something we
should do everything possible to facilitate.

> I think most or even all of the contributors are on board. But we never
> really talked about timelines, or how we want to introduce Rust, so
> that's a discussion we need to have now.

Agreed.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 13:53                                   ` Sam James
@ 2025-09-05  3:55                                     ` Elijah Newren
  0 siblings, 0 replies; 198+ messages in thread
From: Elijah Newren @ 2025-09-05  3:55 UTC (permalink / raw)
  To: Sam James
  Cc: Patrick Steinhardt, brian m. carlson, Junio C Hamano, Taylor Blau,
	rsbecker, Kristoffer Haugsbakk, Josh Soref, git,
	Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

On Thu, Sep 4, 2025 at 6:53 AM Sam James <sam@gentoo.org> wrote:
>
> Patrick Steinhardt <ps@pks.im> writes:

[...]

> >> Also, the approach of making it an optional component directly
> >> contradicts the proposed policy I wrote up.  That's a recipe for
> >> additional burdensome work maintaining two implementations, when we
> >> actually want to make it easier for people to contribute functionality.
> >> It also doesn't provide any of the memory safety benefits or address any
> >> of the concerns from governments, security professionals, and other
> >> parties about the real and substantial risks of continuing to develop in
> >> C.
> >
> > The only reason why we want to have it as an optional component is to
> > make the transitioning period easier for downstream distributors. And
> > the intent is not to convert major components -- it should be trivial
> > components that we can use as test balloons, similar to how we did it
> > for all of our C99 test balloons.
>
> Yes, even if it were just for one release, having it optional for
> something would mean we can adjust packaging without some huge pressure
> where git had 0 Rust in one release and then mandatory Rust in another.
>
> (I would of course prefer far more than one release, but I've tried
> throughout this thread to give options even if the one I'd prefer isn't
> pursued, not "teeth gnash").

Rust has been an optional component of git for the last three releases
already, going back to v2.49.0.  See the v2.49.0 release notes, or
e.g. https://github.blog/open-source/git/highlights-from-git-2-49/
[*].

[*] A quote: "This release marks a major milestone in the Git project
with the first pieces of Rust code being checked in..."

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05  3:54                                   ` Elijah Newren
@ 2025-09-05  6:50                                     ` Patrick Steinhardt
  2025-09-07  4:10                                       ` Elijah Newren
  2025-09-05 10:31                                     ` Phillip Wood
  1 sibling, 1 reply; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-05  6:50 UTC (permalink / raw)
  To: Elijah Newren
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Thu, Sep 04, 2025 at 08:54:19PM -0700, Elijah Newren wrote:
> On Thu, Sep 4, 2025 at 4:40 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Thu, Sep 04, 2025 at 12:57:25AM +0000, brian m. carlson wrote:
> > > On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:
> > > Also, the approach of making it an optional component directly
> > > contradicts the proposed policy I wrote up.  That's a recipe for
> > > additional burdensome work maintaining two implementations, when we
> > > actually want to make it easier for people to contribute functionality.
> > > It also doesn't provide any of the memory safety benefits or address any
> > > of the concerns from governments, security professionals, and other
> > > parties about the real and substantial risks of continuing to develop in
> > > C.
> >
> > The only reason why we want to have it as an optional component is to
> > make the transitioning period easier for downstream distributors. And
> > the intent is not to convert major components -- it should be trivial
> > components that we can use as test balloons, similar to how we did it
> > for all of our C99 test balloons.
> >
> > We cannot just pull the rug away under their feet without advance notice
> > that this is going to happen.
> 
> I find this statement a bit problematic for four reasons:
> 
> (1) "without advance notice" was already pointed out to be inaccurate
> in this thread, including in the exact email you are responding to;
> you could argue that there hasn't been _sufficient_ advance notice,
> but then there should be more details about what is and isn't
> sufficient.  Merely repeating this claim which brian just barely
> pointed out to you as false almost feels dishonest.

I think there is a difference between communication that happens on the
mailing list/contributors summit and communication that is intended for
the broader ecosystem:

  - The former is basically us developers discussing potential futures
    and reviewing patches. It would be _nice_ if distro maintainers of
    Git were to read these, but given the large volume of traffic in
    general I think it unlikely that the majority of maintainers are keeping
    up with that traffic.

  - The latter is in the form of e.g. our release notes as well as our
    BreakingChanges document. These _are_ intended to be reviewed by
    maintainers, and the blame is on them if they don't do so.

We have never communicated either via release notes or via any kind of
committed document that Rust is going to become mandatory. There have
been lots of large threads discussing it, true. But navigating these
threads and estimating consensus isn't easy even for us developers, so
it's going to be even harder for outsiders to the community.

> (2) "pull the rug away" seems hyperbolic.  I would have liked some
> explanation as to how a transition period is expected to help, and how
> the existing transition period has been insufficient.  You do hint a
> little at the former, which I'll discuss more in point 4, but you
> neglect the latter to the point of pretending it didn't exist.   In
> short, why is a further transition period needed, and how will it
> differ from the existing one we've already had?  It's not clear to me
> why distributors must immediately update to the latest git version.
> Taylor discussed this aspect in detail in this thread; you even
> responded briefly (and tangentially?), but still as far as I can tell
> presume the latest and greatest is mandatory for them to adopt without
> stating why.  Maybe they do need to adopt the latest and greatest, but
> I haven't seen folks state why that's the case.  Did I miss it?

The problem here is that we don't have a story to tell yet. I agree that
not everyone always needs the latest and greatest, which is also why I
mentioned that I think it's fine for _new_ features to be developed in
Rust right away.

But the story is altogether different for bug and security fixes.

  - We of course backport security fixes, but would that also be the
    case if we had ported the subsystem to Rust already and now had to
    implement the security fix twice?

  - What happens if only the old C version has a security bug? Do we
    still fix it?

  - Likewise, what happens with important bug fixes? We tend to backport
    those that are easy-ish to backport, but if people are potentially
    stuck with an older Git version for years it will become harder for
    us to do so.

I think without us having a proper answer to these questions we _are_
pulling the rug away. Distros may be stuck with an old version of Git
for a significant time, and from my point of view we have to do a couple
of compromises there.

> It also feels like Rust support is being lumped in with "breaking
> changes", which to me feels misleading.  Historically, we have talked
> about breaking changes and deprecation periods and such so that users
> could adjust scripts or their command lines such that they would work
> across multiple versions of Git.  The Rust case is somewhat different
> in that we're not discussing behavioral changes of git, merely
> implementation differences.  If someone has both a C-only version of
> git and a newer version of git that was built with both Rust and C,
> any commands they run should behave the same as far as the C-vs-Rust
> goes (unless we have our normal discussions about specific behavior
> and any deprecations we want to do related to it, of course).
> 
> I do agree that reduced platform support is a negative change (though
> Rust brings other advantages that may offset this downside depending
> on your viewpoint), but I don't see why it's a breaking change and
> especially not a "pull the rug away under their feet" change.

I honestly don't quite understand this perspective. How isn't it
breaking that you cannot use that Git version at all anymore?

> (3) the use of "cannot" presupposes the policy stance which we are
> having a discussion about, which, whether intended or not, feels like
> an unfair way to attempt to shut down the conversation.

Sorry, that's not my intent.

> (4) you suggest that adding Rust as an optional component should avoid
> the problem, yet we've already had Rust as an optional component for
> the last three releases, going back to 2.49.0.  (libgit-rs and
> libgit-sys).

I don't really think that either libgit-rs or libgit-sys help in any
way. These are part of "contrib/", not built by default, and neither are
they consumed by anyone out there. So there is no reason for anyone to
build that library to the best of my knowledge.

> In this case, you helpfully provided some details distinguishing the
> type of optional component you want -- the reference to a test balloon
> suggests you want an optional component that is turned on by default
> (but which users can easily turn off). Am I correct that this is your
> intention?  If that's the case, then that's a useful distinction, but
> I think that distinction needs to be made a bit more clearly (and as a
> side effect, acknowledge that Rust has already been optionally shipped
> in some form, and was even specifically highlighted by GitHub's and
> GitLab's blog posts about the v2.49.0 release, among other places)

Yes. I think we need to have a test balloon that allows us to iterate on
the build infrastructure and allows distributors to test with them. I
think that test balloon needs to be integrated into core Git so that it
is part of the normal build process, because otherwise it wouldn't have
any exposure at all and thus not serve its purpose.

> > > For example, there is zero chance I will implement any of the
> > > SHA-1/SHA-256 compatibility code twice.  I'm already doing that in my
> > > free time without any compensation at all and it's unreasonable to
> > > expect me to do it twice or even to #ifdef out all the places it would
> > > need to go.  I am happy to let someone else take responsibility for the
> > > project instead, however, if they would like to do those things.
> >
> > And that's totally fair. From my point of view, this compatibility code
> > is a _new_ feature that we are adding to Git. And as I mentioned, I
> > think it is reasonable to say that new features may be implemented in
> > Rust now already, as platforms that aren't yet ready wouldn't lose any
> > existing functionality.
> 
> Am I correct to understand that you're suggesting a policy where brian
> cannot modify any existing code to be written in Rust, and can only
> add new Rust code?  Perhaps the SHA-1/SHA-256 compatibility code is
> just new code, or can be done with minimal changes to existing C code
> while adding new code.  If so, maybe this is a workable solution for
> him.

Yeah, that's my hope, as well. There are probably nuances to this though,
and we'll have to figure it out once the series hits the mailing list.
So...

> But if it can't be done with minimal changes to existing C code and
> this policy would impair brian's ability to deliver the compatibility
> code, then I think this policy would be unworkable.  I really don't
> want to hamstring brian's ability to implement the compatibility code.
> It has sat dormant for years with no one else stepping up to the
> plate, it's a really important project, and brian has time and energy
> now.  I don't want any chicken-and-egg problems introduced for him
> with the 3.0 release.  Even though I've been working with Ezekiel on
> xdiff, and I'm obviously a bit biased in that area, I find the
> sha1-sha256 compatibility work to be more critical and something we
> should do everything possible to facilitate.

... I guess we'll have to see what this looks like in the end. If the
series rewrites a bunch of subsystems in Rust I think we should figure
out whether we can do without that. Or, in the worst case, whether it is
feasible to conditionally compile some of the code with either C or
Rust, even though nobody likes that.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-04 15:45                                   ` Junio C Hamano
@ 2025-09-05  8:23                                     ` Patrick Steinhardt
  0 siblings, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-05  8:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: brian m. carlson, Taylor Blau, rsbecker, 'Elijah Newren',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On Thu, Sep 04, 2025 at 08:45:56AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > If we cannot find anything, an alternative could also be to take a very
> > simple subsystem that doesn't see a lot of changes and convert that to
> > Rust. We'd retain both implementations in that case, which I mentioned
> > is painful because we now have to keep both in sync. But if we say that
> > this is a testballoon, only, and that we don't continue to convert other
> > code until Git 3.0, then that might be fine.
> >
> > "varint.c" could be a good match. It's trivial, only 30 lines of code,
> > and completely standalone.
> 
> I am afraid that it is a bit too trivial.  I didn't mention this
> possibility of maintaining parallel implementations, but the
> quiescent area I had in mind was patch-delta.c (no, I am not that
> ambitious to suggest diff-delta.c as the first example).

Oh, it certainly is very trivial. I think for the initial infra this is
a good trait though, as it means that we don't yet have to care about
interop between different parts of Git and can rather focus on the
bigger topic, which is the process for how to introduce Rust in the
first place.

But I agree, once we have the initial Rust infra landed we should then
also gain familiarity with more involved subsystems that _do_ require us
to hook into other subsystems so that we are forced to extend our build
systems as needed.

Patrick

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05  3:54                                   ` Elijah Newren
  2025-09-05  6:50                                     ` Patrick Steinhardt
@ 2025-09-05 10:31                                     ` Phillip Wood
  2025-09-05 11:32                                       ` Sam James
  2025-09-05 13:14                                       ` Phillip Wood
  1 sibling, 2 replies; 198+ messages in thread
From: Phillip Wood @ 2025-09-05 10:31 UTC (permalink / raw)
  To: Elijah Newren, Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

Hi Elijah

On 05/09/2025 04:54, Elijah Newren wrote:
> 
> (1) "without advance notice" was already pointed out to be inaccurate
> in this thread, including in the exact email you are responding to;
> you could argue that there hasn't been _sufficient_ advance notice,
> but then there should be more details about what is and isn't
> sufficient.  Merely repeating this claim which brian just barely
> pointed out to you as false almost feels dishonest.

I think there is a difference of understanding of what constitutes
"advance notice". While it is true that there have been discussions on
the list for a couple of years where people were clearly enthusiastic
about adopting rust, those discussions have always petered out after
concerns about portability were raised without us actually adopting
rust. In those discussions there has been no clear conclusion about
whether rust would be mandatory or optional. I think from the point of
view of an outsider who was following the mailing list it has not been
clear exactly where the rust discussion was going. For someone not
following the mailing list but just reading the release notes there has
been no indication that we're thinking of making rust mandatory for
building git as opposed to offering rust bindings for our C code.

> (2) "pull the rug away" seems hyperbolic.  I would have liked some
> explanation as to how a transition period is expected to help, and how
> the existing transition period has been insufficient.

I'm very unclear what "the existing transition period" has been.

> [...]
> (4) you suggest that adding Rust as an optional component should avoid
> the problem, yet we've already had Rust as an optional component for
> the last three releases, going back to 2.49.0.  (libgit-rs and
> libgit-sys).

Right but from the point of view of someone trying to build git on a 
platform without rust support there is a world of difference between 
having some optional bindings for rust external projects to use, and 
making rust mandatory to build git.

I would like us to adopt rust but I am concerned about the implications 
for platforms without rust and think we should give some notice in the 
form of a clear announcement in the release notes once we have a concrete 
plan. That plan should include a decision on what commitment we can 
realistically offer with regard to security updates for platforms 
without a rust compiler so maintainers on those platforms have a clear 
idea of how long they will be supported.

Thanks

Phillip



* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05 10:31                                     ` Phillip Wood
@ 2025-09-05 11:32                                       ` Sam James
  2025-09-05 13:14                                       ` Phillip Wood
  1 sibling, 0 replies; 198+ messages in thread
From: Sam James @ 2025-09-05 11:32 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Elijah Newren, Patrick Steinhardt, phillip.wood, brian m. carlson,
	Junio C Hamano, Taylor Blau, rsbecker, Kristoffer Haugsbakk,
	Josh Soref, git, Christian Brabandt, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Elijah
>
> On 05/09/2025 04:54, Elijah Newren wrote:
>> (1) "without advance notice" was already pointed out to be
>> inaccurate
>> in this thread, including in the exact email you are responding to;
>> you could argue that there hasn't been _sufficient_ advance notice,
>> but then there should be more details about what is and isn't
>> sufficient.  Merely repeating this claim which brian just barely
>> pointed out to you as false almost feels dishonest.
>
> I think there is a difference of understanding of what constitutes
> "advance notice". While it is true that there have been discussions
> on the list for a couple of years where people were clearly
> enthusiastic about adopting rust, those discussions have always petered
> out after concerns about portability were raised without us actually
> adopting rust. In those discussions there has been no clear conclusion
> about whether rust would be mandatory or optional. I think from the
> point of view of an outsider who was following the mailing list it has
> not been clear exactly where the rust discussion was going. For
> someone not following the mailing list but just reading the release
> notes there has been no indication that we're thinking of making rust
> mandatory for building git as opposed to offering rust bindings for
> our C code.
>
>> (2) "pull the rug away" seems hyperbolic.  I would have liked some
>> explanation as to how a transition period is expected to help, and how
>> the existing transition period has been insufficient.
>
> I'm very unclear what "the existing transition period" has been
>
>> [...]
>> (4) you suggest that adding Rust as an optional component should avoid
>> the problem, yet we've already had Rust as an optional component for
>> the last three releases, going back to 2.49.0.  (libgit-rs and
>> libgit-sys).
>
> Right but from the point of view of someone trying to build git on a
> platform without rust support there is a world of difference between
> having some optional bindings for rust external projects to use, and
> making rust mandatory to build git.

Entirely agreed with the whole email.

I'll make some further observations wrt bindings:

1) Distributions often don't enable bindings for languages unless a user
requests them, or at the very least they're considered low priority (and
bindings existing for a language in a project do *not* imply the project
is going to be rewritten in that language);

2) There would be no value in distributions building Rust bindings
because Rust doesn't really support "system-wide" libraries. It doesn't
make sense as far as I can tell to install the bindings right now. I
don't even know where Rust bindings should be installed to, I've never
seen a package want them installed before.

3) It's not integrated with the Meson build system we're using so I
wouldn't have paid any attention to it, at least unless a user requested
it;

4) It's in contrib/.

>
> I would like us to adopt rust but I am concerned about the
> implications for platforms without rust and think we should give some
> notice in the form of a clear announcement in the release notes once we
> have a concrete plan. That plan should include a decision on what
> commitment we can realistically offer with regard to security updates
> for platforms without a rust compiler so maintainers on those
> platforms have a clear idea of how long they will be supported.
>
> Thanks
>
> Phillip


sam


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05 10:31                                     ` Phillip Wood
  2025-09-05 11:32                                       ` Sam James
@ 2025-09-05 13:14                                       ` Phillip Wood
  2025-09-05 13:23                                         ` Patrick Steinhardt
  2025-09-05 15:37                                         ` Junio C Hamano
  1 sibling, 2 replies; 198+ messages in thread
From: Phillip Wood @ 2025-09-05 13:14 UTC (permalink / raw)
  To: Elijah Newren, Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Eli Schwartz, Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones,
	Ezekiel Newren, Josh Steadmon, Calvin Wan

On 05/09/2025 11:31, Phillip Wood wrote:
> 
> I would like us to adopt rust but I am concerned about the implications 
> for platforms without rust and think we should give some notice in the 
> form of a clear announcement in the release notes once we have a concrete 
> plan. That plan should include a decision on what commitment we can 
> realistically offer with regard to security updates for platforms 
> without a rust compiler so maintainers on those platforms have a clear 
> idea of how long they will be supported.

Here's what such an announcement might look like

     This release introduces an optional dependency on rust that is
     enabled by default. Platforms without a rust compiler can continue
     to build git by passing NO_RUST=1. In six months time we plan to
     make rust mandatory for building git. From that point git 2.x.y (the
     last version that can be built without rust) will continue to
     receive security updates for three years.

To me the important elements are:

1) There is a short period where rust is optional. This allows
    (i) Distributors on platforms without a rust compiler time to notify
        their users that in the future they will only be able to offer
        security updates.
   (ii) Distributors on platforms with a rust compiler time to adjust
        their build procedures to include rust.
  (iii) The git project time to gain experience of using rust and writing
        the necessary bindings while building with it is optional.

2) Rust is enabled by default so platforms without a rust compiler are
    made aware of the problem but have an easy way to continue to build
    git while rust is optional.

3) There is a period of a small number of years where we continue to
    provide security updates for a version of git that can be built
    without rust. This is intended to allow a realistic time for
    distributors on platforms without a rust compiler to port one or make
    other arrangements for providing future security updates without
    placing an undue burden on the project to provide security updates
    for niche platforms indefinitely.

Thanks

Phillip


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05 13:14                                       ` Phillip Wood
@ 2025-09-05 13:23                                         ` Patrick Steinhardt
  2025-09-05 15:37                                         ` Junio C Hamano
  1 sibling, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-05 13:23 UTC (permalink / raw)
  To: phillip.wood
  Cc: Elijah Newren, brian m. carlson, Junio C Hamano, Taylor Blau,
	rsbecker, Kristoffer Haugsbakk, Josh Soref, git,
	Christian Brabandt, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Fri, Sep 05, 2025 at 02:14:43PM +0100, Phillip Wood wrote:
> On 05/09/2025 11:31, Phillip Wood wrote:
> > 
> > I would like us to adopt rust but I am concerned about the implications
> > for platforms without rust and think we should give some notice in the
> > form of a clear announcement in the release notes once we have a concrete
> > plan. That plan should include a decision on what commitment we can
> > realistically offer with regard to security updates for platforms
> > without a rust compiler so maintainers on those platforms have a clear
> > idea of how long they will be supported.
> 
> Here's what such an announcement might look like
> 
>     This release introduces an optional dependency on rust that is
>     enabled by default. Platforms without a rust compiler can continue
>     to build git by passing NO_RUST=1. In six months time we plan to
>     make rust mandatory for building git. From that point git 2.x.y (the
>     last version that can be built without rust) will continue to
>     receive security updates for three years.
> 
> To me the important elements are:
> 
> 1) There is a short period where rust is optional. This allows
>    (i) Distributors on platforms without a rust compiler time to notify
>        their users that in the future they will only be able to offer
>        security updates.
>   (ii) Distributors on platforms with a rust compiler time to adjust
>        their build procedures to include rust.
>  (iii) The git project time to gain experience of using rust and writing
>        the necessary bindings while building with it is optional.
> 
> 2) Rust is enabled by default so platforms without a rust compiler are
>    made aware of the problem but have an easy way to continue to build
>    git while rust is optional.
> 
> 3) There is a period of a small number of years where we continue to
>    provide security updates for a version of git that can be built
>    without rust. This is intended to  allow a realistic time for
>    distributors on platforms without a rust compiler to port one or make
>    other arrangements for providing future security updates without
>    placing an undue burden on the project to provide security updates
>    for niche platforms indefinitely.

Something like this is part of the BreakingChanges document I'm
proposing in [1]. I think we should also highlight this upcoming change
in the next release notes, with a pointer to that document.

Patrick

[1]: <20250904-b4-pks-rust-breaking-change-v1-0-3af1d25e0be9@pks.im>


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05 13:14                                       ` Phillip Wood
  2025-09-05 13:23                                         ` Patrick Steinhardt
@ 2025-09-05 15:37                                         ` Junio C Hamano
  2025-09-08  6:40                                           ` Patrick Steinhardt
  1 sibling, 1 reply; 198+ messages in thread
From: Junio C Hamano @ 2025-09-05 15:37 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Elijah Newren, Patrick Steinhardt, brian m. carlson, Taylor Blau,
	rsbecker, Kristoffer Haugsbakk, Josh Soref, git,
	Christian Brabandt, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

Phillip Wood <phillip.wood123@gmail.com> writes:

>     This release introduces an optional dependency on rust that is
>     enabled by default. Platforms without a rust compiler can continue
>     to build git by passing NO_RUST=1. In six months time we plan to
>     make rust mandatory for building git. From that point git 2.x.y (the
>     last version that can be built without rust) will continue to
>     receive security updates for three years.
>
> To me the important elements are:
>
> 1) There is a short period where rust is optional. This allows
>    (i) Distributors on platforms without a rust compiler time to notify
>        their users that in the future they will only be able to offer
>        security updates.
>   (ii) Distributors on platforms with a rust compiler time to adjust
>        their build procedures to include rust.
>  (iii) The git project time to gain experience of using rust and writing
>        the necessary bindings while building with it is optional.

Good.  I am not sure "short" should be an important element, but
having a known and agreed-upon deadline helps.

> 2) Rust is enabled by default so platforms without a rust compiler are
>    made aware of the problem but have an easy way to continue to build
>    git while rust is optional.

Obviously there is nothing to disagree with here, as it is the
definition of the word "optional" ;-).

> 3) There is a period of a small number of years where we continue to
>    provide security updates for a version of git that can be built
>    without rust. This is intended to  allow a realistic time for
>    distributors on platforms without a rust compiler to port one or make
>    other arrangements for providing future security updates without
>    placing an undue burden on the project to provide security updates
>    for niche platforms indefinitely.

I am not willing to see such support last for multiple years, though.
If the first item is 6 months, this backporting to stale releases
should last for the same order of time.

If it were "3 years of optional period, 18 months of backporting
security updates", I would find it more realistic.  It would give
those platform maintainers enough time to lobby, fundraise, or
otherwise campaign to bring Rust on their system.  I personally find
that 6 months is way too short (if we are _only_ looking for an
excuse to say "we have given them ample time to react, and now it is
their problem", 6 months may be good enough, though).



* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05  6:50                                     ` Patrick Steinhardt
@ 2025-09-07  4:10                                       ` Elijah Newren
  2025-09-07 16:09                                         ` rsbecker
  2025-09-08  6:40                                         ` Patrick Steinhardt
  0 siblings, 2 replies; 198+ messages in thread
From: Elijah Newren @ 2025-09-07  4:10 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

Sorry for the delay; life outside of work is challenging at the moment...

On Thu, Sep 4, 2025 at 11:50 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Thu, Sep 04, 2025 at 08:54:19PM -0700, Elijah Newren wrote:
> > On Thu, Sep 4, 2025 at 4:40 AM Patrick Steinhardt <ps@pks.im> wrote:
> > >
> > > On Thu, Sep 04, 2025 at 12:57:25AM +0000, brian m. carlson wrote:
> > > > On 2025-09-03 at 05:40:54, Patrick Steinhardt wrote:
> > > > Also, the approach of making it an optional component directly
> > > > contradicts the proposed policy I wrote up.  That's a recipe for
> > > > additional burdensome work maintaining two implementations, when we
> > > > actually want to make it easier for people to contribute functionality.
> > > > It also doesn't provide any of the memory safety benefits or address any
> > > > of the concerns from governments, security professionals, and other
> > > > parties about the real and substantial risks of continuing to develop in
> > > > C.
> > >
> > > The only reason why we want to have it as an optional component is to
> > > make the transitioning period easier for downstream distributors. And
> > > the intent is not to convert major components -- it should be trivial
> > > components that we can use as test balloons, similar to how we did it
> > > for all of our C99 test balloons.
> > >
> > > We cannot just pull the rug away under their feet without advance notice
> > > that this is going to happen.
> >
> > I find this statement a bit problematic for four reasons:
> >
> > (1) "without advance notice" was already pointed out to be inaccurate
> > in this thread, including in the exact email you are responding to;
> > you could argue that there hasn't been _sufficient_ advance notice,
> > but then there should be more details about what is and isn't
> > sufficient.  Merely repeating this claim which brian just barely
> > pointed out to you as false almost feels dishonest.
>
> I think there is a difference between communication that happens on the
> mailing list/contributors summit and communication that is intended for
> the broader ecosystem:
>
>   - The former is basically us developers discussing potential futures
>     and reviewing patches. It would be _nice_ if distro maintainers of
>     Git were to read these, but given the large volume of traffic in
> >     general I think it unlikely that the majority of maintainers keep
> >     up with that traffic.
>
>   - The latter is in the form of e.g. our release notes as well as our
>     BreakingChanges document. These _are_ intended to be reviewed by
>     maintainers, and the blame is on them if they don't do so.
>
> We have never communicated either via release notes or via any kind of
> committed document that Rust is going to become mandatory. There have
> been lots of large threads discussing it, true. But navigating these
> threads and estimating consensus isn't easy even for us developers, so
> it's going to be even harder for outsiders to the community.

I like this framing; this is useful.

I agree that we haven't communicated that it'll be mandatory, though
we have communicated beyond the list that Rust was likely coming:
  * The contributor summit notes on Rust (posted at
https://lore.kernel.org/git/Zu2D%2Fb1ZJbTlC1ml@nand.local/) were
widely picked up at other sites (e.g.
https://lwn.net/Articles/998115/,
https://www.reddit.com/r/linux/comments/1hcsvk5/nonstop_discussion_around_adding_rust_to_git/)
  * The release notes mention initial Rust inclusion
(https://lore.kernel.org/git/xmqqfrjfilc8.fsf@gitster.g/, "Foreign
language interface for Rust into our code base has been added.")
  * The GitHub blog on highlights from 2.49.0 (widely linked at news
sites even in preference to the release notes) adds more detail: "This
release marks a major milestone in the Git project with the first
pieces of Rust code being checked in"
(https://github.blog/open-source/git/highlights-from-git-2-49/)

Now, I can fully get behind that this may be _inadequate_ notice, and
I really like the idea of a test balloon.  I'm just noting that I very
much disagree with the characterization that there has been no notice
beyond the mailing list about Rust likely coming at some point, and
want us to make sure that if we delay, we use the time to meaningfully
provide more notice than we have already.  Another optional Rust
component that doesn't build by default, for example, fails that test.

> > (2) "pull the rug away" seems hyperbolic.  I would have liked some
> > explanation as to how a transition period is expected to help, and how
> > the existing transition period has been insufficient.  You do hint a
> > little at the former, which I'll discuss more in point 4, but you
> > neglect the latter to the point of pretending it didn't exist.   In
> > short, why is a further transition period needed, and how will it
> > differ from the existing one we've already had?  It's not clear to me
> > why distributors must immediately update to the latest git version.
> > Taylor discussed this aspect in detail in this thread; you even
> > responded briefly (and tangentially?), but still as far as I can tell
> > presume the latest and greatest is mandatory for them to adopt without
> > stating why.  Maybe they do need to adopt the latest and greatest, but
> > I haven't seen folks state why that's the case.  Did I miss it?
>
> The problem here is that we don't have a story to tell yet. I agree that
> not everyone always needs the latest and greatest, which is also why I
> mentioned that I think it's fine for _new_ features to be developed in
> Rust right away.
>
> But the story is altogether different for bug and security fixes.
>
>   - We of course backport security fixes, but would that also be the
>     case if we had ported the subsystem to Rust already and now had to
>     implement the security fix twice?
>
>   - What happens if only the old C version has a security bug? Do we
>     still fix it?
>
>   - Likewise, what happens with important bug fixes? We tend to backport
>     those that are easy-ish to backport, but if people are potentially
>     stuck with an older Git version for years it will become harder for
>     us to do so.
>
> I think without us having a proper answer to these questions we _are_
> pulling the rug away. Distros may be stuck with an old version of Git
> for a significant time, and from my point of view we have to do a couple
> of compromises there.

These are good questions...but they are ones to which I suspect
delaying will not provide the answer.  In fact, I don't think we'll
_ever_ have the answer to these questions, no matter how much we delay
or discuss.  Traditionally, if an issue was more severe, it has been
backported to more versions, even if the backport wasn't trivial.
There's a cost/benefit tradeoff to be had for each vulnerability, and
changes to the area making backports either be easy or hard always
need to be weighed against the severity of the vulnerability.  I don't
see that changing, and overpromising hurts in the long run probably
more than having no guidance.  I just don't see us coming up with
"proper answers" (which I'm guessing means fully spelled out answers?)
to these questions ahead of time.  The answer to all of them is
probably "we'll weigh the severity of the issue and the cost to
backport and give the last C-only version significant extra weight in
our considerations".  I doubt we'll ever be able to promise any more
detail than that until we get concrete cases; I'm not even sure that
this statement is acceptable to everyone on the list from the
overpromising angle despite being as incomplete as it is.

> > It also feels like Rust support is being lumped in with "breaking
> > changes", which to me feels misleading.  Historically, we have talked
> > about breaking changes and deprecation periods and such so that users
> > could adjust scripts or their command lines such that they would work
> > across multiple versions of Git.  The Rust case is somewhat different
> > in that we're not discussing behavioral changes of git, merely
> > implementation differences.  If someone has both a C-only version of
> > git and a newer version of git that was built with both Rust and C,
> > any commands they run should behave the same as far as the C-vs-Rust
> > goes (unless we have our normal discussions about specific behavior
> > and any deprecations we want to do related to it, of course).
> >
> > I do agree that reduced platform support is a negative change (though
> > Rust brings other advantages that may offset this downside depending
> > on your viewpoint), but I don't see why it's a breaking change and
> > especially not a "pull the rug away under their feet" change.
>
> I honestly don't quite understand this perspective. How isn't it
> breaking that you cannot use that Git version at all anymore?

Users might often face cases where they have to use different versions
of git -- at home, at work, on different work machines, etc.  As such,
when something forces workflow changes, we have to be cognizant of
that and provide deprecation periods, release announcement notices,
etc.  That's the point of our care around breaking changes.

If _distributors_ can't build a new version of git, users can still
use older versions.  They don't have to change their workflows.  When
distributors eventually figure out how to build a newer version
(because they work around pthreads not existing on their platform, or
they add stdbool to their compiler, or they port Rust to their system
or whatever), then when the new version becomes available, users can
use it without changes to their workflow.  The _users_ weren't broken.

I still don't see why distributors _must_ ship the latest version of
Git and why folks on some platforms are considered broken if they are
using a slightly older version.  Let me ask again: has anyone answered
why this is considered mandatory?  If they have, I've missed it, but
I've asked multiple times.  Even if you want to lump "distributors
cannot build a newer version" under the umbrella of "breaking
changes", I argue it's a much different kind of break and one which
merits different timelines for handling than e.g. lumping it in with
3.0.

> > (3) the use of "cannot" presupposes the policy stance which we are
> > having a discussion about, which, whether intended or not, feels like
> > an unfair way to attempt to shut down the conversation.
>
> Sorry, that's not my intent.

Thanks, and I appreciate you patiently explaining your point of view
in more detail.

> > (4) you suggest that adding Rust as an optional component should avoid
> > the problem, yet we've already had Rust as an optional component for
> > the last three releases, going back to 2.49.0.  (libgit-rs and
> > libgit-sys).
>
> I don't really think that either libgit-rs or libgit-sys help in any
> way. These are part of "contrib/", not built by default, and neither are
> they consumed by anyone out there. So there is no reason for anyone to
> build that library to the best of my knowledge.

I'm fully willing to accept they are inadequate notice (and perhaps
even barely helpful), but disagree with the characterization that they
don't help at all:
  * they were consumed in the past by Google
  * they recently received patches from someone outside Google
(https://lore.kernel.org/git/20250826233525.2635432-1-davvid@gmail.com/)
  * they were mentioned in the release notes highlighting at a minimum
that Rust is being added to Git.
  * they were highlighted in blog posts from both GitLab and GitHub as
being noteworthy new things in the v2.49.0 release

I agree with you there is certainly more we can do, and I like your
idea of a test balloon.  Let's just avoid repeating the problem of
adding an optional component that no one will try to build except for
those who we already know can build it; doing that would provide no
more notice and thus no incremental benefit over libgit-rs and
libgit-sys.


* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-07  4:10                                       ` Elijah Newren
@ 2025-09-07 16:09                                         ` rsbecker
  2025-09-08 10:12                                           ` Phillip Wood
                                                             ` (2 more replies)
  2025-09-08  6:40                                         ` Patrick Steinhardt
  1 sibling, 3 replies; 198+ messages in thread
From: rsbecker @ 2025-09-07 16:09 UTC (permalink / raw)
  To: 'Elijah Newren', 'Patrick Steinhardt'
  Cc: 'brian m. carlson', 'Junio C Hamano',
	'Taylor Blau', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On September 7, 2025 12:10 AM, Elijah Newren wrote:
>Sorry for the delay; life outside of work is challenging at the moment...
>

I am going to address the critical point mentioned below and snip the rest for brevity.

>I still don't see why distributors _must_ ship the latest version of Git and why folks
>on some platforms are considered broken if they are using a slightly older version.
>Let me ask again: has anyone answered why this is considered mandatory?  If they
>have, I've missed it, but I've asked multiple times.  Even if you want to lump
>"distributors cannot build a newer version" under the umbrella of "breaking
>changes", I argue it's a much different kind of break and one which merits different
>timelines for handling than e.g. lumping it in with 3.0.

I do not see that distributors _must_ ship the latest version. Suppose we are on
2.51.0 and a CVE comes out that prohibits its use in an organization that does
not allow any medium-high to high CVEs. This represents hundreds of thousands
of impacted users in my community alone. How does the CVE fix get applied if
the latest cannot be built and the git team does not apply the CVE fixes to old
versions? Personally, I do not care if git versions are different between work
and home, or even between CI/CD and other platforms. I don't even care
if I have to use JGit instead of git in some situations (which I see is a likely
outcome of this discussion). Is there an official statement of what an LTS
means? In other projects, LTS is typically, and formally by policy, 5 years.
From what others have said here, positions of 6 months, 3 years, and
"apply it yourself if you want to continue to use git" have been made.

The core problem of adding a breaking dependency is when a CVE comes
out that prohibits git from being used at all. If the git team is not going
to provide a clear statement, one way or another, on how CVEs (at
whatever severity level) will be handled, with no commitment of any kind,
then distributors are essentially cast adrift and on our own. It would
be helpful if those of us who donate our time, for no compensation,
were able to plan for this in a meaningful way. Please remember that
we have to justify our participation to our management teams to be
allowed to continue to participate. Nothing is free from this end,
and if fixing (not just applying fixes) CVEs is now 100% our
responsibility, it would be critical to know that when we build our
business cases to our bosses, who I am fairly certain will say an
emphatic no.

Also remember that without support from the git team, the
code base is no longer the same, meaning the auditors will not
necessarily accept fixes from third-party sources. This particular
point enabled adoption on some platforms, particularly NonStop.
Adoption was at 1-2 customers when we had a divergent code
base, because some platform fixes were different from the
standard code base and could not be certified as valid. Once the
code base became common, adoption was rapid and enthusiastic.
If this goes away, I suspect that adoption rates will go negative.
I am aware that that particular discussion is actually happening
in some organizations in my community right now, with companies
looking for alternatives to git based on this discussion thread.

With over a decade of respect and participation,
Randall



* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-07  4:10                                       ` Elijah Newren
  2025-09-07 16:09                                         ` rsbecker
@ 2025-09-08  6:40                                         ` Patrick Steinhardt
  1 sibling, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-08  6:40 UTC (permalink / raw)
  To: Elijah Newren
  Cc: brian m. carlson, Junio C Hamano, Taylor Blau, rsbecker,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Sat, Sep 06, 2025 at 09:10:28PM -0700, Elijah Newren wrote:
> On Thu, Sep 4, 2025 at 11:50 PM Patrick Steinhardt <ps@pks.im> wrote:
> > The problem here is that we don't have a story to tell yet. I agree that
> > not everyone always needs the latest and greatest, which is also why I
> > mentioned that I think it's fine for _new_ features to be developed in
> > Rust right away.
> >
> > But the story is altogether different for bug and security fixes.
> >
> >   - We of course backport security fixes, but would that also be the
> >     case if we had ported the subsystem to Rust already and now had to
> >     implement the security fix twice?
> >
> >   - What happens if only the old C version has a security bug? Do we
> >     still fix it?
> >
> >   - Likewise, what happens with important bug fixes? We tend to backport
> >     those that are easy-ish to backport, but if people are potentially
> >     stuck with an older Git version for years it will become harder for
> >     us to do so.
> >
> > I think without us having a proper answer to these questions we _are_
> > pulling the rug away. Distros may be stuck with an old version of Git
> > for a significant time, and from my point of view we have to do a couple
> > of compromises there.
> 
> These are good questions...but they are ones to which I suspect
> delaying will not provide the answer.  In fact, I don't think we'll
> _ever_ have the answer to these questions, no matter how much we delay
> or discuss.  Traditionally, if an issue was more severe, it has been
> backported to more versions, even if the backport wasn't trivial.
> There's a cost/benefit tradeoff to be had for each vulnerability, and
> changes to the area making backports either be easy or hard always
> need to be weighed against the severity of the vulnerability.  I don't
> see that changing, and overpromising hurts in the long run probably
> more than having no guidance.  I just don't see us coming up with
> "proper answers" (which I'm guessing means fully spelled out answers?)
> to these questions ahead of time.

That's fair, I guess. I don't think we need to fully spell out the
answer to each of these questions. But I think we should have some
general alignment on how we'll handle the last non-Rust release, and
what some guarantees are that we can and want to provide.

> The answer to all of them is
> probably "we'll weigh the severity of the issue and the cost to
> backport and give the last C-only version significant extra weight in
> our considerations".  I doubt we'll ever be able to promise any more
> detail than that until we get concrete cases; I'm not even sure that
> this statement is acceptable to everyone on the list from the
> overpromising angle despite being as incomplete as it is.

True, we don't want to overburden ourselves, either. This is mostly why I
proposed the compromise of saying "We provide you with updates for the
LTS version for X amount of time. If you still depend on it after that
time, we will be happy to pass over maintainership of that branch to the
community."

[snip]
> I still don't see why distributors _must_ ship the latest version of
> Git and why folks on some platforms are considered broken if they are
> using a slightly older version.  Let me ask again: has anyone answered
> why this is considered mandatory?  If they have, I've missed it, but
> I've asked multiple times.  Even if you want to lump "distributors
> cannot build a newer version" under the umbrella of "breaking
> changes", I argue it's a much different kind of break and one which
> merits different timelines for handling than e.g. lumping it in with
> 3.0.

To me it's not necessarily about the _latest_ version, rather about
_any_ version. Some distributions will not be able to build Git at all
anymore, so they are stuck at the last non-Rust version for the time
being. And seeing that the timeline is years for them to get Rust
support they may not be on a slightly older version, but on an ancient
version eventually.

So the question to me is less whether users of that distro will miss out
on new features, which I think is acceptable. Hence my statement that it
is fine from my point of view for new features to be written in Rust
immediately.

    NB: There's some nuance here. If newer features mean that users
    cannot interact with modern upstreams anymore the picture would
    change quite significantly. But the only work that really comes to
    my mind is SHA256, which already exists. I don't know whether the
    interop code may fall into this category, I hope it doesn't.

But the bigger question is whether that old version still gets security
updates and important bug fixes. If the only available non-Rust version
is riddled with security holes then these distros won't be able to
provide it at all anymore.

> > > (4) you suggest that adding Rust as an optional component should avoid
> > > the problem, yet we've already had Rust as an optional component for
> > > the last three releases, going back to 2.49.0.  (libgit-rs and
> > > libgit-sys).
> >
> > I don't really think that either libgit-rs or libgit-sys help in any
> > way. These are part of "contrib/", not built by default, and neither are
> > they consumed by anyone out there. So there is no reason for anyone to
> > build that library to the best of my knowledge.
> 
> I'm fully willing to accept they are inadequate notice (and perhaps
> even barely helpful), but disagree with the characterization that they
> don't help at all:
>   * they were consumed in the past by Google
>   * they recently received patches from someone outside Google
> (https://lore.kernel.org/git/20250826233525.2635432-1-davvid@gmail.com/)
>   * they were mentioned in the release notes highlighting at a minimum
> that Rust is being added to Git.
>   * they were highlighted in blog posts from both GitLab and GitHub as
> being noteworthy new things in the v2.49.0 release

Okay, fair.

> I agree with you there is certainly more we can do, and I like your
> idea of a test balloon.  Let's just avoid repeating the problem of
> adding an optional component that no one will try to build except for
> those for whom we know can build it; doing that would provide no more
> notice and thus provide no incremental benefit over libgit-rs and
> libgit-sys.

Fully agreed. My current proposal includes several steps of how we
tighten the screws here, where we gradually start to require Rust by
default on more platforms. Before Git 3.0 it's still possible to opt
out, but eventually distributors need to opt out explicitly. So that
should hopefully alert them that something is cooking.

Patrick


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-05 15:37                                         ` Junio C Hamano
@ 2025-09-08  6:40                                           ` Patrick Steinhardt
  0 siblings, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-08  6:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Phillip Wood, Elijah Newren, brian m. carlson, Taylor Blau,
	rsbecker, Kristoffer Haugsbakk, Josh Soref, git,
	Christian Brabandt, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Fri, Sep 05, 2025 at 08:37:27AM -0700, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> > 3) There is a period of a small number of years where we continue to
> >    provide security updates for a version of git that can be built
> >    without rust. This is intended to  allow a realistic time for
> >    distributors on platforms without a rust compiler to port one or make
> >    other arrangements for providing future security updates without
> >    placing an undue burden on the project to provide security updates
> >    for niche platforms indefinitely.
> 
> I am not willing to see such a support for multiple years, though.
> If the first item is 6 months, this backporting stale releases
> should be on the same order of timeperiod.
> 
> If it were "3 years of optional period, 18 months of backporting
> security updates", I would find it more realistic.  It would give
> those platform maintainers enough time to lobby, fundraise, or
> otherwise campaign to bring Rust on their system.  I personally find
> that 6 months is way too short (if we are _only_ looking for an
> excuse to say "we have given them ample time to react, and now it is
> their problem", 6 months may be good enough, though).

Yeah, I also think that six months is a bit short, but three years on
the other hand feels like it will cause quite some pain on our side. My
plan is shooting for roughly one year of optional support, which is
still way shorter than the three years you mention.

How would you feel about:

  - Pinning a date for Git 3.0 at the end of next year and tying
    mandatory Rust to it.

  - Guaranteeing at least one year of security backports for 2.99 (or
    whatever the last release before Git 3.0 is).

  - Explicitly stating that if anybody needs to maintain that version
    afterwards, we are happy to let the community maintain that branch
    going forward.

That means that we stop supporting 2.99 in a bit more than two years
from now, but we don't fully close the door on it if people still rely
on it.

Patrick


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-07 16:09                                         ` rsbecker
@ 2025-09-08 10:12                                           ` Phillip Wood
  2025-09-08 15:32                                             ` rsbecker
  2025-09-08 15:10                                           ` Ezekiel Newren
  2025-09-08 15:31                                           ` Elijah Newren
  2 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-09-08 10:12 UTC (permalink / raw)
  To: rsbecker, 'Elijah Newren', 'Patrick Steinhardt'
  Cc: 'brian m. carlson', 'Junio C Hamano',
	'Taylor Blau', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

Hi Randall

On 07/09/2025 17:09, rsbecker@nexbridge.com wrote:
> On September 7, 2025 12:10 AM, Elijah Newren wrote:
>> Sorry for the delay; life outside of work is challenging at the moment...
>>
> 
> I am going to address the critical point mentioned below and snip the rest for brevity.
> 
>> I still don't see why distributors _must_ ship the latest version of Git and why folks
>> on some platforms are considered broken if they are using a slightly older version.
>> Let me ask again: has anyone answered why this is considered mandatory?  If they
>> have, I've missed it, but I've asked multiple times.  Even if you want to lump
>> "distributors cannot build a newer version" under the umbrella of "breaking
>> changes", I argue it's a much different kind of break and one which merits different
>> timelines for handling than e.g. lumping it in with 3.0.
> 
> I do not see that distributors _must_ ship the latest version. Suppose we are on
> 2.51.0 and a CVE comes out that prohibits its use in an organization that does
> not allow any medium-high to high CVEs. This represents hundreds of thousands
> of impacted users in my community alone. How does the CVE get applied if the
> latest cannot be built and the git team does not apply the CVE fixes to old
> versions. Personally, I do not care if git versions are different between work
> and home, or even between CI/CD and other platforms. I don't even care
> if I have to use JGit instead of git in some situations (which I see is a likely
> outcome of this discussion). Is there an official statement of what an LTS
> means?

We're currently discussing what promises we can make about supporting a 
non-rust version of git.

> In other projects LTS is typically, and formally by policy 5 years.

I know commercial linux distributions offer that kind of support but are 
there really open source projects that guarantee 5 years of security 
updates without any kind of support contract?

>  From what others have said here, positions of 6 months, 3 years, and
> "apply it yourself if you want to continue to use git" have been made.

Yes it is still being discussed, and no one is volunteering to offer 
five years of support.

> The core problem of adding a breaking dependency is when a CVE comes
> out that prohibits git from being used at all. If the git team is not going
> to provide a clear statement, one way or another, if how CVEs (at
> whatever severity level) will not have a commitment of any kind,
> then distributors are essentially cast adrift and on our own. It would
> be helpful of those of us who donate our time, for no compensation,
> are able to plan for this in a meaningful way.

Doesn't your company make a front end to git? Are you saying that the 
management does not allocate any staff time to work on git itself and 
expects the community to provide it with free security updates?

> Please remember that
> we have to justify our participation to our management teams to be
> allowed to continue to participate. 
I'm confused by this, as the sentence before says you're donating your 
time for no compensation.

> Nothing is free from this end
> and if fixing (not just applying fixes) CVEs are now 100% our
> responsibility, if would be critical to know that when we build our
> business cases to our bosses, who I am fairly certain will say an
> emphatic no.

In the long term, unless your platform gains a rust compiler, I'm afraid 
I think that is the most likely outcome.

> Also remember that without support from the git team, the
> code base is no longer the same, meaning the auditors will not
> necessarily accept fixes from third-party sources.

I think I saw a suggestion/question about the possibility of hosting any 
long term support branch that is maintained by interested parties within 
the main repository. Would that help?

I appreciate that any move to rust would be very disappointing and 
disruptive to you but the community has to weigh up the benefits rust 
has to offer against that.

Phillip

> This particular
> point enabled adoption on some platforms, particularly NonStop.
> Adoption was at 1-2 customers when we had a divergent code
> based because some platform fixes being different from the
> standard code-base and could not be certified as valid. Once the
> code-base because common, adoption was rapid and enthusiastic.
> If this goes away, I suspect that adoption rates will go negative.
> I am aware that that particular discussion is actually happening
> in some organizations in my community right now, with companies
> looking for alternatives to git based on this discussion thread.
> 
> With over a decade of respect and participation,
> Randall
> 



* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-07 16:09                                         ` rsbecker
  2025-09-08 10:12                                           ` Phillip Wood
@ 2025-09-08 15:10                                           ` Ezekiel Newren
  2025-09-08 15:41                                             ` rsbecker
  2025-09-08 15:31                                           ` Elijah Newren
  2 siblings, 1 reply; 198+ messages in thread
From: Ezekiel Newren @ 2025-09-08 15:10 UTC (permalink / raw)
  To: rsbecker
  Cc: Elijah Newren, Patrick Steinhardt, brian m. carlson,
	Junio C Hamano, Taylor Blau, Kristoffer Haugsbakk, Josh Soref,
	git, Christian Brabandt, Phillip Wood, Eli Schwartz,
	Haelwenn (lanodan) Monnier, Johannes Schindelin,
	Matthias Aßhauer, Sam James, Collin Funk, Mike Hommey,
	Pierre-Emmanuel Patry, D. Ben Knoble, Ramsay Jones, Josh Steadmon,
	Calvin Wan

On Sun, Sep 7, 2025 at 10:10 AM <rsbecker@nexbridge.com> wrote:
>
> On September 7, 2025 12:10 AM, Elijah Newren wrote:
> >Sorry for the delay; life outside of work is challenging at the moment...
> >
>
> I am going to address the critical point mentioned below and snip the rest for brevity.
>
> >I still don't see why distributors _must_ ship the latest version of Git and why folks
> >on some platforms are considered broken if they are using a slightly older version.
> >Let me ask again: has anyone answered why this is considered mandatory?  If they
> >have, I've missed it, but I've asked multiple times.  Even if you want to lump
> >"distributors cannot build a newer version" under the umbrella of "breaking
> >changes", I argue it's a much different kind of break and one which merits different
> >timelines for handling than e.g. lumping it in with 3.0.
>
> I do not see that distributors _must_ ship the latest version. Suppose we are on
> 2.51.0 and a CVE comes out that prohibits its use in an organization that does
> not allow any medium-high to high CVEs. This represents hundreds of thousands
> of impacted users in my community alone. How does the CVE get applied if the
> latest cannot be built and the git team does not apply the CVE fixes to old
> versions. Personally, I do not care if git versions are different between work
> and home, or even between CI/CD and other platforms. I don't even care
> ...

Ok, that answers the question for NonStop, but that doesn't answer the
question for the plethora of other distributions. Most distributions
don't ship the latest version of Git in their package manager, and if
an organization deems it critical to have the latest they can build it
themselves and ignore the Git version in the package manager. So why
do Windows, Mac, Linux, etc... _need_ the latest version of Git in
the package manager?

If security updates are backported to NonStop, until that platform
supports Rust, then I don't see why using an older version of Git in
Windows, Mac, Linux, etc... is a catastrophe. Most existing
distributions _can_ package the latest version of Git, but they
_don't_.

I reiterate Elijah's question "Why _must_ distributors ship the latest
version of Git?".


* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-07 16:09                                         ` rsbecker
  2025-09-08 10:12                                           ` Phillip Wood
  2025-09-08 15:10                                           ` Ezekiel Newren
@ 2025-09-08 15:31                                           ` Elijah Newren
  2025-09-08 15:36                                             ` rsbecker
  2 siblings, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-09-08 15:31 UTC (permalink / raw)
  To: rsbecker
  Cc: Patrick Steinhardt, brian m. carlson, Junio C Hamano, Taylor Blau,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Sun, Sep 7, 2025 at 9:10 AM <rsbecker@nexbridge.com> wrote:
>
> On September 7, 2025 12:10 AM, Elijah Newren wrote:
> >Sorry for the delay; life outside of work is challenging at the moment...
> >
>
> I am going to address the critical point mentioned below and snip the rest for brevity.
>
> >I still don't see why distributors _must_ ship the latest version of Git and why folks
> >on some platforms are considered broken if they are using a slightly older version.
> >Let me ask again: has anyone answered why this is considered mandatory?  If they
> >have, I've missed it, but I've asked multiple times.  Even if you want to lump
> >"distributors cannot build a newer version" under the umbrella of "breaking
> >changes", I argue it's a much different kind of break and one which merits different
> >timelines for handling than e.g. lumping it in with 3.0.
>
> I do not see that distributors _must_ ship the latest version. Suppose we are on
> 2.51.0 and a CVE comes out that prohibits its use in an organization that does
> not allow any medium-high to high CVEs. This represents hundreds of thousands
> of impacted users in my community alone. How does the CVE get applied if the
> latest cannot be built and the git team does not apply the CVE fixes to old
> versions. Personally, I do not care if git versions are different between work
> and home, or even between CI/CD and other platforms. I don't even care
> if I have to use JGit instead of git in some situations (which I see is a likely
> outcome of this discussion). Is there an official statement of what an LTS
> means? In other projects LTS is typically, and formally by policy 5 years.
> From what others have said here, positions of 6 months, 3 years, and
> "apply it yourself if you want to continue to use git" have been made.
>
> The core problem of adding a breaking dependency is when a CVE comes
> out that prohibits git from being used at all. If the git team is not going
> to provide a clear statement, one way or another, if how CVEs (at
> whatever severity level) will not have a commitment of any kind,
> then distributors are essentially cast adrift and on our own. It would
> be helpful of those of us who donate our time, for no compensation,
> are able to plan for this in a meaningful way. Please remember that
> we have to justify our participation to our management teams to be
> allowed to continue to participate. Nothing is free from this end
> and if fixing (not just applying fixes) CVEs are now 100% our
> responsibility, if would be critical to know that when we build our
> business cases to our bosses, who I am fairly certain will say an
> emphatic no.
>
> Also remember that without support from the git team, the
> code base is no longer the same, meaning the auditors will not
> necessarily accept fixes from third-party sources. This particular
> point enabled adoption on some platforms, particularly NonStop.
> Adoption was at 1-2 customers when we had a divergent code
> based because some platform fixes being different from the
> standard code-base and could not be certified as valid. Once the
> code-base because common, adoption was rapid and enthusiastic.
> If this goes away, I suspect that adoption rates will go negative.
> I am aware that that particular discussion is actually happening
> in some organizations in my community right now, with companies
> looking for alternatives to git based on this discussion thread.
>
> With over a decade of respect and participation,
> Randall

Thanks, Randall, this is useful information.  With regard to one point
not fully covered by Phillip:

> Also remember that without support from the git team, the
> code base is no longer the same, meaning the auditors will not
> necessarily accept fixes from third-party sources.

Why does it need to be "third-party" sources?  Linus years ago blessed
having someone else be in charge of providing updates for stable
releases of Linux.  Junio could do the same with Git and similarly
mark an individual or group of people as the maintainers for the last
Rust-optional version of Git, and those individuals could make
official releases of Git with extended security fix support.  Then
it's not every platform repeating the backporting work that needs to
be done, but rather individuals from the affected platform(s)
collaborating on that work and then making official first-party
releases.


* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-08 10:12                                           ` Phillip Wood
@ 2025-09-08 15:32                                             ` rsbecker
  0 siblings, 0 replies; 198+ messages in thread
From: rsbecker @ 2025-09-08 15:32 UTC (permalink / raw)
  To: 'Phillip Wood', 'Elijah Newren',
	'Patrick Steinhardt'
  Cc: 'brian m. carlson', 'Junio C Hamano',
	'Taylor Blau', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On September 8, 2025 6:12 AM, Phillip Wood wrote:
>On 07/09/2025 17:09, rsbecker@nexbridge.com wrote:
>> On September 7, 2025 12:10 AM, Elijah Newren wrote:
>>> Sorry for the delay; life outside of work is challenging at the moment...
>>>
>>
>> I am going to address the critical point mentioned below and snip the rest for
>brevity.
>>
>>> I still don't see why distributors _must_ ship the latest version of
>>> Git and why folks on some platforms are considered broken if they are using a
>slightly older version.
>>> Let me ask again: has anyone answered why this is considered
>>> mandatory?  If they have, I've missed it, but I've asked multiple
>>> times.  Even if you want to lump "distributors cannot build a newer
>>> version" under the umbrella of "breaking changes", I argue it's a
>>> much different kind of break and one which merits different timelines for
>handling than e.g. lumping it in with 3.0.
>>
>> I do not see that distributors _must_ ship the latest version. Suppose
>> we are on
>> 2.51.0 and a CVE comes out that prohibits its use in an organization
>> that does not allow any medium-high to high CVEs. This represents
>> hundreds of thousands of impacted users in my community alone. How
>> does the CVE get applied if the latest cannot be built and the git
>> team does not apply the CVE fixes to old versions. Personally, I do
>> not care if git versions are different between work and home, or even
>> between CI/CD and other platforms. I don't even care if I have to use
>> JGit instead of git in some situations (which I see is a likely
>> outcome of this discussion). Is there an official statement of what an LTS means?
>
>We're currently discussing what promises we can make about supporting a non-rust
>version of git.
>
>> In other projects LTS is typically, and formally by policy 5 years.
>
>I know commercial linux distributions offer that kind of support but are there really
>open source projects that guarantee 5 years of security updates without any kind
>of support contract?

OpenSSL provides 5 years of security fix support (at no cost) for LTS designated
releases. Currently 3.0 ending Sept 2026 and 3.5 ending around October 2030.
After those dates, there is a fee-based support arrangement available.

>>  From what others have said here, positions of 6 months, 3 years, and
>> "apply it yourself if you want to continue to use git" have been made.
>
>Yes it is still being discussed, and no one is volunteering to offer five years of
>support.
>
>> The core problem of adding a breaking dependency is when a CVE comes
>> out that prohibits git from being used at all. If the git team is not
>> going to provide a clear statement, one way or another, if how CVEs
>> (at whatever severity level) will not have a commitment of any kind,
>> then distributors are essentially cast adrift and on our own. It would
>> be helpful of those of us who donate our time, for no compensation,
>> are able to plan for this in a meaningful way.
>
>Doesn't your company make a front end to git? Are you saying that the
>management does not allocate any staff time to work on git itself and expects the
>community to provide it with free security updates?

The REAL PROBLEM that is not being addressed in this thread is that large
companies (the ones who process your credit cards, build your cars,
manufacture your drugs, and tool your factories, a.k.a. NonStop customers)
are generally unwilling to accept CVE fixes from third parties. The fixes have
to be part of the official code base or the fixes will not be accepted. That means
that either:

1. the git team has to officially sanction the fixes; or

2. do the fixes themselves.

A compromise may be possible to keep a support branch around in the official
git repo, for those of us who do not have rust available to contribute to,
specifically for post C-deprecation CVE fixes, but I am not sure this is practical.
It would also require occasional assistance from the git team, to make sense of
some of the fixes, if they apply, as none of us are rust experts.

I am already allocated to spending between 10 and 20 hours a month to git,
which usually involves running and verifying build/test cycles. Since git tests
are flaky in some cases, I have to manually examine each failure
situation and decide whether the failures are sufficient to pass the releases.
These have been reported previously without resolution and do not bear
discussion here.

It is important to understand that many git customers in high audit situations
build git on their own because they do not trust third party builds, so this
needs to remain an option.

>> Please remember that
>> we have to justify our participation to our management teams to be
>> allowed to continue to participate.
>I'm confused by this, as the sentence before says you're donating your time for no
>compensation.

My company pretends to donate my time with very little direct benefit to
them. My participation is because I feel it is important for my community.
No, I do not get a salary for my git time. It is evenings and weekends. Any time
I spend during working hours has to be made up during off hours.

>
>> Nothing is free from this end
>> and if fixing (not just applying fixes) CVEs are now 100% our
>> responsibility, it would be critical to know that when we build our
>> business cases to our bosses, who I am fairly certain will say an
>> emphatic no.
>
>In the long term, unless your platform gains a rust compiler I'm afraid I think that is
>the most likely outcome.
>
>> Also remember that without support from the git team, the code base is
>> no longer the same, meaning the auditors will not necessarily accept
>> fixes from third-party sources.
>
>I think I saw a suggestion/question about the possibility of hosting any long term
>support branch that is maintained by interested parties within the main repository.
>Would that help?

DEFINITELY. With assistance as above. With some help, we may be able to make
this work. It might require a deeper participation on my part and those on my
team to approve changes, which we would consider.

>I appreciate that any move to rust would be very disappointing and disruptive to
>you but the community has to weigh up the benefits rust has to offer against that.

The community has plans for Rust but they have not taken shape fully as of yet. I
have personally been badgering product management to make this happen, and I
might know more in a month or so. However, this takes time, and 6 months is not
enough.

--Randall


^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-08 15:31                                           ` Elijah Newren
@ 2025-09-08 15:36                                             ` rsbecker
  2025-09-08 16:13                                               ` Elijah Newren
  0 siblings, 1 reply; 198+ messages in thread
From: rsbecker @ 2025-09-08 15:36 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Patrick Steinhardt', 'brian m. carlson',
	'Junio C Hamano', 'Taylor Blau',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On September 8, 2025 11:31 AM, Elijah Newren wrote:
>On Sun, Sep 7, 2025 at 9:10 AM <rsbecker@nexbridge.com> wrote:
>>
>> On September 7, 2025 12:10 AM, Elijah Newren wrote:
>> >Sorry for the delay; life outside of work is challenging at the moment...
>> >
>>
>> I am going to address the critical point mentioned below and snip the rest for
>brevity.
>>
>> >I still don't see why distributors _must_ ship the latest version of
>> >Git and why folks on some platforms are considered broken if they are using a
>slightly older version.
>> >Let me ask again: has anyone answered why this is considered
>> >mandatory?  If they have, I've missed it, but I've asked multiple
>> >times.  Even if you want to lump "distributors cannot build a newer
>> >version" under the umbrella of "breaking changes", I argue it's a
>> >much different kind of break and one which merits different timelines for
>handling than e.g. lumping it in with 3.0.
>>
>> I do not see that distributors _must_ ship the latest version. Suppose
>> we are on
>> 2.51.0 and a CVE comes out that prohibits its use in an organization
>> that does not allow any medium-high to high CVEs. This represents
>> hundreds of thousands of impacted users in my community alone. How
>> does the CVE get applied if the latest cannot be built and the git
>> team does not apply the CVE fixes to old versions. Personally, I do
>> not care if git versions are different between work and home, or even
>> between CI/CD and other platforms. I don't even care if I have to use
>> JGit instead of git in some situations (which I see is a likely
>> outcome of this discussion). Is there an official statement of what an LTS means?
>In other projects LTS is typically, and formally by policy 5 years.
>> From what others have said here, positions of 6 months, 3 years, and
>> "apply it yourself if you want to continue to use git" have been made.
>>
>> The core problem of adding a breaking dependency is when a CVE comes
>> out that prohibits git from being used at all. If the git team is not
>> going to provide a clear statement, one way or another, if how CVEs
>> (at whatever severity level) will not have a commitment of any kind,
>> then distributors are essentially cast adrift and on our own. It would
>> be helpful if those of us who donate our time, for no compensation,
>> are able to plan for this in a meaningful way. Please remember that we
>> have to justify our participation to our management teams to be
>> allowed to continue to participate. Nothing is free from this end and
>> if fixing (not just applying fixes) CVEs are now 100% our
>> responsibility, it would be critical to know that when we build our
>> business cases to our bosses, who I am fairly certain will say an
>> emphatic no.
>>
>> Also remember that without support from the git team, the code base is
>> no longer the same, meaning the auditors will not necessarily accept
>> fixes from third-party sources. This particular point enabled adoption
>> on some platforms, particularly NonStop.
>> Adoption was at 1-2 customers when we had a divergent code base
>> because some platform fixes were different from the standard
>> code-base and could not be certified as valid. Once the code-base
>> became common, adoption was rapid and enthusiastic.
>> If this goes away, I suspect that adoption rates will go negative.
>> I am aware that that particular discussion is actually happening in
>> some organizations in my community right now, with companies looking
>> for alternatives to git based on this discussion thread.
>>
>> With over a decade of respect and participation, Randall
>
>Thanks, Randall, this is useful information.  In regards to one point not fully covered
>by Phillip:
>
>> Also remember that without support from the git team, the code base is
>> no longer the same, meaning the auditors will not necessarily accept
>> fixes from third-party sources.
>
>Why does it need to be "third-party" sources?  Linus years ago blessed having
>someone else be in charge of providing updates for stable releases of Linux.  Junio
>could do the same with Git and similarly mark an individual or group of people as
>the maintainers for the last Rust-optional version of Git, and those individuals could
>make official releases of Git with extended security fix support.  Then it's not every
>platform repeating the backporting work that needs to be done, but rather
>individuals from the affected platform(s) collaborating on that work and then
>making official first-party releases.

Linux has one set of rules, and other platforms have others. I do not define the
audit requirements for PCI, SWIFT, or HIPAA compliance (and other rules outside
of North America), which apply one way or another to most of my community.
The audit teams, which are both internal to the companies and at
governmental regulatory levels, do this. It is 100% out of my control but is a
reality. Fixes to any code involved in managing financial and health instruments
must be done by authorized and recognized sources. I am not one of them.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-08 15:10                                           ` Ezekiel Newren
@ 2025-09-08 15:41                                             ` rsbecker
  0 siblings, 0 replies; 198+ messages in thread
From: rsbecker @ 2025-09-08 15:41 UTC (permalink / raw)
  To: 'Ezekiel Newren'
  Cc: 'Elijah Newren', 'Patrick Steinhardt',
	'brian m. carlson', 'Junio C Hamano',
	'Taylor Blau', 'Kristoffer Haugsbakk',
	'Josh Soref', git, 'Christian Brabandt',
	'Phillip Wood', 'Eli Schwartz',
	'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Josh Steadmon',
	'Calvin Wan'

On September 8, 2025 11:10 AM, Ezekiel Newren wrote:
>On Sun, Sep 7, 2025 at 10:10 AM <rsbecker@nexbridge.com> wrote:
>>
>> On September 7, 2025 12:10 AM, Elijah Newren wrote:
>> >Sorry for the delay; life outside of work is challenging at the moment...
>> >
>>
>> I am going to address the critical point mentioned below and snip the rest for
>brevity.
>>
>> >I still don't see why distributors _must_ ship the latest version of
>> >Git and why folks on some platforms are considered broken if they are using a
>slightly older version.
>> >Let me ask again: has anyone answered why this is considered
>> >mandatory?  If they have, I've missed it, but I've asked multiple
>> >times.  Even if you want to lump "distributors cannot build a newer
>> >version" under the umbrella of "breaking changes", I argue it's a
>> >much different kind of break and one which merits different timelines for
>handling than e.g. lumping it in with 3.0.
>>
>> I do not see that distributors _must_ ship the latest version. Suppose
>> we are on
>> 2.51.0 and a CVE comes out that prohibits its use in an organization
>> that does not allow any medium-high to high CVEs. This represents
>> hundreds of thousands of impacted users in my community alone. How
>> does the CVE get applied if the latest cannot be built and the git
>> team does not apply the CVE fixes to old versions. Personally, I do
>> not care if git versions are different between work and home, or even
>> between CI/CD and other platforms. I don't even care ...
>
>Ok, that answers the question for NonStop, but that doesn't answer the question
>for the plethora of other distributions. Most distributions don't ship the latest
>version of Git in their package manager, and if an organization deems it critical to
>have the latest they can build it themselves and ignore the Git version in the
>package manager. So why does Windows, Mac, Linux, etc... _need_ the latest
>version of Git in the package manager?
>
>If security updates are backported to NonStop, until that platform supports Rust,
>then I don't see why using an older version of Git in Windows, Mac, Linux, etc... is a
>catastrophe. Most existing distributions _can_ package the latest version of Git, but
>they _don't_.
>
>I reiterate Elijah's question "Why _must_ distributors ship the latest version of
>Git?".

My emphatic answer is that they do not. There is no requirement from me or anyone
I know to ship the latest version. What is crucial is that there be fixes for medium-high
and above CVEs that are delivered within 30 days of initial fix availability (that would be
in Rust, for this conversation, and applied to C). If that were supported, I could live with
it, as would my customers and their auditors, for LTS releases. Please see my expectation
of LTS defined elsewhere in this thread - essentially 5 years. Perhaps 3 at a bare minimum.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-08 15:36                                             ` rsbecker
@ 2025-09-08 16:13                                               ` Elijah Newren
  2025-09-08 17:01                                                 ` rsbecker
  0 siblings, 1 reply; 198+ messages in thread
From: Elijah Newren @ 2025-09-08 16:13 UTC (permalink / raw)
  To: rsbecker
  Cc: Patrick Steinhardt, brian m. carlson, Junio C Hamano, Taylor Blau,
	Kristoffer Haugsbakk, Josh Soref, git, Christian Brabandt,
	Phillip Wood, Eli Schwartz, Haelwenn (lanodan) Monnier,
	Johannes Schindelin, Matthias Aßhauer, Sam James,
	Collin Funk, Mike Hommey, Pierre-Emmanuel Patry, D. Ben Knoble,
	Ramsay Jones, Ezekiel Newren, Josh Steadmon, Calvin Wan

On Mon, Sep 8, 2025 at 8:37 AM <rsbecker@nexbridge.com> wrote:
>
> On September 8, 2025 11:31 AM, Elijah Newren wrote:
> >On Sun, Sep 7, 2025 at 9:10 AM <rsbecker@nexbridge.com> wrote:
> >>
> >> On September 7, 2025 12:10 AM, Elijah Newren wrote:

> >Thanks, Randall, this is useful information.  In regards to one point not fully covered
> >by Phillip:
> >
> >> Also remember that without support from the git team, the code base is
> >> no longer the same, meaning the auditors will not necessarily accept
> >> fixes from third-party sources.
> >
> >Why does it need to be "third-party" sources?  Linus years ago blessed having
> >someone else be in charge of providing updates for stable releases of Linux.  Junio
> >could do the same with Git and similarly mark an individual or group of people as
> >the maintainers for the last Rust-optional version of Git, and those individuals could
> >make official releases of Git with extended security fix support.  Then it's not every
> >platform repeating the backporting work that needs to be done, but rather
> >individuals from the affected platform(s) collaborating on that work and then
> >making official first-party releases.
>
> Linux has one set of rules, and other platforms have others. I do not define the
> audit requirements for PCI, SWIFT, or HIPAA compliance (and other rules outside
> of North America), which apply one way or another to most of my community.
> The audit teams, which are both internal to the companies and at
> governmental regulatory levels, do this. It is 100% out of my control but is a
> reality. Fixes to any code involved in managing financial and health instruments
> must be done by authorized and recognized sources. I am not one of them.

Perhaps I wasn't clear?  Let me try to summarize what I've understood
of the conversation:

Randall: We need to have official git releases for the last
Rust-optional release.
Elijah: Great!  Let's enable interested folks to make official git
releases for the last Rust-optional release.
Randall: We need to have official git releases for the last
Rust-optional release.

Which makes me just want to repeat what I said last time -- let's
enable some folks to do that.

^ permalink raw reply	[flat|nested] 198+ messages in thread

* RE: [PATCH v3 02/15] xdiff: introduce rust
  2025-09-08 16:13                                               ` Elijah Newren
@ 2025-09-08 17:01                                                 ` rsbecker
  0 siblings, 0 replies; 198+ messages in thread
From: rsbecker @ 2025-09-08 17:01 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Patrick Steinhardt', 'brian m. carlson',
	'Junio C Hamano', 'Taylor Blau',
	'Kristoffer Haugsbakk', 'Josh Soref', git,
	'Christian Brabandt', 'Phillip Wood',
	'Eli Schwartz', 'Haelwenn (lanodan) Monnier',
	'Johannes Schindelin', 'Matthias Aßhauer',
	'Sam James', 'Collin Funk', 'Mike Hommey',
	'Pierre-Emmanuel Patry', 'D. Ben Knoble',
	'Ramsay Jones', 'Ezekiel Newren',
	'Josh Steadmon', 'Calvin Wan'

On September 8, 2025 12:13 PM, Elijah Newren wrote:
>On Mon, Sep 8, 2025 at 8:37 AM <rsbecker@nexbridge.com> wrote:
>>
>> On September 8, 2025 11:31 AM, Elijah Newren wrote:
>> >On Sun, Sep 7, 2025 at 9:10 AM <rsbecker@nexbridge.com> wrote:
>> >>
>> >> On September 7, 2025 12:10 AM, Elijah Newren wrote:
>
>> >Thanks, Randall, this is useful information.  In regards to one point not fully
>covered
>> >by Phillip:
>> >
>> >> Also remember that without support from the git team, the code base is
>> >> no longer the same, meaning the auditors will not necessarily accept
>> >> fixes from third-party sources.
>> >
>> >Why does it need to be "third-party" sources?  Linus years ago blessed having
>> >someone else be in charge of providing updates for stable releases of Linux.
>Junio
>> >could do the same with Git and similarly mark an individual or group of people as
>> >the maintainers for the last Rust-optional version of Git, and those individuals
>could
>> >make official releases of Git with extended security fix support.  Then it's not
>every
>> >platform repeating the backporting work that needs to be done, but rather
>> >individuals from the affected platform(s) collaborating on that work and then
>> >making official first-party releases.
>>
>> Linux has one set of rules, and other platforms have others. I do not define the
>> audit requirements for PCI, SWIFT, or HIPAA compliance (and other rules outside
>> of North America), which apply one way or another to most of my community.
>> The audit teams, which are both internal to the companies and at
>> governmental regulatory levels, do this. It is 100% out of my control but is a
>> reality. Fixes to any code involved in managing financial and health instruments
>> must be done by authorized and recognized sources. I am not one of them.
>
>Perhaps I wasn't clear?  Let me try to summarize what I've understood
>of the conversation:
>
>Randall: We need to have official git releases for the last
>Rust-optional release.
>Elijah: Great!  Let's enable interested folks to make official git
>releases for the last Rust-optional release.
>Randall: We need to have official git releases for the last
>Rust-optional release.
>
>Which makes me just want to repeat what I said last time -- let's
>enable some folks to do that.

Ok, what does that look like? When and how? Will I get added to the list of
committers for this? How many people? Starting when, ending when? It would
be very nice to have some kind of project plan for this with dependencies
and milestones - I have been asking. If someone sends me a list, I can start
officially tracking it.


^ permalink raw reply	[flat|nested] 198+ messages in thread

* gitoxide-compatible licensing of Git's Rust code, was Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-07-20 10:14     ` Phillip Wood
@ 2025-09-23  9:57       ` Johannes Schindelin
  2025-09-23 17:48         ` Jeff King
  0 siblings, 1 reply; 198+ messages in thread
From: Johannes Schindelin @ 2025-09-23  9:57 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, brian m. carlson

Hi Phillip,

On Sun, 20 Jul 2025, Phillip Wood wrote:

> Hi Johannes
> 
> On 19/07/2025 22:53, Johannes Schindelin wrote:
> > Hi Ezekiel,
> > 
> > On Thu, 17 Jul 2025, Ezekiel Newren via GitGitGadget wrote:
> > 
> > > diff --git a/rust/xdiff/src/lib.rs b/rust/xdiff/src/lib.rs
> > > index e69de29bb2d1..96975975a1ba 100644
> > > --- a/rust/xdiff/src/lib.rs
> > > +++ b/rust/xdiff/src/lib.rs
> > > @@ -0,0 +1,7 @@
> > > +
> > > +
> > > +#[no_mangle]
> > > +unsafe extern "C" fn xxh3_64(ptr: *const u8, size: usize) -> u64 {
> > > +    let slice = std::slice::from_raw_parts(ptr, size);
> > > +    xxhash_rust::xxh3::xxh3_64(slice)
> > > +}
> > 
> > I know that this is a pretty small file, but I do notice that it does not
> > have a license header.
> > 
> > This reminds me of the unfortunate oversight to be careful about making
> > (and keeping) libgit.a's source files compatible with libgit2's license to
> > nurture a fruitful exchange between those two projects.
> 
> I'm not sure I follow your reasoning here. libgit2 was started after git and
> chose to use an incompatible license. I wasn't around at the time but isn't
> there a list of git contributors who are happy to re-license their
> contributions with the linking exception used by libgit2?

Let me provide some historical context that might clarify the licensing
concern.

When libgit2 was created, the goal was to demonstrate that Git
functionality could be packaged as a reusable library, inviting innovation
via 3rd-party products. The project took existing Git code (with
permission) and wrapped it with a proper API. The hope was that this would
eventually become the foundation for Git itself.

What we learned from that experience is instructive: license
incompatibility became an insurmountable barrier to code sharing between
the projects.

Lack of functionality prevented commercial products using libgit2 from
providing powerful user interfaces that outshine Git's own user experience,
unless they accepted the limited functionality.

Even when volunteers wanted to port new Git features to libgit2, the
licensing prevented it.

This fragmentation weakened both projects - libgit2 couldn't benefit from
Git's innovations, and Git couldn't leverage libgit2's API improvements
nor corporate contributions that would have been more likely to target
libgit2 than Git.

You can see this in full action: merge ORT, partial clone, sparse index,
etc. All of those features are missing from libgit2, with little hope to
end up otherwise.

Innovations such as geographically-distributed, redundant data stores, or
Xet-like big-file storage to replace e.g. Git LFS with a fully native
solution, haven't happened, despite libgit2's architecture offering the
extensibility and proper delineation to make such improvements cleaner
and much more straight-forward than Git's own source code would allow.

> > With Rust, we still have a really good chance to learn from history and
> > avoid that mistake: Gitoxide is a very exciting project with clear overlap
> > in its mission to implement Git functionality in Rust. Gitoxide is
> > dual-licensed under the Apache License v2 and the MIT license (see
> > https://github.com/GitoxideLabs/gitoxide?tab=readme-ov-file#license).
> > 
> > Would you mind adding a license header to that file that explicitly allows
> > the contents of the file to be used in Gitoxide, to get the Rust effort
> > started on a good foot?
> 
> I'm wary of that for two reasons. Firstly over time it is de-facto re-licensing
> git as the amount of rust code grows and the amount of C code shrinks which
> deserves a wider discussion. Secondly it makes it harder to convert our C code
> which is licensed under GPL2 (or in the case of xdiff LGPL) to rust if the
> rust code uses a different license.

The industry adopted libgit2 widely precisely because it provided what Git
didn't: a clean API for building tools. But the licensing barrier meant
that innovation had to happen in isolation.

Due to the feature disparity, we saw "libgit2 evacuation" efforts,
starting with Visual Studio, later GitHub and GitLab followed, where work
was duplicated by moving away from libgit2 towards a distinctly non-API
way to invoke Git functionality: by spawning full-blown `git` processes
and communicating by parsing `stdout`, risking regressions due to typo
fixes such as the infamous "up-to-date -> up to date" patches. Such a lot
of extra work, away from proper API calls, just because of that
fragmentation!

With Rust and Gitoxide, we have a rare opportunity to avoid this
fragmentation from the start. Gitoxide's permissive dual licensing means
code can flow both ways. This isn't about "slipping in" a license change -
it's about learning from what happened before.

By the way, you made it sound as if I asked to re-license existing code,
which is not the case. I specifically asked for new code to be licensed in
a way that avoids straight-up preventing collaboration with the Gitoxide
project from the get-go.

It would not even take more than something as simple as GPLv2+exception.
We do have prior art for that: The Git project itself suggests in its very
own `COPYING` file to use the following license in new files:

        This file is licensed under the GPL v2, or a later version
        at the discretion of Linus.

Note the exception? For new Rust code (and of course excluding code that
has been ported verbatim from GPLv2-licensed code), GPL v2 could be used
with an exception along these lines: This file is licensed under the GPL
v2, with the exception that it can be freely used in the Gitoxide project.

I am not a lawyer (which everybody but lawyers are nowadays required to
say), therefore this likely needs some tweaking.

> If someone wants to start a discussion about re-licensing git (and is
> prepared to do all of the associated admin in the event that it happens)
> then by all means do so but I don't think we want to slip such a
> change into this series.

The "wider discussion" you mention is exactly what we need; starting with
compatible licensing makes that discussion possible rather than
purely theoretical and moot.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: gitoxide-compatible licensing of Git's Rust code, was Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-09-23  9:57       ` gitoxide-compatible licensing of Git's Rust code, was " Johannes Schindelin
@ 2025-09-23 17:48         ` Jeff King
  2025-09-24 13:48           ` Phillip Wood
  0 siblings, 1 reply; 198+ messages in thread
From: Jeff King @ 2025-09-23 17:48 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Phillip Wood, Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, brian m. carlson

On Tue, Sep 23, 2025 at 11:57:18AM +0200, Johannes Schindelin wrote:

> It would not even take more than something as simple as GPLv2+exception.
> We do have prior art for that: The Git project itself suggests in its very
> own `COPYING` file to use the following license in new files:
> 
>         This file is licensed under the GPL v2, or a later version
>         at the discretion of Linus.
> 
> Note the exception? For new Rust code (and of course excluding code that
> has been ported verbatim from GPLv2-licensed code), GPL v2 could be used
> with an exception along these lines: This file is licensed under the GPL
> v2, with the exception that it can be freely used in the Gitoxide project.

I think this "and of course" parenthetical might be a sticking point.
Obviously taking the code verbatim and re-licensing it is not allowed.
But I think even reading the C code and then writing substantially
similar Rust code may be legally questionable. The Rust code under the
more permissive license has to either be clean-room, or have permission
for re-licensing from the original authors (which is getting to be all
but impossible over time as code ends up being touched by many people).

I think this is the same issue that libgit2 ran into. If it were just a
matter of porting over and re-writing new features, more of it would
have been done. But for code to come under the new license it can't just
be a port, but has to be an independent work.

So I wonder if this just creates the same awkward silo between Git's
Rust code and its C code (that we already have between Git and libgit2).

> I am not a lawyer (which everybody but lawyers are nowadays required to
> say), therefore this likely needs some tweaking.

Me either. I do like the goal you're trying to accomplish, but I worry
that it will end up causing headaches down the line. Even if we, the
developers, are a bit permissive about what constitutes "porting" and
don't require a clean-room implementation, this kind of thing scares off
the legal teams that approve using the Rust modules in bigger projects.
IIRC Microsoft put a big effort into vetting libgit2's provenance
before agreeing to use it in Visual Studio.

-Peff

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: gitoxide-compatible licensing of Git's Rust code, was Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-09-23 17:48         ` Jeff King
@ 2025-09-24 13:48           ` Phillip Wood
  2025-09-25  2:25             ` Jeff King
  0 siblings, 1 reply; 198+ messages in thread
From: Phillip Wood @ 2025-09-24 13:48 UTC (permalink / raw)
  To: Jeff King, Johannes Schindelin
  Cc: Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, brian m. carlson

On 23/09/2025 18:48, Jeff King wrote:
> On Tue, Sep 23, 2025 at 11:57:18AM +0200, Johannes Schindelin wrote:
> 
>> It would not even take more than something as simple as GPLv2+exception.
>> We do have prior art for that: The Git project itself suggests in its very
>> own `COPYING` file to use the following license in new files:
>>
>>          This file is licensed under the GPL v2, or a later version
>>          at the discretion of Linus.
>>
>> Note the exception? For new Rust code (and of course excluding code that
>> has been ported verbatim from GPLv2-licensed code), GPL v2 could be used
>> with an exception along these lines: This file is licensed under the GPL
>> v2, with the exception that it can be freely used in the Gitoxide project.
> 
> I think this "and of course" parenthetical might be a sticking point.
> Obviously taking the code verbatim and re-licensing it is not allowed.
> But I think even reading the C code and then writing substantially
> similar Rust code may be legally questionable. The Rust code under the
> more permissive license has to either be clean-room, or have permission
> for re-licensing from the original authors (which is getting to be all
> but impossible over time as code ends up being touched by many people).
> 
> I think this is the same issue that libgit2 ran into. If it were just a
> matter of porting over and re-writing new features, more of it would
> have been done. But for code to come under the new license it can't just
> be a port, but has to be an independent work.

Thanks for putting this so clearly, I agree with everything that you've 
written here. Another thing I'm concerned/confused about is how the 
exception for a single project works in practice. Does it mean that a 
third party that wants to re-use some code from GitOxide has to check if 
the code originally came from Git to determine which license it is 
under? Or does it mean that anyone who wants to use Git's code without 
the copyleft restrictions can do so if they launder it through GitOxide 
first? Neither of those seems like a great outcome.

Thanks

Phillip


^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: gitoxide-compatible licensing of Git's Rust code, was Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-09-24 13:48           ` Phillip Wood
@ 2025-09-25  2:25             ` Jeff King
  2025-09-25  5:42               ` Patrick Steinhardt
  0 siblings, 1 reply; 198+ messages in thread
From: Jeff King @ 2025-09-25  2:25 UTC (permalink / raw)
  To: phillip.wood
  Cc: Johannes Schindelin, Ezekiel Newren via GitGitGadget, git,
	Elijah Newren, Ezekiel Newren, brian m. carlson

On Wed, Sep 24, 2025 at 02:48:26PM +0100, Phillip Wood wrote:

> Thanks for putting this so clearly, I agree with everything that you've
> written here. Another thing I'm concerned/confused about is how the
> exception for a single project works in practice. Does it mean that a third
> party that wants to re-use some code from GitOxide has to check if the code
> originally came from Git to determine which license it is under? Or does it
> mean that anyone who wants to use Git's code without the copyleft
> restrictions can do so if they launder it through GitOxide first? Neither of
> those seems like a great outcome.

If I understand the suggestion correctly, it's not to license it
specifically to GitOxide. It's to use a permissive license (like GPL
with linking exception) that would make it compatible with other
projects with similar licenses (like GitOxide).

-Peff

^ permalink raw reply	[flat|nested] 198+ messages in thread

* Re: gitoxide-compatible licensing of Git's Rust code, was Re: [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash
  2025-09-25  2:25             ` Jeff King
@ 2025-09-25  5:42               ` Patrick Steinhardt
  0 siblings, 0 replies; 198+ messages in thread
From: Patrick Steinhardt @ 2025-09-25  5:42 UTC (permalink / raw)
  To: Jeff King
  Cc: phillip.wood, Johannes Schindelin,
	Ezekiel Newren via GitGitGadget, git, Elijah Newren,
	Ezekiel Newren, brian m. carlson

On Wed, Sep 24, 2025 at 10:25:55PM -0400, Jeff King wrote:
> On Wed, Sep 24, 2025 at 02:48:26PM +0100, Phillip Wood wrote:
> 
> > Thanks for putting this so clearly, I agree with everything that you've
> > written here. Another thing I'm concerned/confused about is how the
> > exception for a single project works in practice. Does it mean that a third
> > party that wants to re-use some code from GitOxide has to check if the code
> > originally came from Git to determine which license it is under? Or does it
> > mean that anyone who wants to use Git's code without the copyleft
> > restrictions can do so if they launder it through GitOxide first? Neither of
> > those seems like a great outcome.
> 
> If I understand the suggestion correctly, it's not to license it
> specifically to GitOxide. It's to use a permissive license (like GPL
> with linking exception) that would make it compatible with other
> projects with similar licenses (like GitOxide).

Yeah, we certainly shouldn't single out a specific project. But going
with something like LGPL or GPL with linking exception would be quite a
sensible choice from my point of view.

I cannot really say much about the concerns. Should we maybe ask the SFC
for some guidance here?

Patrick


end of thread, last message 2025-09-25  5:42 UTC

Thread overview: 198+ messages
2025-07-17 20:32 [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification Ezekiel Newren via GitGitGadget
2025-07-17 20:32 ` [PATCH 1/7] xdiff: introduce rust Ezekiel Newren via GitGitGadget
2025-07-17 21:30   ` brian m. carlson
2025-07-17 21:54     ` Junio C Hamano
2025-07-17 22:39     ` Taylor Blau
2025-07-18 23:15     ` Ezekiel Newren
2025-07-23 21:57       ` brian m. carlson
2025-07-23 22:26         ` Junio C Hamano
2025-07-28 19:11         ` Ezekiel Newren
2025-07-31 22:37           ` brian m. carlson
2025-07-22 22:02     ` Mike Hommey
2025-07-22 23:52       ` brian m. carlson
2025-07-17 22:38   ` Taylor Blau
2025-07-17 20:32 ` [PATCH 2/7] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
2025-07-17 22:41   ` Taylor Blau
2025-07-17 20:32 ` [PATCH 3/7] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-07-17 20:32 ` [PATCH 4/7] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
2025-07-17 22:46   ` Taylor Blau
2025-07-17 23:13     ` brian m. carlson
2025-07-17 23:37       ` Elijah Newren
2025-07-18  0:23         ` Taylor Blau
2025-07-18  0:21       ` Taylor Blau
2025-07-18 13:35   ` Phillip Wood
2025-07-28 19:34     ` Ezekiel Newren
2025-07-28 19:52       ` Phillip Wood
2025-07-28 20:14         ` Ezekiel Newren
2025-07-31 14:20           ` Phillip Wood
2025-07-31 20:58             ` Ezekiel Newren
2025-08-01  9:14               ` Phillip Wood
2025-07-28 20:53         ` Junio C Hamano
2025-07-28 20:00       ` Collin Funk
2025-07-20  1:39   ` Johannes Schindelin
2025-07-17 20:32 ` [PATCH 5/7] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
2025-07-17 22:59   ` Taylor Blau
2025-07-18 13:34   ` Phillip Wood
2025-07-17 20:32 ` [PATCH 6/7] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
2025-07-17 23:29   ` Taylor Blau
2025-07-18 19:00   ` Junio C Hamano
2025-07-31 21:13     ` Ezekiel Newren
2025-08-02  7:53       ` Matthias Aßhauer
2025-07-19 21:53   ` Johannes Schindelin
2025-07-20 10:14     ` Phillip Wood
2025-09-23  9:57       ` gitoxide-compatible licensing of Git's Rust code, was " Johannes Schindelin
2025-09-23 17:48         ` Jeff King
2025-09-24 13:48           ` Phillip Wood
2025-09-25  2:25             ` Jeff King
2025-09-25  5:42               ` Patrick Steinhardt
2025-07-17 20:32 ` [PATCH 7/7] github_workflows: install rust Ezekiel Newren via GitGitGadget
2025-07-17 21:23   ` brian m. carlson
2025-07-18 23:01     ` Ezekiel Newren
2025-07-25 23:56       ` Ben Knoble
2025-07-19 21:54   ` Johannes Schindelin
2025-07-17 21:51 ` [PATCH 0/7] RFC: Accelerate xdiff and begin its rustification brian m. carlson
2025-07-17 22:25   ` Taylor Blau
2025-07-18  0:29     ` brian m. carlson
2025-07-22 12:21       ` Patrick Steinhardt
2025-07-22 15:56         ` Junio C Hamano
2025-07-22 16:03     ` Sam James
2025-07-22 21:37       ` Elijah Newren
2025-07-22 21:55         ` Sam James
2025-07-22 22:08           ` Collin Funk
2025-07-18  9:23 ` Christian Brabandt
2025-07-18 16:26   ` Junio C Hamano
2025-07-19  0:32     ` Elijah Newren
2025-07-18 13:34 ` Phillip Wood
2025-07-18 21:25   ` Eli Schwartz
2025-07-19  0:48     ` Haelwenn (lanodan) Monnier
2025-07-22 12:21       ` Patrick Steinhardt
2025-07-22 14:24     ` Patrick Steinhardt
2025-07-22 15:14       ` Eli Schwartz
2025-07-22 15:56       ` Sam James
2025-07-23  4:32         ` Patrick Steinhardt
2025-07-24  9:01           ` Pierre-Emmanuel Patry
2025-07-24 10:00             ` Patrick Steinhardt
2025-07-28  9:06               ` Pierre-Emmanuel Patry
2025-07-18 14:38 ` Junio C Hamano
2025-07-18 21:56   ` Ezekiel Newren
2025-07-21 10:14   ` Phillip Wood
2025-07-21 18:33     ` Junio C Hamano
2025-07-19 21:53 ` Johannes Schindelin
2025-07-20  8:45   ` Matthias Aßhauer
2025-08-15  1:22 ` [PATCH v2 00/17] " Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 01/17] doc: add a policy for using Rust brian m. carlson via GitGitGadget
2025-08-15 17:03     ` Matthias Aßhauer
2025-08-15 21:31       ` Junio C Hamano
2025-08-16  8:06         ` Matthias Aßhauer
2025-08-19  2:06       ` Ezekiel Newren
2025-08-15  1:22   ` [PATCH v2 02/17] xdiff: introduce rust Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 03/17] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 04/17] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 05/17] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 06/17] xdiff: separate parsing lines from hashing them Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 07/17] xdiff: conditionally use Rust's implementation of xxhash Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 08/17] github workflows: install rust Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 09/17] Do support Windows again after requiring Rust Johannes Schindelin via GitGitGadget
2025-08-15 17:12     ` Matthias Aßhauer
2025-08-15 21:48       ` Junio C Hamano
2025-08-15 22:11         ` Johannes Schindelin
2025-08-15 23:37           ` Junio C Hamano
2025-08-15 23:37         ` Junio C Hamano
2025-08-16  8:53         ` Matthias Aßhauer
2025-08-17 15:57           ` Junio C Hamano
2025-08-19  2:22       ` Ezekiel Newren
2025-08-15  1:22   ` [PATCH v2 10/17] win+Meson: allow for xdiff to be compiled with MSVC Johannes Schindelin via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 11/17] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 12/17] github workflows: define rust versions and targets in the same place Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 13/17] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 14/17] xdiff: implement a white space iterator in Rust Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 15/17] xdiff: create line_hash() and line_equal() Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 16/17] xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag Ezekiel Newren via GitGitGadget
2025-08-15  1:22   ` [PATCH v2 17/17] xdiff: use rust's version of whitespace processing Ezekiel Newren via GitGitGadget
2025-08-15 15:07   ` [-SPAM-] [PATCH v2 00/17] RFC: Accelerate xdiff and begin its rustification Ramsay Jones
2025-08-19  2:00     ` Elijah Newren
2025-08-24 16:52       ` Patrick Steinhardt
2025-08-18 22:31   ` Junio C Hamano
2025-08-18 23:52     ` Ben Knoble
2025-08-19  1:52     ` Elijah Newren
2025-08-19  9:47       ` Junio C Hamano
2025-08-23  3:55   ` [PATCH v3 00/15] RFC: Cleanup " Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 01/15] doc: add a policy for using Rust brian m. carlson via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 02/15] xdiff: introduce rust Ezekiel Newren via GitGitGadget
2025-08-23 13:43       ` rsbecker
2025-08-23 14:26         ` Kristoffer Haugsbakk
2025-08-23 15:06           ` rsbecker
2025-08-23 18:30             ` Elijah Newren
2025-08-23 19:24               ` brian m. carlson
2025-08-23 20:04                 ` rsbecker
2025-08-23 20:36                 ` Sam James
2025-08-23 21:17                 ` Haelwenn (lanodan) Monnier
2025-08-27  1:57               ` Taylor Blau
2025-08-27 14:39                 ` rsbecker
2025-08-27 17:06                   ` Junio C Hamano
2025-08-27 17:15                     ` rsbecker
2025-08-27 20:12                     ` Taylor Blau
2025-08-27 20:22                       ` Junio C Hamano
2025-09-02 11:16                         ` Patrick Steinhardt
2025-09-02 11:30                           ` Sam James
2025-09-02 17:27                           ` brian m. carlson
2025-09-02 18:47                             ` Sam James
2025-09-03 18:22                               ` Collin Funk
2025-09-03  5:40                             ` Patrick Steinhardt
2025-09-03 16:22                               ` Ramsay Jones
2025-09-03 22:10                               ` Junio C Hamano
2025-09-03 22:48                                 ` Josh Steadmon
2025-09-04 11:10                                 ` Patrick Steinhardt
2025-09-04 15:45                                   ` Junio C Hamano
2025-09-05  8:23                                     ` Patrick Steinhardt
2025-09-04  0:57                               ` brian m. carlson
2025-09-04 11:39                                 ` Patrick Steinhardt
2025-09-04 13:53                                   ` Sam James
2025-09-05  3:55                                     ` Elijah Newren
2025-09-04 23:17                                   ` Ezekiel Newren
2025-09-05  3:54                                   ` Elijah Newren
2025-09-05  6:50                                     ` Patrick Steinhardt
2025-09-07  4:10                                       ` Elijah Newren
2025-09-07 16:09                                         ` rsbecker
2025-09-08 10:12                                           ` Phillip Wood
2025-09-08 15:32                                             ` rsbecker
2025-09-08 15:10                                           ` Ezekiel Newren
2025-09-08 15:41                                             ` rsbecker
2025-09-08 15:31                                           ` Elijah Newren
2025-09-08 15:36                                             ` rsbecker
2025-09-08 16:13                                               ` Elijah Newren
2025-09-08 17:01                                                 ` rsbecker
2025-09-08  6:40                                         ` Patrick Steinhardt
2025-09-05 10:31                                     ` Phillip Wood
2025-09-05 11:32                                       ` Sam James
2025-09-05 13:14                                       ` Phillip Wood
2025-09-05 13:23                                         ` Patrick Steinhardt
2025-09-05 15:37                                         ` Junio C Hamano
2025-09-08  6:40                                           ` Patrick Steinhardt
2025-08-23 14:29         ` Ezekiel Newren
2025-08-23  3:55     ` [PATCH v3 03/15] github workflows: install rust Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 04/15] win+Meson: do allow linking with the Rust-built xdiff Johannes Schindelin via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 05/15] github workflows: upload Cargo.lock Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 06/15] ivec: create a vector type that is interoperable between C and Rust Ezekiel Newren via GitGitGadget
2025-08-23  8:12       ` Kristoffer Haugsbakk
2025-08-23  9:29         ` Ezekiel Newren
2025-08-23 16:14       ` Junio C Hamano
2025-08-23 16:37         ` Ezekiel Newren
2025-08-23 18:05       ` Junio C Hamano
2025-08-23 20:29         ` Ezekiel Newren
2025-08-25 19:16         ` Elijah Newren
2025-08-26  5:40           ` Junio C Hamano
2025-08-24 13:31       ` Ben Knoble
2025-08-25 20:40         ` Ezekiel Newren
2025-08-26 13:30           ` D. Ben Knoble
2025-08-26 18:47             ` Ezekiel Newren
2025-08-26 22:01               ` brian m. carlson
2025-08-23  3:55     ` [PATCH v3 07/15] xdiff/xprepare: remove superfluous forward declarations Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 08/15] xdiff: delete unnecessary fields from xrecord_t and xdfile_t Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 09/15] xdiff: make fields of xrecord_t Rust friendly Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 10/15] xdiff: use one definition for freeing xdfile_t Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 11/15] xdiff: replace chastore with an ivec in xdfile_t Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 12/15] xdiff: delete nrec field from xdfile_t Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 13/15] xdiff: delete recs " Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 14/15] xdiff: make xdfile_t more rust friendly Ezekiel Newren via GitGitGadget
2025-08-23  3:55     ` [PATCH v3 15/15] xdiff: implement xdl_trim_ends() in Rust Ezekiel Newren via GitGitGadget
