From: Phillip Wood <phillip.wood123@gmail.com>
To: Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Ezekiel Newren <ezekielnewren@gmail.com>
Subject: Re: [PATCH 01/10] ivec: introduce the C side of ivec
Date: Thu, 8 Jan 2026 14:34:48 +0000
Message-ID: <0437b899-5a36-4499-a30a-c2a074a80f7e@gmail.com>
In-Reply-To: <adf1395d201e916f23accc7644d21aff4f58368b.1767379944.git.gitgitgadget@gmail.com>

Hi Ezekiel

On 02/01/2026 18:52, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
> 
> Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
> wrapper functions) in Rust is painful because:
> 
>    * C doesn't define its own vector type, and even though Rust does
>      have Vec, it's painful to use on the C side (more on that below).
>      However, it's still not viable to use Rust's Vec type because Git
>      needs to be able to compile without Rust. So ivec was created
>      expressly to be interoperable between C and Rust without needing
>      Rust.
>    * C doing vector things the Rust way would require wrapper functions,
>      and Rust doing vector things the C way would require wrapper
>      functions, so ivec was created to ensure a consistent contract
>      between the 2 languages for how to manipulate a vector.
>    * Currently, Rust defines its own 'Vec' type that is generic, but its
>      memory allocator and struct layout weren't designed for
>      interoperability with C (or any language for that matter), meaning
>      that the C side cannot push to or expand a 'Vec' without defining
>      wrapper functions in Rust that C can call. Without special care,
>      the two languages might use different allocators (malloc/free on
>      the C side, and possibly something else in Rust), which would make
>      it difficult for a function in one language to free elements
>      allocated by a call from a function in the other language.
>    * Similarly, git defines ALLOC_GROW() and related macros in
>      git-compat-util.h. While we could add functions allowing Rust to
>      invoke something similar to those macros, passing three variables
>      (pointer, length, allocated_size) instead of a single variable
>      (vector) across the language boundary requires more cognitive
>      overhead for readers to keep track of and makes it easier to make
>      mistakes. Further, for low-level components that we want to
>      eventually convert to pure Rust, such triplets would feel very out
>      of place.
> 
> To address these issues, introduce a new type, ivec -- short for
> interoperable vector. (We refer to it as 'ivec' generally, though on
> the Rust side the struct is called IVec to match Rust style.)  This new
> type is specifically designed for FFI purposes, so that both languages
> handle the vector in the same way, though it could be used on either
> side independently. This type is designed such that it can easily be
> replaced by a Rust 'Vec' once interoperability is no longer a concern.
> 
> One particular item to note is that Git's macros to handle vec
> operations infer the amount that a vec needs to grow from the size of
> a pointer, but that makes it somewhat specific to the macros used in C.
> To avoid defining every ivec function as a macro I opted to also
> include an element_size field that allows concrete functions like
> push() to know how much to grow the memory. This element_size also
> helps in verifying that the ivec is correct when passing from C to
> Rust.

I've left some comments below but I think this is a sensible direction.

> diff --git a/compat/ivec.c b/compat/ivec.c
> new file mode 100644
> index 0000000000..0a777e78dc
> --- /dev/null
> +++ b/compat/ivec.c
> @@ -0,0 +1,113 @@
> +#include "ivec.h"
> +
> +struct IVec_c_void {

We normally use all lower case names for structs, but as this is shared 
with Rust it may make sense to use CamelCase so the names are the 
same in both languages.

> +	void *ptr;
> +	size_t length;
> +	size_t capacity;
> +	size_t element_size;
> +};
> +
> +static void _set_capacity(void *self_, size_t new_capacity)
> +{
> +	struct IVec_c_void *self = self_;

Passing any of the ivec variants defined below to this function invokes 
undefined behavior because we're not casting the pointer back to the 
original type. However, I think on the platforms we care about 
sizeof(void*) == sizeof(T*) for all T, so maybe we can look the other way.

> +
> +	if (new_capacity == self->capacity) {
> +		return;
> +	}
> +	if (new_capacity == 0) {
> +		free(self->ptr);
> +		self->ptr = NULL;
> +	} else {
> +		self->ptr = realloc(self->ptr, new_capacity * self->element_size);
> +	}
> +	self->capacity = new_capacity;

That is not true if realloc() returns NULL, in which case we also lose 
the pointer to the original allocation. We should check for that, 
probably by using xrealloc().
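
As a rough sketch of the check being asked for (not the patch's code): 
checked_realloc() below is a hypothetical stand-in for xrealloc() so the 
example builds outside the git tree, and struct ivec_sketch simply 
mirrors the fields of IVec_c_void. Both names are assumptions made for 
illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for xrealloc(): exits instead of returning NULL. */
static void *checked_realloc(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size);

	if (!ret && size) {
		fprintf(stderr, "fatal: out of memory\n");
		exit(128);
	}
	return ret;
}

/* Same fields as struct IVec_c_void in the quoted patch. */
struct ivec_sketch {
	void *ptr;
	size_t length;
	size_t capacity;
	size_t element_size;
};

static void set_capacity_checked(struct ivec_sketch *self, size_t new_capacity)
{
	if (new_capacity == self->capacity)
		return;
	if (!new_capacity) {
		free(self->ptr);
		self->ptr = NULL;
	} else {
		/* ptr is only overwritten once the allocation has succeeded */
		self->ptr = checked_realloc(self->ptr,
					    new_capacity * self->element_size);
	}
	/* reached only when ptr really holds new_capacity elements */
	self->capacity = new_capacity;
}
```

With this shape, capacity can never get ahead of the memory that was 
actually allocated.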

> +void ivec_zero(void *self_, size_t capacity)
> +{
> +	struct IVec_c_void *self = self_;
> +
> +	self->ptr = calloc(capacity, self->element_size);

We should be handling allocation failures here, probably by using xcalloc().

> +void ivec_reserve(void *self_, size_t additional)
> +{
> +	struct IVec_c_void *self = self_;
> +
> +	size_t growby = 128;
> +	if (self->capacity > growby)
> +		growby = self->capacity;
> +	if (additional > growby)
> +		growby = additional;

This growth strategy differs from both ALLOC_GROW() and 
XDL_ALLOC_GROW(). If there isn't a good reason for that, we should 
perhaps just use ALLOC_GROW() here.
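
To make the difference concrete, here is a small sketch that computes 
the next capacity under each policy. alloc_nr_sketch() copies the 
formula of git's alloc_nr() macro. next_capacity_ivec() ASSUMES that the 
part of ivec_reserve() not quoted above adds growby to the current 
capacity, so treat that half as a guess. All three function names are 
made up for this example.

```c
#include <stddef.h>

/* Same formula as git's alloc_nr() macro: (x + 16) * 3 / 2 */
static size_t alloc_nr_sketch(size_t x)
{
	return (x + 16) * 3 / 2;
}

/* Capacity ALLOC_GROW() ends up with when `nr` slots are needed. */
static size_t next_alloc_git(size_t alloc, size_t nr)
{
	if (nr <= alloc)
		return alloc; /* already big enough, nothing to do */
	if (alloc_nr_sketch(alloc) < nr)
		return nr;
	return alloc_nr_sketch(alloc);
}

/*
 * Capacity after the quoted ivec_reserve() logic.
 * ASSUMPTION: the unquoted remainder grows capacity by `growby`.
 */
static size_t next_capacity_ivec(size_t capacity, size_t additional)
{
	size_t growby = 128;

	if (capacity > growby)
		growby = capacity;
	if (additional > growby)
		growby = additional;
	return capacity + growby;
}
```

Under those assumptions an empty array that keeps growing one element 
at a time goes 24, 60, ... with ALLOC_GROW() but 128, 256, ... with the 
ivec policy, so small vectors start out more than five times larger.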

> +void ivec_push(void *self_, const void *value)
> +{
> +	struct IVec_c_void *self = self_;
> +	void *dst = NULL;
> +
> +	if (self->length == self->capacity)
> +		ivec_reserve(self, 1);
> +
> +	dst = (uint8_t*)self->ptr + self->length * self->element_size;
> +	memcpy(dst, value, self->element_size);

If self->element_size were a compile-time constant, the compiler could 
easily optimize this call away. I'm not sure that is easy to achieve, though.

> +	self->length++;
> +}
> +
> +void ivec_free(void *self_)

Normally we'd call a function like this, which frees the allocations and 
re-initializes the members, ivec_clear().

> +{
> +	struct IVec_c_void *self = self_;
> +
> +	free(self->ptr);
> +	self->ptr = NULL;
> +	self->length = 0;
> +	self->capacity = 0;
> +	// DO NOT MODIFY element_size!!!
> +}
> +
> +void ivec_move(void *src_, void *dst_)
> +{
> +	struct IVec_c_void *src = src_;
> +	struct IVec_c_void *dst = dst_;

Maybe we should add

	if (src->element_size != dst->element_size)
		BUG("moving incompatible arrays");

> +
> +	ivec_free(dst);
> +	dst->ptr = src->ptr;
> +	dst->length = src->length;
> +	dst->capacity = src->capacity;
> +	// DO NOT MODIFY element_size!!!

As the element sizes must match maybe *dst = *src would be clearer?
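
Putting the two suggestions for ivec_move() together might look roughly 
like the sketch below. It is not the patch's code: struct ivec_sketch 
and ivec_sketch_move() are hypothetical names, and the fprintf()/abort() 
pair stands in for git's BUG() so the example builds on its own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Same fields as struct IVec_c_void in the quoted patch. */
struct ivec_sketch {
	void *ptr;
	size_t length;
	size_t capacity;
	size_t element_size;
};

static void ivec_sketch_move(struct ivec_sketch *src, struct ivec_sketch *dst)
{
	if (src->element_size != dst->element_size) {
		/* stand-in for BUG("moving incompatible arrays") */
		fprintf(stderr, "BUG: moving incompatible arrays\n");
		abort();
	}

	/* release whatever dst owned before taking over src's buffer */
	free(dst->ptr);

	/* element_size is known to match, so a plain struct copy is safe */
	*dst = *src;

	/* leave src empty but still usable; element_size is kept */
	src->ptr = NULL;
	src->length = 0;
	src->capacity = 0;
}
```

The struct assignment copies element_size as well, which is harmless 
because the check above has already established that both sides agree.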

> +
> +	src->ptr = NULL;
> +	src->length = 0;
> +	src->capacity = 0;
> +	// DO NOT MODIFY element_size!!!
> +}
> diff --git a/compat/ivec.h b/compat/ivec.h
> new file mode 100644
> index 0000000000..654a05c506
> --- /dev/null
> +++ b/compat/ivec.h
> @@ -0,0 +1,52 @@
> +#ifndef IVEC_H
> +#define IVEC_H
> +
> +#include <git-compat-util.h>

It would be nice to have some documentation in this header. See the 
examples in strvec.h and hashmap.h.

> +#define IVEC_INIT(variable) ivec_init(&(variable), sizeof(*(variable).ptr))

This is a bit cumbersome to use compared to our usual *_INIT macros. I'm 
struggling to see how we can make it nicer though, as DEFINE_IVEC_TYPE 
cannot define a per-type initializer macro and we cannot initialize 
the element size without knowing the type.

> +
> +#ifndef CBINDGEN
> +#define DEFINE_IVEC_TYPE(type, suffix) \
> +struct IVec_##suffix { \
> +	type* ptr; \
> +	size_t length; \
> +	size_t capacity; \
> +	size_t element_size; \
> +}

I wonder if we want to define type-safe inline wrappers for the 
ivec_* functions here. I think the only functions where the element type 
matters are ivec_move() and ivec_push(); for the others, like 
ivec_zero(), ivec_reserve() and ivec_free(), the element type does not 
matter. ivec_push() would certainly be easier to use with a wrapper, as 
it means we can avoid forcing the caller to take the address of the value.

static inline void ivec_##suffix##_push(struct IVec_##suffix *self, type value) \
{ \
	const void *ptr = &value; \
	ivec_push(self, ptr); \
}
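
A fuller, self-contained sketch of that idea follows. Everything with 
"sketch" in its name is an assumption made for this example: 
sketch_push() is a simplified stand-in for ivec_push() with a made-up 
doubling policy, and DEFINE_IVEC_TYPE_SKETCH is a variant of the patch's 
macro that also emits the wrapper. The stand-in copies the vector header 
in and out with memcpy() so that the sketch itself does not depend on 
the pointer cast discussed for _set_capacity(); it still relies on the 
typed struct and the void one having the same layout.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct ivec_void_sketch {
	void *ptr;
	size_t length;
	size_t capacity;
	size_t element_size;
};

/* Simplified stand-in for ivec_push(); the growth policy is made up. */
static void sketch_push(void *self_, const void *value)
{
	struct ivec_void_sketch self;

	/* ASSUMES every IVec_<suffix> has the same layout as the void one */
	memcpy(&self, self_, sizeof(self));
	if (self.length == self.capacity) {
		size_t cap = self.capacity ? self.capacity * 2 : 8;
		void *p = realloc(self.ptr, cap * self.element_size);

		if (!p)
			abort();
		self.ptr = p;
		self.capacity = cap;
	}
	memcpy((unsigned char *)self.ptr + self.length * self.element_size,
	       value, self.element_size);
	self.length++;
	memcpy(self_, &self, sizeof(self));
}

#define DEFINE_IVEC_TYPE_SKETCH(type, suffix) \
struct IVec_##suffix { \
	type *ptr; \
	size_t length; \
	size_t capacity; \
	size_t element_size; \
}; \
static inline void ivec_##suffix##_push(struct IVec_##suffix *self, type value) \
{ \
	/* the caller passes a value; the wrapper takes its address */ \
	sketch_push(self, &value); \
} \
struct IVec_##suffix /* lets the macro call end with a semicolon */

DEFINE_IVEC_TYPE_SKETCH(uint32_t, u32);
```

A caller can then write ivec_u32_push(&v, 42) and the compiler checks 
both the vector type and the value type, whereas the void-pointer 
interface accepts any pointer for either argument.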

I'll try to take a look at the rest of this series next week.

Thanks

Phillip

> +
> +DEFINE_IVEC_TYPE(bool, bool);
> +
> +DEFINE_IVEC_TYPE(uint8_t, u8);
> +DEFINE_IVEC_TYPE(uint16_t, u16);
> +DEFINE_IVEC_TYPE(uint32_t, u32);
> +DEFINE_IVEC_TYPE(uint64_t, u64);
> +
> +DEFINE_IVEC_TYPE(int8_t, i8);
> +DEFINE_IVEC_TYPE(int16_t, i16);
> +DEFINE_IVEC_TYPE(int32_t, i32);
> +DEFINE_IVEC_TYPE(int64_t, i64);
> +
> +DEFINE_IVEC_TYPE(float, f32);
> +DEFINE_IVEC_TYPE(double, f64);
> +
> +DEFINE_IVEC_TYPE(size_t, usize);
> +DEFINE_IVEC_TYPE(ssize_t, isize);
> +#endif
> +
> +void ivec_init(void *self_, size_t element_size);
> +
> +void ivec_zero(void *self_, size_t capacity);
> +
> +void ivec_reserve_exact(void *self_, size_t additional);
> +
> +void ivec_reserve(void *self_, size_t additional);
> +
> +void ivec_shrink_to_fit(void *self_);
> +
> +void ivec_push(void *self_, const void *value);
> +
> +void ivec_free(void *self_);
> +
> +void ivec_move(void *src, void *dst);
> +
> +#endif /* IVEC_H */
> diff --git a/meson.build b/meson.build
> index dd52efd1c8..42ac0c8c42 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -302,6 +302,7 @@ libgit_sources = [
>     'commit.c',
>     'common-exit.c',
>     'common-init.c',
> +  'compat/ivec.c',
>     'compat/nonblock.c',
>     'compat/obstack.c',
>     'compat/open.c',

