From: Phillip Wood <phillip.wood123@gmail.com>
To: Ezekiel Newren <ezekielnewren@gmail.com>
Cc: phillip.wood@dunelm.org.uk,
Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH 01/10] ivec: introduce the C side of ivec
Date: Wed, 28 Jan 2026 11:15:22 +0000 [thread overview]
Message-ID: <7d6dd22d-286e-4f7b-a211-44e00e711401@gmail.com> (raw)
In-Reply-To: <CAH=ZcbAiGONrOyma7YjNKKLqNFoisU5LG=nGWjtOJ1wLfqX4cQ@mail.gmail.com>
On 21/01/2026 21:39, Ezekiel Newren wrote:
> On Tue, Jan 20, 2026 at 7:06 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> Hi Ezekiel
>>
>> On 15/01/2026 15:55, Ezekiel Newren wrote:
>>> On Thu, Jan 8, 2026 at 7:34 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>>>> +void ivec_reserve(void *self_, size_t additional)
>>>>> +{
>>>>> + struct IVec_c_void *self = self_;
>>>>> +
>>>>> + size_t growby = 128;
>>>>> + if (self->capacity > growby)
>>>>> + growby = self->capacity;
>>>>> + if (additional > growby)
>>>>> + growby = additional;
>>>>
>>>> This growth strategy differs from both ALLOC_GROW() and
>>>> XDL_ALLOC_GROW(), if there isn't a good reason for that we should
>>>> perhaps just use ALLOC_GROW() here.
>>>
>>> XDL_ALLOW_GROW() can't be used because the pointer is always a void*
>>> in this function.
>>
>> Oh right. I'm not sure that's not a reason to use a different growth
>> strategy though. The minimum size of 128 elements is probably good for
>> the xdiff code that creates arrays with one element per line but if this
>> is supposed to be for general use it is going to waste space when we're
>> allocating a lot of small arrays. ALLOC_GROW() uses alloc_nr() to
>> calculate the new side so perhaps we could use that here?
>
> If ivec_reserve() isn't suitable then ivec_reserve_exact() should be
> used instead.
If some C code that pushes one element at a time to an array using
ALLOC_GROW() is converted to use an ivec then we don't want to change
the code behaves - that means it should grow the array in the same way.
I don't see how the suggestion to use ivec_reserve_exact() helps in that
situation. What is the advantage in having a different growth
characteristic?
>>>>> +void ivec_push(void *self_, const void *value)
>>>>> +{
>>>>> + struct IVec_c_void *self = self_;
>>>>> + void *dst = NULL;
>>>>> +
>>>>> + if (self->length == self->capacity)
>>>>> + ivec_reserve(self, 1);
>>>>> +
>>>>> + dst = (uint8_t*)self->ptr + self->length * self->element_size;
>>>>> + memcpy(dst, value, self->element_size);
>>>>
>>>> If self->element_size was a compile time constant the compiler could
>>>> easily optimize this call away. I'm not sure that is easy to achieve though.
>>>
>>> The problem is that I didn't want all of ivec to be macros that looked
>>> like function calls. I wanted to minimize use of macros so that it was
>>> easier to port and verify that the Rust implementation matches the
>>> behavior of the C implementation.
>>
>> I think that's a reasonable concern. So is the plan to have a parallel
>> rust implementation of these functions rather than call the C
>> implementation from rust?
>
> Yes, the Rust implementation will be independent of the C
> implementation, but will behave the same way. That's why I'm calling
> it an interoperable vec as opposed to a compatible vec. Rust can't
> call the C ivec functions and C can't call the Rust ivec functions,
> but they'll behave the same way.
Interesting - I'm curious what the advantage of that is over having rust
call the C implementation? I can see you wouldn't want to be calling
into C for each ivec.push() call, but checking if there is room to push
the new element in rust and calling into C to extend the vector if not
should be reasonable and then you don't have to re-implement everything
in rust.
>>>>> +void ivec_free(void *self_)
>>>>
>>>> Normally we'd call a like this that free the allocations and
>>>> re-initializes the members ivec_clear()
>>>
>>> In Rust Vec.clear() means to set length to zero, but leaves the
>>> allocation alone. The reason why I'm zeroing the struct is to help
>>> avoid FFI issues. If not zero then what should the members be set to,
>>> to indicate that using the struct is not valid anymore? In Rust an
>>> object is freed when it goes out of scope and _cannot_ be accessed
>>> afterward.
>
> Maybe I should call this ivec_drop(). Though the notion of explicitly
> freeing an object in Rust is _almost_ nonsense. The way you free
> something in Rust is to let it go out of scope.
Indeed - which means this wont be a public function in rust and so why
do we worry about naming it ivec_clear()? At least ivec_drop() does not
conflict with any of the standard function suffixes that we're already
using in git.
>> I'm aware that Vec::clear() has different semantics (it does what
>> strbuf_reset() does). That's unfortunate but this function has different
>> semantics to all the other *_free() functions in git. Our coding
>> guidelines say
>>
>> - There are several common idiomatic names for functions performing
>> specific tasks on a structure `S`:
>>
>> - `S_init()` initializes a structure without allocating the
>> structure itself.
>>
>> - `S_release()` releases a structure's contents without freeing the
>> structure.
>>
>> - `S_clear()` is equivalent to `S_release()` followed by `S_init()`
>> such that the structure is directly usable after clearing it. When
>> `S_clear()` is provided, `S_init()` shall not allocate resources
>> that need to be released again.
>>
>> - `S_free()` releases a structure's contents and frees the
>> structure.
>>
>> As we write more rust code and so wrap more of our existing structs
>> we're going to be wrapping C code that uses the definitions above so I
>> think we should do the same with struct IVec_*.
>
> I disagree. IVec isn't a wrapper around an existing struct.
So just because it is a new stuct it shouldn't have to follow the
existing naming conventions?
> ivec is
> meant to very closely mimic Rust's Vec while guaranteeing
> interoperability. For things like strbuf I haven't conceived of a
> solution for that yet. Making ivec diverge from Rust's Vec will result
> in POLA violations due to different behavior when refactoring an
> IVec<your_type_here> to Vec<your_type_here>.
On the other hand, vec.reset() does not exist so you'd get a compiler
error if you forgot to rename those calls when changing from IVec to Vec
and the rust code wouldn't be calling ivec.clear(). I'm not sure citing
POLA concerns is very convincing as ivec_free() in C is a POLA violation
for anyone familiar with git's code base so it's not like there's a
choice that avoids that concern.
Thanks
Phillip
>>>>> diff --git a/compat/ivec.h b/compat/ivec.h
>>>>> new file mode 100644
>>>>> index 0000000000..654a05c506
>>>>> --- /dev/null
>>>>> +++ b/compat/ivec.h
>>>>> @@ -0,0 +1,52 @@
>>>>> +#ifndef IVEC_H
>>>>> +#define IVEC_H
>>>>> +
>>>>> +#include <git-compat-util.h>
>>>>
>>>> It would be nice to have some documentation in this header, see the
>>>> examples in strvec.h and hashmap.h
>>>>
>>>>> +#define IVEC_INIT(variable) ivec_init(&(variable), sizeof(*(variable).ptr))
>>>>
>>>> This is a bit cumbersome to use compared to our usual *_INIT macros. I'm
>>>> struggling to see how we can make it nicer though as DEFINE_IVEC_TYPE
>>>> cannot define a per-type initializer macro and I we cannot initialize
>>>> the element size without knowing the type.
>>>
>>> I don't see what's cumbersome about it. Maybe an example use case
>>> would clarify things.
>>
>> It is cumbersome because it separates the initialization from the
>> declaration. Normally our *_INIT macros are initializer lists so we can
>> write
>>
>> struct strbuf = STRBUF_INIT;
>>
>> which keeps the declaration and initialization together. Although
>> they're on adjacent lines in your example in real code the
>> initialization likely to be separated from the declaration by other
>> variable declarations.
>
> Ah I see what you mean now. I'll experiment with making IVEC_INIT()
> work like that. One wrinkle is that STRBUF_INIT is a single concrete
> type whereas IVEC_INIT() is meant for generic types.
If you can get it to work that would be great, but I can't think of a
way of getting it to work for a generic type.
Thanks
Phillip
next prev parent reply other threads:[~2026-01-28 11:15 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-02 18:52 [PATCH 00/10] Xdiff cleanup part 3 Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 01/10] ivec: introduce the C side of ivec Ezekiel Newren via GitGitGadget
2026-01-04 5:32 ` Junio C Hamano
2026-01-17 16:06 ` Ezekiel Newren
2026-01-08 14:34 ` Phillip Wood
2026-01-15 15:55 ` Ezekiel Newren
2026-01-16 10:39 ` Phillip Wood
2026-01-16 20:19 ` René Scharfe
2026-01-17 13:55 ` Phillip Wood
2026-01-17 16:04 ` Ezekiel Newren
2026-01-18 14:58 ` René Scharfe
2026-01-17 16:14 ` Ezekiel Newren
2026-01-17 16:16 ` Ezekiel Newren
2026-01-17 17:40 ` Phillip Wood
2026-01-19 5:59 ` Jeff King
2026-01-19 20:21 ` Ezekiel Newren
2026-01-19 20:40 ` Jeff King
2026-01-20 2:36 ` D. Ben Knoble
2026-01-21 21:00 ` Ezekiel Newren
2026-01-21 21:20 ` Jeff King
2026-01-21 21:31 ` Junio C Hamano
2026-01-21 21:45 ` Ezekiel Newren
2026-01-20 13:46 ` Phillip Wood
2026-01-20 14:06 ` Phillip Wood
2026-01-21 21:39 ` Ezekiel Newren
2026-01-28 11:15 ` Phillip Wood [this message]
2026-01-16 20:19 ` René Scharfe
2026-01-17 15:58 ` Ezekiel Newren
2026-01-18 14:55 ` René Scharfe
2026-01-02 18:52 ` [PATCH 02/10] xdiff: make classic diff explicit by creating xdl_do_classic_diff() Ezekiel Newren via GitGitGadget
2026-01-20 15:01 ` Phillip Wood
2026-01-21 21:05 ` Ezekiel Newren
2026-01-02 18:52 ` [PATCH 03/10] xdiff: don't waste time guessing the number of lines Ezekiel Newren via GitGitGadget
2026-01-20 15:02 ` Phillip Wood
2026-01-21 21:12 ` Ezekiel Newren
2026-01-22 10:16 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 04/10] xdiff: let patience and histogram benefit from xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 15:02 ` Phillip Wood
2026-01-21 14:49 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 05/10] xdiff: use xdfenv_t in xdl_trim_ends() and xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-20 16:32 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 06/10] xdiff: cleanup xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 16:32 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 07/10] xdiff: replace xdfile_t.dstart with xdfenv_t.delta_start Ezekiel Newren via GitGitGadget
2026-01-20 16:32 ` Phillip Wood
2026-01-28 10:51 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 08/10] xdiff: replace xdfile_t.dend with xdfenv_t.delta_end Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 09/10] xdiff: remove dependence on xdlclassifier from xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-16 20:19 ` René Scharfe
2026-01-17 16:34 ` Ezekiel Newren
2026-01-18 18:23 ` René Scharfe
2026-01-21 15:01 ` Phillip Wood
2026-01-02 18:52 ` [PATCH 10/10] xdiff: move xdl_cleanup_records() from xprepare.c to xdiffi.c Ezekiel Newren via GitGitGadget
2026-01-21 15:01 ` Phillip Wood
2026-01-28 10:56 ` Phillip Wood
2026-01-04 2:44 ` [PATCH 00/10] Xdiff cleanup part 3 Junio C Hamano
2026-01-04 6:01 ` Yee Cheng Chin
2026-01-28 14:40 ` Phillip Wood
2026-03-06 23:03 ` Junio C Hamano
2026-03-09 19:06 ` Ezekiel Newren
2026-03-09 23:31 ` Junio C Hamano
2026-03-25 21:11 ` [PATCH v2 0/5] " Ezekiel Newren via GitGitGadget
2026-03-25 21:11 ` [PATCH v2 1/5] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-25 21:11 ` [PATCH v2 2/5] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-25 21:11 ` [PATCH v2 3/5] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-25 21:11 ` [PATCH v2 4/5] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
2026-03-25 21:11 ` [PATCH v2 5/5] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-25 21:58 ` Junio C Hamano
2026-03-26 6:26 ` [PATCH v2 0/5] Xdiff cleanup part 3 SZEDER Gábor
2026-03-27 19:23 ` [PATCH v3 0/6] " Ezekiel Newren via GitGitGadget
2026-03-27 19:23 ` [PATCH v3 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-27 19:23 ` [PATCH v3 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-03-27 19:23 ` [PATCH v3 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-27 19:23 ` [PATCH v3 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-27 21:09 ` Junio C Hamano
2026-03-27 23:01 ` Junio C Hamano
2026-03-27 19:23 ` [PATCH v3 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-27 19:23 ` [PATCH v3 6/6] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7d6dd22d-286e-4f7b-a211-44e00e711401@gmail.com \
--to=phillip.wood123@gmail.com \
--cc=ezekielnewren@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=phillip.wood@dunelm.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox