From: "René Scharfe" <l.s.r@web.de>
To: Ezekiel Newren <ezekielnewren@gmail.com>
Cc: Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH 09/10] xdiff: remove dependence on xdlclassifier from xdl_cleanup_records()
Date: Sun, 18 Jan 2026 19:23:48 +0100
Message-ID: <914e4157-557e-4ea4-9b17-b6b1cb078283@web.de>
In-Reply-To: <CAH=ZcbDw0_Od3+zuGLsy3Z=bLR-4ByH8Fguiuw_MyLTi=U7gcQ@mail.gmail.com>

On 1/17/26 5:34 PM, Ezekiel Newren wrote:
> On Fri, Jan 16, 2026 at 1:19 PM René Scharfe <l.s.r@web.de> wrote:
>>
>> On 1/2/26 7:52 PM, Ezekiel Newren via GitGitGadget wrote:
>>> @@ -253,22 +250,44 @@ static bool xdl_clean_mmatch(uint8_t const *action, long i, long s, long e) {
>>> return rpdis1 * XDL_KPDIS_RUN < (rpdis1 + rdis1);
>>> }
>>>
>>> +struct xoccurrence
>>> +{
>>> + size_t file1, file2;
>>> +};
>>> +
>>> +
>>> +DEFINE_IVEC_TYPE(struct xoccurrence, xoccurrence);
>>> +
>>>
>>> /*
>>> * Try to reduce the problem complexity, discard records that have no
>>> * matches on the other file. Also, lines that have multiple matches
>>> * might be potentially discarded if they appear in a run of discardable.
>>> */
>>> -static int xdl_cleanup_records(xdlclassifier_t *cf, xdfenv_t *xe) {
>>> - long i, nm, mlim;
>>> +static int xdl_cleanup_records(xdfenv_t *xe, uint64_t flags) {
>>> + long i;
>>> + size_t nm, mlim;
>>> xrecord_t *recs;
>>> - xdlclass_t *rcrec;
>>> uint8_t *action1 = NULL, *action2 = NULL;
>>> - bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
>>> + struct IVec_xoccurrence occ;
>>> + bool need_min = !!(flags & XDF_NEED_MINIMAL);
>>> int ret = 0;
>>> ptrdiff_t dend1 = xe->xdf1.nrec - 1 - xe->delta_end;
>>> ptrdiff_t dend2 = xe->xdf2.nrec - 1 - xe->delta_end;
>>>
>>> + IVEC_INIT(occ);
>>> + ivec_zero(&occ, xe->mph_size);
>>
>> This array is presized here. It is neither grown nor shrunken.
>> CALLOC_ARRAY would work just as well, at least at this point, no?
>>
>>> +
>>> + for (size_t j = 0; j < xe->xdf1.nrec; j++) {
>>> + size_t mph1 = xe->xdf1.recs[j].minimal_perfect_hash;
>>> + occ.ptr[mph1].file1 += 1;
>>> + }
>>> +
>>> + for (size_t j = 0; j < xe->xdf2.nrec; j++) {
>>> + size_t mph2 = xe->xdf2.recs[j].minimal_perfect_hash;
>>> + occ.ptr[mph2].file2 += 1;
>>> + }
>>> +
>>> /*
>>> * Create temporary arrays that will help us decide if
>>> * changed[i] should remain false, or become true.
>>> @@ -288,16 +307,14 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfenv_t *xe) {
>>> if ((mlim = xdl_bogosqrt((long)xe->xdf1.nrec)) > XDL_MAX_EQLIMIT)
>>> mlim = XDL_MAX_EQLIMIT;
>>> for (i = xe->delta_start, recs = &xe->xdf1.recs[xe->delta_start]; i <= dend1; i++, recs++) {
>>> - rcrec = cf->rcrecs[recs->minimal_perfect_hash];
>>> - nm = rcrec ? rcrec->len2 : 0;
>>> + nm = occ.ptr[recs->minimal_perfect_hash].file2;
>>> action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
>>> }
>>>
>>> if ((mlim = xdl_bogosqrt((long)xe->xdf2.nrec)) > XDL_MAX_EQLIMIT)
>>> mlim = XDL_MAX_EQLIMIT;
>>> for (i = xe->delta_start, recs = &xe->xdf2.recs[xe->delta_start]; i <= dend2; i++, recs++) {
>>> - rcrec = cf->rcrecs[recs->minimal_perfect_hash];
>>> - nm = rcrec ? rcrec->len1 : 0;
>>> + nm = occ.ptr[recs->minimal_perfect_hash].file1;
>>> action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
>>> }
>>>
>>> @@ -332,6 +349,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfenv_t *xe) {
>>> cleanup:
>>> xdl_free(action1);
>>> xdl_free(action2);
>>> + ivec_free(&occ);
>>>
>>> return ret;
>>> }
>
> In Rust the memory management macros defined in git-compat-util.h will
> not be available. ivec was built expressly to bridge the gap between C
> and Rust. I'm avoiding using those macros because I'm trying to get C
> programmers familiar with how Rust's Vec operates without forcing them
> to read and write in Rust. Also, it makes converting from IVec to Vec
> super easy.
>
> ivec_zero() also sets length and capacity. Also CALLOC_ARRAY needs to
> know the type of the pointer which ivec_zero() does not have access
> to. This is one of the few ivec functions that does not have a direct
> equivalent in Rust's Vec, but is faster than what is logically
> equivalent in Rust.
>
> In Rust the closest safe equivalent would look like:
>
> let size = 35;
> let mut vec = Vec::<u64>::new();
> vec.reserve_exact(size);
> vec.resize(size, 0); // requires that T implements the `Clone` trait
>
> The unsafe version would look like:
>
> let size = 35;
> let mut vec = Vec::<u64>::new();
> vec.reserve_exact(size);
> unsafe {
>     // write_bytes() counts in units of T, not in bytes
>     std::ptr::write_bytes(vec.as_mut_ptr(), 0, size);
>     vec.set_len(size);
> }

I was unclear and made a few assumptions here. My point was just
that this is a fixed-size array and doesn't need to be stored in a
variable-sized container. This is the first ivec user, and I would have
expected it to exercise the push function. I assume accessing a
fixed-size array via FFI would be a lot easier since allocation and
growth are out of the picture.

René
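
For illustration, the fixed-size alternative raised above could look
roughly like the sketch below. It is not taken from the patch: it assumes
plain calloc() in place of git's CALLOC_ARRAY, and count_occurrences() and
its parameters are hypothetical names. Only the struct xoccurrence layout
and the two counting loops mirror the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same layout as struct xoccurrence in the quoted patch. */
struct xoccurrence {
	size_t file1, file2;
};

/*
 * Hypothetical helper (not part of the patch): count how often each
 * hash value occurs in each file, using a fixed-size zeroed array.
 *
 * mph1/mph2 hold one hash per record; every value must be < mph_size,
 * and mph_size must be nonzero. Returns NULL on allocation failure.
 * The caller releases the result with free().
 */
static struct xoccurrence *count_occurrences(const size_t *mph1, size_t n1,
					     const size_t *mph2, size_t n2,
					     size_t mph_size)
{
	/* calloc() zero-fills, so every counter starts at 0. */
	struct xoccurrence *occ = calloc(mph_size, sizeof(*occ));

	if (!occ)
		return NULL;

	for (size_t j = 0; j < n1; j++) {
		assert(mph1[j] < mph_size);
		occ[mph1[j]].file1 += 1;
	}

	for (size_t j = 0; j < n2; j++) {
		assert(mph2[j] < mph_size);
		occ[mph2[j]].file2 += 1;
	}

	return occ;
}
```

The array is sized once and never grows, so a caller only has to pair the
allocation with a single free(); there is no length or capacity to track.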