From: Junio C Hamano <gitster@pobox.com>
To: "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Yee Cheng Chin" <ychin.git@gmail.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"René Scharfe" <l.s.r@web.de>, "Jeff King" <peff@peff.net>,
	"D. Ben Knoble" <ben.knoble@gmail.com>,
	"Ezekiel Newren" <ezekielnewren@gmail.com>
Subject: Re: [PATCH v3 4/6] xdiff/xdl_cleanup_records: make limits more clear
Date: Fri, 27 Mar 2026 16:01:02 -0700
Message-ID: <xmqqcy0oj2s1.fsf@gitster.g>
In-Reply-To: <xmqqy0jdhtd0.fsf@gitster.g> (Junio C. Hamano's message of "Fri, 27 Mar 2026 14:09:47 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> "Ezekiel Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Ezekiel Newren <ezekielnewren@gmail.com>
>>
>> Make the handling of per-file limits and the minimal-case clearer.
>>   * Use explicit per-file limit variables (mlim1, mlim2) and initialize
>>     them.
>>   * The additional condition `!need_min` is redundant now, remove it.
>> Best viewed with --color-words.
>>
>> Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
>> ---
>>  xdiff/xprepare.c | 19 ++++++++++++-------
>>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> t4071 and t8015 do not like this step, even though they are happy
> with 1-3/6 applied.
>
>
>> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
>> index 386668a92d..2cf1f8d1a8 100644
>> --- a/xdiff/xprepare.c
>> +++ b/xdiff/xprepare.c
>> @@ -268,7 +268,7 @@ static bool xdl_clean_mmatch(uint8_t const *action, ptrdiff_t i, ptrdiff_t s, pt
>>   * might be potentially discarded if they appear in a run of discardable.
>>   */
>>  static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
>> -	ptrdiff_t i, nm, mlim;
>> +	ptrdiff_t i, nm, mlim1, mlim2;

Ah, the problem may manifest itself in this step in the series, but
the root cause might be before this step.  ptrdiff_t is signed and
that is the type used for mlim/mlim1/mlim2 here, and before this
series these counters were of type "long", which is also signed.

>> +	if (need_min) {
>> +		/* i.e. infinity */
>> +		mlim1 = SIZE_MAX;
>> +		mlim2 = SIZE_MAX;

But SIZE_MAX is the maximum that a size_t (unsigned) can take.  No
wonder it does not work to assign it to a ptrdiff_t and assume that
no other sensible ptrdiff_t value can ever reach it.  Instead, this
essentially assigns -1 to mlim1 and mlim2 when need_min is true.
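
To illustrate the conversion described above, here is a minimal
sketch.  The helper names are made up for this example and are not
part of xdiff.  Strictly speaking the result of converting an
out-of-range unsigned value to a signed type is
implementation-defined; the sketch assumes the usual two's-complement
platforms where size_t and ptrdiff_t have the same width.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store SIZE_MAX in a signed ptrdiff_t, the way
 * the patch does for the need_min case.
 */
static ptrdiff_t infinity_as_ptrdiff(void)
{
	/*
	 * Implementation-defined conversion; with same-width
	 * two's-complement types the all-ones bit pattern reads as -1.
	 */
	return (ptrdiff_t)SIZE_MAX;
}

/* Does a count nm (never negative) reach the supposed "infinity"? */
static bool reaches_limit(ptrdiff_t nm)
{
	return nm >= infinity_as_ptrdiff();
}
```

Under that assumption every non-negative count "reaches" the limit,
which is the opposite of what an infinite limit is meant to express.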

>> +	} else {
>> +		mlim1 = XDL_MIN(xdl_bogosqrt(xdf1->nrec), XDL_MAX_EQLIMIT);
>> +		mlim2 = XDL_MIN(xdl_bogosqrt(xdf2->nrec), XDL_MAX_EQLIMIT);

This side I do not think has much to do with the breakage, but the
way XDL_MIN() is implemented, it must be noted that xdl_bogosqrt()
is called twice on the same value with this rewrite ...

>> +	}
>> +
>>  	/*
>>  	 * Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
>>  	 */
>> -	if ((mlim = (long)xdl_bogosqrt((uint64_t)xdf1->nrec)) > XDL_MAX_EQLIMIT)
>> -		mlim = XDL_MAX_EQLIMIT;

... as opposed to computing the value only once, in the original.
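
The double evaluation can be seen with a small sketch.  MY_MIN() is
defined locally and is only assumed to have the same shape as
XDL_MIN(); slow_sqrt() is a stand-in for xdl_bogosqrt() that counts
how often it is called.  None of these names exist in xdiff.

```c
#include <stdint.h>

/* Assumed to mirror the shape of XDL_MIN(); defined here for illustration. */
#define MY_MIN(a, b) ((a) < (b) ? (a) : (b))

static int calls; /* number of times slow_sqrt() has run */

/* Stand-in for xdl_bogosqrt(): rough square root by bit shifting. */
static uint64_t slow_sqrt(uint64_t n)
{
	uint64_t i;

	calls++;
	for (i = 1; n > 0; n >>= 2)
		i <<= 1;
	return i;
}

/* The macro expands its first argument in the test and in the result. */
static uint64_t limit_macro(uint64_t nrec, uint64_t cap)
{
	return MY_MIN(slow_sqrt(nrec), cap);
}

/* Computing into a temporary evaluates the function exactly once. */
static uint64_t limit_once(uint64_t nrec, uint64_t cap)
{
	uint64_t v = slow_sqrt(nrec);

	return v > cap ? cap : v;
}
```

Note that with this macro shape the second call happens only when the
computed value is the smaller of the two; when the cap wins, the
function runs once.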

>>  	for (i = xdf1->dstart; i <= xdf1->dend; i++) {
>>  		size_t mph1 = xdf1->recs[i].minimal_perfect_hash;
>>  		rcrec = cf->rcrecs[mph1];
>>  		nm = rcrec ? rcrec->len2 : 0;
>> -		action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;

So the original said, "if nm is not zero and need_min is true, do
not bother comparing nm with anything, and always use KEEP.  If
need_min is false, we use INVESTIGATE only when nm is large enough,
otherwise KEEP".

>> +		action1[i] = (nm == 0) ? DISCARD: nm >= mlim1 ? INVESTIGATE: KEEP;

Updated code, when nm is not zero, does something different.  If
need_min is true, mlim1 is set to -1 and presumably nm is a count or
length that is bounded on its lower end with 0, so it is larger than
mlim1 (== -1), and we always take INVESTIGATE and never KEEP.
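
The difference between the two expressions can be checked in
isolation.  The sketch below copies the two ternaries from the patch
into standalone functions; the enum and the function names are
stand-ins invented for this example, and -1 is passed in directly as
the value mlim1 is assumed to end up with.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the action values used in xprepare.c. */
enum action { DISCARD, KEEP, INVESTIGATE };

/* Preimage: need_min is tested explicitly. */
static enum action classify_old(ptrdiff_t nm, ptrdiff_t mlim, bool need_min)
{
	return (nm == 0) ? DISCARD :
	       (nm >= mlim && !need_min) ? INVESTIGATE : KEEP;
}

/* Postimage: need_min is expected to be folded into mlim. */
static enum action classify_new(ptrdiff_t nm, ptrdiff_t mlim)
{
	return (nm == 0) ? DISCARD : (nm >= mlim) ? INVESTIGATE : KEEP;
}
```

With need_min set and a nonzero nm, classify_old() gives KEEP no
matter what the limit is, while classify_new() with a limit of -1
gives INVESTIGATE.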

So the rewritten code is broken when need_min is true?

I suspect the remainder of the patch is broken exactly the same way,
so the remedy would be similar?
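
One possible shape of such a remedy, offered only as a sketch and not
as what the series should necessarily do, is to pick the sentinel from
the signed type itself.  pick_limit() and wants_investigate() are
hypothetical names; the square root and the cap are passed in as
plain arguments to keep the example self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the per-file limit.  PTRDIFF_MAX is the
 * largest value a ptrdiff_t can hold, so it can stand in for
 * "infinity" without changing sign.
 */
static ptrdiff_t pick_limit(bool need_min, ptrdiff_t sqrt_nrec,
			    ptrdiff_t max_eqlimit)
{
	if (need_min)
		return PTRDIFF_MAX;
	return sqrt_nrec < max_eqlimit ? sqrt_nrec : max_eqlimit;
}

/* Same comparison as in the postimage, minus the DISCARD case. */
static bool wants_investigate(ptrdiff_t nm, ptrdiff_t mlim)
{
	return nm != 0 && nm >= mlim;
}
```

A count equal to PTRDIFF_MAX would still compare as reaching the
limit, so this relies on nm never getting that large in practice.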

>>  	}
>>  
>> -	if ((mlim = (long)xdl_bogosqrt((uint64_t)xdf2->nrec)) > XDL_MAX_EQLIMIT)
>> -		mlim = XDL_MAX_EQLIMIT;
>>  	for (i = xdf2->dstart; i <= xdf2->dend; i++) {
>>  		size_t mph2 = xdf2->recs[i].minimal_perfect_hash;
>>  		rcrec = cf->rcrecs[mph2];
>>  		nm = rcrec ? rcrec->len1 : 0;
>> -		action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
>> +		action2[i] = (nm == 0) ? DISCARD: nm >= mlim2 ? INVESTIGATE: KEEP;
>>  	}
>>  
>>  	/*
