All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Ezekiel Newren via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Ezekiel Newren <ezekielnewren@gmail.com>
Subject: Re: [PATCH 03/10] xdiff: don't waste time guessing the number of lines
Date: Tue, 20 Jan 2026 15:02:28 +0000	[thread overview]
Message-ID: <208da094-8a5d-4f16-b42b-5d5204576b5f@gmail.com> (raw)
In-Reply-To: <53e4840c1653772379dc8d5c883b34717b81ac43.1767379944.git.gitgitgadget@gmail.com>

On 02/01/2026 18:52, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <ezekielnewren@gmail.com>
> 
> All lines must be read anyway, so classify them after they're read in.
> Also move the memset() into xdl_init_classifier().

So instead of looping over the input lines one and a bit times (the bit 
being from xdl_guess_lines) we now loop over them twice as we split them 
first and then classify them in a separate loop. It does save some work 
not to call xdl_guess_lines but it is unclear if that offsets 
classifying them in a separate loop.

> +	for (size_t i = 0; i < xe->xdf1.nrec; i++) {
> +		xrecord_t *rec = &xe->xdf1.recs[i];
> +		xdl_classify_record(1, &cf, rec);

We seem to have lost the error handling if xdl_classify_record() fails.

Thanks

Phillip

> +	}
> +
> +	for (size_t i = 0; i < xe->xdf2.nrec; i++) {
> +		xrecord_t *rec = &xe->xdf2.recs[i];
> +		xdl_classify_record(2, &cf, rec);
>   	}
>   
>   	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 77ee1ad9c8..b3d51197c1 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -118,26 +118,6 @@ void *xdl_cha_alloc(chastore_t *cha) {
>   	return data;
>   }
>   
> -long xdl_guess_lines(mmfile_t *mf, long sample) {
> -	long nl = 0, size, tsize = 0;
> -	char const *data, *cur, *top;
> -
> -	if ((cur = data = xdl_mmfile_first(mf, &size))) {
> -		for (top = data + size; nl < sample && cur < top; ) {
> -			nl++;
> -			if (!(cur = memchr(cur, '\n', top - cur)))
> -				cur = top;
> -			else
> -				cur++;
> -		}
> -		tsize += (long) (cur - data);
> -	}
> -
> -	if (nl && tsize)
> -		nl = xdl_mmfile_size(mf) / (tsize / nl);
> -
> -	return nl + 1;
> -}
>   
>   int xdl_blankline(const char *line, long size, long flags)
>   {
> diff --git a/xdiff/xutils.h b/xdiff/xutils.h
> index 615b4a9d35..d800840dd0 100644
> --- a/xdiff/xutils.h
> +++ b/xdiff/xutils.h
> @@ -31,7 +31,6 @@ int xdl_emit_diffrec(char const *rec, long size, char const *pre, long psize,
>   int xdl_cha_init(chastore_t *cha, long isize, long icount);
>   void xdl_cha_free(chastore_t *cha);
>   void *xdl_cha_alloc(chastore_t *cha);
> -long xdl_guess_lines(mmfile_t *mf, long sample);
>   int xdl_blankline(const char *line, long size, long flags);
>   int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
>   uint64_t xdl_hash_record_verbatim(uint8_t const **data, uint8_t const *top);


  reply	other threads:[~2026-01-20 15:02 UTC|newest]

Thread overview: 124+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-02 18:52 [PATCH 00/10] Xdiff cleanup part 3 Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 01/10] ivec: introduce the C side of ivec Ezekiel Newren via GitGitGadget
2026-01-04  5:32   ` Junio C Hamano
2026-01-17 16:06     ` Ezekiel Newren
2026-01-08 14:34   ` Phillip Wood
2026-01-15 15:55     ` Ezekiel Newren
2026-01-16 10:39       ` Phillip Wood
2026-01-16 20:19         ` René Scharfe
2026-01-17 13:55           ` Phillip Wood
2026-01-17 16:04             ` Ezekiel Newren
2026-01-18 14:58               ` René Scharfe
2026-01-17 16:14         ` Ezekiel Newren
2026-01-17 16:16           ` Ezekiel Newren
2026-01-17 17:40           ` Phillip Wood
2026-01-19  5:59             ` Jeff King
2026-01-19 20:21               ` Ezekiel Newren
2026-01-19 20:40                 ` Jeff King
2026-01-20  2:36                   ` D. Ben Knoble
2026-01-21 21:00                   ` Ezekiel Newren
2026-01-21 21:20                     ` Jeff King
2026-01-21 21:31                       ` Junio C Hamano
2026-01-21 21:45                         ` Ezekiel Newren
2026-01-20 13:46               ` Phillip Wood
2026-01-20 14:06       ` Phillip Wood
2026-01-21 21:39         ` Ezekiel Newren
2026-01-28 11:15           ` Phillip Wood
2026-01-16 20:19   ` René Scharfe
2026-01-17 15:58     ` Ezekiel Newren
2026-01-18 14:55       ` René Scharfe
2026-01-02 18:52 ` [PATCH 02/10] xdiff: make classic diff explicit by creating xdl_do_classic_diff() Ezekiel Newren via GitGitGadget
2026-01-20 15:01   ` Phillip Wood
2026-01-21 21:05     ` Ezekiel Newren
2026-01-02 18:52 ` [PATCH 03/10] xdiff: don't waste time guessing the number of lines Ezekiel Newren via GitGitGadget
2026-01-20 15:02   ` Phillip Wood [this message]
2026-01-21 21:12     ` Ezekiel Newren
2026-01-22 10:16       ` Phillip Wood
2026-01-02 18:52 ` [PATCH 04/10] xdiff: let patience and histogram benefit from xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 15:02   ` Phillip Wood
2026-01-21 14:49     ` Phillip Wood
2026-01-02 18:52 ` [PATCH 05/10] xdiff: use xdfenv_t in xdl_trim_ends() and xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 06/10] xdiff: cleanup xdl_trim_ends() Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 07/10] xdiff: replace xdfile_t.dstart with xdfenv_t.delta_start Ezekiel Newren via GitGitGadget
2026-01-20 16:32   ` Phillip Wood
2026-01-28 10:51     ` Phillip Wood
2026-01-02 18:52 ` [PATCH 08/10] xdiff: replace xdfile_t.dend with xdfenv_t.delta_end Ezekiel Newren via GitGitGadget
2026-01-02 18:52 ` [PATCH 09/10] xdiff: remove dependence on xdlclassifier from xdl_cleanup_records() Ezekiel Newren via GitGitGadget
2026-01-16 20:19   ` René Scharfe
2026-01-17 16:34     ` Ezekiel Newren
2026-01-18 18:23       ` René Scharfe
2026-01-21 15:01   ` Phillip Wood
2026-01-02 18:52 ` [PATCH 10/10] xdiff: move xdl_cleanup_records() from xprepare.c to xdiffi.c Ezekiel Newren via GitGitGadget
2026-01-21 15:01   ` Phillip Wood
2026-01-28 10:56     ` Phillip Wood
2026-01-04  2:44 ` [PATCH 00/10] Xdiff cleanup part 3 Junio C Hamano
2026-01-04  6:01 ` Yee Cheng Chin
2026-01-28 14:40 ` Phillip Wood
2026-03-06 23:03 ` Junio C Hamano
2026-03-09 19:06   ` Ezekiel Newren
2026-03-09 23:31     ` Junio C Hamano
2026-03-25 21:11 ` [PATCH v2 0/5] " Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 1/5] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 2/5] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 3/5] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 4/5] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
2026-03-25 21:11   ` [PATCH v2 5/5] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-25 21:58     ` Junio C Hamano
2026-03-26  6:26   ` [PATCH v2 0/5] Xdiff cleanup part 3 SZEDER Gábor
2026-03-27 19:23   ` [PATCH v3 0/6] " Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-27 21:09       ` Junio C Hamano
2026-03-27 23:01         ` Junio C Hamano
2026-03-30 16:00           ` Ezekiel Newren
2026-03-30 19:59             ` Junio C Hamano
2026-03-31  1:29               ` Ezekiel Newren
2026-03-27 19:23     ` [PATCH v3 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-27 19:23     ` [PATCH v3 6/6] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
2026-03-30 16:59     ` [PATCH v4 0/6] Xdiff cleanup part 3 Ezekiel Newren via GitGitGadget
2026-03-30 16:59       ` [PATCH v4 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-03-30 17:23         ` Ezekiel Newren
2026-03-30 22:53         ` Junio C Hamano
2026-03-30 16:59       ` [PATCH v4 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-03-30 22:59         ` Junio C Hamano
2026-03-30 17:00       ` [PATCH v4 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-03-30 17:00       ` [PATCH v4 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-03-31  9:44         ` Phillip Wood
2026-03-31 16:13           ` Junio C Hamano
2026-04-14 21:58           ` Ezekiel Newren
2026-04-14 22:15             ` Junio C Hamano
2026-04-15 13:54               ` Phillip Wood
2026-03-30 17:00       ` [PATCH v4 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-03-30 23:02         ` Junio C Hamano
2026-03-31  9:44           ` Phillip Wood
2026-03-30 17:00       ` [PATCH v4 6/6] xdiff/xdl_cleanup_records: simplify INVESTIGATE handling for clarity Ezekiel Newren via GitGitGadget
2026-03-31  9:43         ` Phillip Wood
2026-04-01 16:00         ` Phillip Wood
2026-03-30 23:04       ` [PATCH v4 0/6] Xdiff cleanup part 3 Junio C Hamano
2026-03-31  9:45         ` Phillip Wood
2026-04-08 20:26       ` [PATCH v5 " Ezekiel Newren via GitGitGadget
2026-04-08 20:26         ` [PATCH v5 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-04-08 20:26         ` [PATCH v5 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-04-08 20:26         ` [PATCH v5 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-04-08 20:26         ` [PATCH v5 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-04-14 10:09           ` Phillip Wood
2026-04-08 20:26         ` [PATCH v5 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-04-08 20:26         ` [PATCH v5 6/6] xdiff/xdl_cleanup_records: put braces around the else clause Ezekiel Newren via GitGitGadget
2026-04-08 21:28         ` [PATCH v5 0/6] Xdiff cleanup part 3 Junio C Hamano
2026-04-09 14:01           ` Phillip Wood
2026-04-14 10:08         ` Phillip Wood
2026-04-14 17:06           ` Junio C Hamano
2026-04-29 22:08         ` [PATCH v6 " Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 1/6] xdiff/xdl_cleanup_records: delete local recs pointer Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 2/6] xdiff: use unambiguous types in xdl_bogo_sqrt() Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 3/6] xdiff/xdl_cleanup_records: use unambiguous types Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 4/6] xdiff/xdl_cleanup_records: make limits more clear Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 5/6] xdiff/xdl_cleanup_records: make setting action easier to follow Ezekiel Newren via GitGitGadget
2026-04-29 22:08           ` [PATCH v6 6/6] xdiff/xdl_cleanup_records: make execution of " Ezekiel Newren via GitGitGadget
2026-04-30 13:35           ` [PATCH v6 0/6] Xdiff cleanup part 3 Phillip Wood
2026-04-30 21:08             ` Ezekiel Newren
2026-05-04  0:59             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=208da094-8a5d-4f16-b42b-5d5204576b5f@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=ezekielnewren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.