All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Toon Claes <toon@iotcl.com>
Cc: git@vger.kernel.org,  Karthik Nayak <karthik.188@gmail.com>,
	 Justin Tobler <jltobler@gmail.com>,
	 Taylor Blau <me@ttaylorr.com>,
	 "D. Ben Knoble" <ben.knoble@gmail.com>,
	 Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2] last-modified: implement faster algorithm
Date: Tue, 21 Oct 2025 10:52:11 -0700	[thread overview]
Message-ID: <xmqqy0p4uoqc.fsf@gitster.g> (raw)
In-Reply-To: <20251021-b4-toon-last-modified-faster-v2-1-f6dcbc26fc5c@iotcl.com> (Toon Claes's message of "Tue, 21 Oct 2025 14:56:19 +0200")

Toon Claes <toon@iotcl.com> writes:

> +static size_t path_idx(struct last_modified *lm, char *path)
> +{
> +	struct last_modified_entry *ent;
> +	ent = hashmap_get_entry_from_hash(&lm->paths, strhash(path), path,
> +					  struct last_modified_entry, hashent);
> +
> +	return ent ? ent->diff_idx : -1;
> +}

size_t is unsigned and cannot reutrn -1 sanely, unless the caller
knows that ((size_t)-1) signals an error.  The compiler warns, and
we compile with -Werror, so we end up getting

    builtin/last-modified.c: In function 'path_idx':
    builtin/last-modified.c:235:38: error: operand of '?:' changes signedness from 'int' to 'size_t' {aka 'long unsigned int'} due to unsignedness of other operand [-Werror=sign-compare]
      235 |         return ent ? ent->diff_idx : -1;
          |                                      ^~

> +static void pass_to_parent(struct last_modified *lm,
> +			   struct bitmap *c,
> +			   struct bitmap *p,
> +			   size_t pos)
> +{
> +	bitmap_unset(c, pos);
> +	bitmap_set(p, pos);
> +}

Mark lm as UNUSED, or we'd get 

    builtin/last-modified.c: In function 'pass_to_parent':
    builtin/last-modified.c:238:50: error: unused parameter 'lm' [-Werror=unused-parameter]
      238 | static void pass_to_parent(struct last_modified *lm,
          |                            ~~~~~~~~~~~~~~~~~~~~~~^~

> @@ -220,42 +273,195 @@ static bool maybe_changed_path(struct last_modified *lm, struct commit *origin)
>  	return false;
>  }
>  
> +static void process_parent(struct last_modified *lm,
> +			   struct prio_queue *queue,
> +			   struct commit *c, struct bitmap *active_c,
> +			   struct commit *parent, int parent_i)
> +{
> +	size_t i;
> +...
> +	for (i = 0; i < diff_queued_diff.nr; i++) {

Our -Wsign-compare forces the compiler to give a stupid warning
here, that i is unsigned and diff_queued_diff.nr is signed.  

    builtin/last-modified.c: In function 'process_parent':
    builtin/last-modified.c:304:23: error: comparison of integer expressions of different signedness: 'size_t' {aka 'long unsigned int'} and 'int' [-Werror=sign-compare]
      304 |         for (i = 0; i < diff_queued_diff.nr; i++) {
          |                       ^

Yes, it is comparing unsigned with signed but so what?

As people may now know, my preference is to wean ourselves off of
this dogmatic trust in -Wsign-compare but those who disagree and want
to remove #define DISABLE_SIGN_COMPARE_WARNINGS should help our poor
compiler here by telling it that this comparison is perfectly fine.

We know diff_queued_diff.nr is an int, and i is size_t, but we also
know diff_queued_diff.nr won't be negative (or we have much bigger
problems) and cannot be larger than what size_t can represent.

> +		struct diff_filepair *fp = diff_queued_diff.queue[i];
> +		size_t k = path_idx(lm, fp->two->path);
> +		if (0 <= k && bitmap_get(active_c, k))
> +			bitmap_set(lm->scratch, k);
> +	}

Earlier path_idx() wanted to signal an error by returning negative,
but the type is size_t that is unsigned so it cannot do so.  We
instead get

    builtin/last-modified.c:307:23: error: comparison of unsigned expression in '>= 0' is always true [-Werror=type-limits]
      307 |                 if (0 <= k && bitmap_get(active_c, k))
          |                       ^~

and in this case we actually deserve it (in the sense that this is
not the fault of dogmatic trust in -Wsign-compare; this is caused by
using size_t to count things).

And the solution for this is *not* "size_t" -> "ssize_t", because
ssize_t is not "store half the range of size_t with negative values
reserved for something else like errors".  Its width can be much
narrower (this came up in a separate thread very recently [*]).
Instead we'd need something ugly like

	if (k != (size_t)-1 && bitmap_get(active_c, k))

A quick band-aid patch to make it compile is attached at the end,
but it does not try to address the root causes, which are abuse of
size_t as count_t and religious use of "-Wsign-compare" [*].


[Reference]

* https://lore.kernel.org/git/9eafee4d-ea94-4382-ada0-58000d229d2e@gmail.com/
* https://staticthinking.wordpress.com/2023/07/25/wsign-compare-is-garbage/


 builtin/last-modified.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git c/builtin/last-modified.c w/builtin/last-modified.c
index e9050485a9..6135bcc584 100644
--- c/builtin/last-modified.c
+++ w/builtin/last-modified.c
@@ -232,10 +232,10 @@ static size_t path_idx(struct last_modified *lm, char *path)
 	ent = hashmap_get_entry_from_hash(&lm->paths, strhash(path), path,
 					  struct last_modified_entry, hashent);
 
-	return ent ? ent->diff_idx : -1;
+	return ent ? ent->diff_idx : (size_t)-1;
 }
 
-static void pass_to_parent(struct last_modified *lm,
+static void pass_to_parent(struct last_modified *lm UNUSED,
 			   struct bitmap *c,
 			   struct bitmap *p,
 			   size_t pos)
@@ -278,7 +278,6 @@ static void process_parent(struct last_modified *lm,
 			   struct commit *c, struct bitmap *active_c,
 			   struct commit *parent, int parent_i)
 {
-	size_t i;
 	struct bitmap *active_p;
 
 	repo_parse_commit(lm->rev.repo, parent);
@@ -301,13 +300,13 @@ static void process_parent(struct last_modified *lm,
 	 * First, collect all paths that are *not* TREESAME in 'scratch'.
 	 * Then, pass paths that *are* TREESAME and active to the parent.
 	 */
-	for (i = 0; i < diff_queued_diff.nr; i++) {
+	for (int i = 0; i < diff_queued_diff.nr; i++) {
 		struct diff_filepair *fp = diff_queued_diff.queue[i];
 		size_t k = path_idx(lm, fp->two->path);
-		if (0 <= k && bitmap_get(active_c, k))
+		if (k != (size_t)-1 && bitmap_get(active_c, k))
 			bitmap_set(lm->scratch, k);
 	}
-	for (i = 0; i < lm->all_paths_nr; i++) {
+	for (size_t i = 0; i < lm->all_paths_nr; i++) {
 		if (bitmap_get(active_c, i) && !bitmap_get(lm->scratch, i))
 			pass_to_parent(lm, active_c, active_p, i);
 	}


  reply	other threads:[~2025-10-21 17:52 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-16  8:39 [PATCH] last-modified: implement faster algorithm Toon Claes
2025-10-16 18:51 ` Justin Tobler
2025-10-17 10:38   ` Toon Claes
2025-10-16 20:48 ` D. Ben Knoble
2025-10-17 10:45   ` Toon Claes
2025-10-16 23:38 ` Taylor Blau
2025-10-17  6:30   ` Jeff King
2025-10-17 14:54     ` Taylor Blau
2025-10-21  8:20       ` Jeff King
2025-10-17 12:07   ` Toon Claes
2025-10-21  9:04     ` Toon Claes
2025-10-23 23:59       ` Taylor Blau
2025-10-21 13:00     ` Toon Claes
2025-10-23 23:56     ` Taylor Blau
2025-10-27 15:48       ` Toon Claes
2025-10-17  6:37 ` Jeff King
2025-10-17 10:47   ` Toon Claes
2025-10-21 12:56 ` [PATCH v2] " Toon Claes
2025-10-21 17:52   ` Junio C Hamano [this message]
2025-10-22  0:26     ` Taylor Blau
2025-10-22  0:28       ` Taylor Blau
2025-10-22  3:48       ` Junio C Hamano
2025-10-24  0:01         ` Taylor Blau
2025-10-24  0:37           ` Junio C Hamano
2025-10-27 19:22             ` Taylor Blau
2025-10-29 13:01               ` Toon Claes
2025-10-23  8:01     ` Toon Claes
2025-10-23  7:50   ` [PATCH v3] " Toon Claes
2025-10-24  0:03     ` Taylor Blau
2025-10-27  7:03       ` Toon Claes
2025-11-03 15:47   ` [PATCH v4] " Toon Claes
2025-11-03 16:44     ` Junio C Hamano
2025-11-04 15:08       ` Toon Claes
2025-11-19 11:34     ` t8020-last-modified.sh failure on s390x (Re: [PATCH v4] last-modified: implement faster algorithm) Anders Kaseorg
2025-11-19 13:49       ` Kristoffer Haugsbakk
2025-11-19 20:06         ` Anders Kaseorg
2025-11-20  8:16           ` Jeff King
2025-11-28 16:45             ` Toon Claes
2025-11-28 17:35               ` Kristoffer Haugsbakk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy0p4uoqc.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=ben.knoble@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jltobler@gmail.com \
    --cc=karthik.188@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=stolee@gmail.com \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.