* [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
@ 2025-09-07 16:40 shejialuo
  2025-09-08 16:56 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: shejialuo @ 2025-09-07 16:40 UTC
  To: git

"string_list_find_insert_index" signals an exact match by returning a
negative value: it encodes the original non-negative index as
"-1 - index", which callers must then decode.

This is bad for the following reasons:

1. Callers need to convert the negative value back to the original
   index, which requires them to understand the implementation details
   of the function.
2. Because we have to return a negative index, the return type must be
   `int` instead of `size_t`, which causes sign-compare warnings.

Refactor "string_list_find_insert_index" to use an output parameter
"exact_match" to indicate an exact match rather than encoding it in a
negative return value.

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 add-interactive.c | 7 ++++---
 mailmap.c         | 7 +++----
 refs.c            | 2 +-
 string-list.c     | 8 ++------
 string-list.h     | 2 +-
 5 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/add-interactive.c b/add-interactive.c
index 3e692b47ec..9a42b3b38b 100644
--- a/add-interactive.c
+++ b/add-interactive.c
@@ -221,7 +221,8 @@ static void find_unique_prefixes(struct prefix_item_list *list)
 
 static ssize_t find_unique(const char *string, struct prefix_item_list *list)
 {
-	int index = string_list_find_insert_index(&list->sorted, string, 1);
+	int exact_match;
+	int index = string_list_find_insert_index(&list->sorted, string, &exact_match);
 	struct string_list_item *item;
 
 	if (list->items.nr != list->sorted.nr)
@@ -229,8 +230,8 @@ static ssize_t find_unique(const char *string, struct prefix_item_list *list)
 		    " vs %"PRIuMAX")",
 		    (uintmax_t)list->items.nr, (uintmax_t)list->sorted.nr);
 
-	if (index < 0)
-		item = list->sorted.items[-1 - index].util;
+	if (exact_match)
+		item = list->sorted.items[index].util;
 	else if (index > 0 &&
 		 starts_with(list->sorted.items[index - 1].string, string))
 		return -1;
diff --git a/mailmap.c b/mailmap.c
index 56c72102d9..253517cdf6 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -243,10 +243,9 @@ void clear_mailmap(struct string_list *map)
 static struct string_list_item *lookup_prefix(struct string_list *map,
 					      const char *string, size_t len)
 {
-	int i = string_list_find_insert_index(map, string, 1);
-	if (i < 0) {
-		/* exact match */
-		i = -1 - i;
+	int exact_match;
+	int i = string_list_find_insert_index(map, string, &exact_match);
+	if (exact_match) {
 		if (!string[len])
 			return &map->items[i];
 		/*
diff --git a/refs.c b/refs.c
index 4ff55cf24f..f1ff5bf846 100644
--- a/refs.c
+++ b/refs.c
@@ -1699,7 +1699,7 @@ const char *find_descendant_ref(const char *dirname,
 	 * with dirname (remember, dirname includes the trailing
 	 * slash) and is not in skip, then we have a conflict.
 	 */
-	for (pos = string_list_find_insert_index(extras, dirname, 0);
+	for (pos = string_list_find_insert_index(extras, dirname, NULL);
 	     pos < extras->nr; pos++) {
 		const char *extra_refname = extras->items[pos].string;
 
diff --git a/string-list.c b/string-list.c
index bf358d1a5c..224bc182ff 100644
--- a/string-list.c
+++ b/string-list.c
@@ -92,13 +92,9 @@ int string_list_has_string(const struct string_list *list, const char *string)
 }
 
 int string_list_find_insert_index(const struct string_list *list, const char *string,
-				  int negative_existing_index)
+				  int *exact_match)
 {
-	int exact_match;
-	int index = get_entry_index(list, string, &exact_match);
-	if (exact_match)
-		index = -1 - (negative_existing_index ? index : 0);
-	return index;
+	return get_entry_index(list, string, exact_match);
 }
 
 struct string_list_item *string_list_lookup(struct string_list *list, const char *string)
diff --git a/string-list.h b/string-list.h
index 2b438c7733..03c7009472 100644
--- a/string-list.h
+++ b/string-list.h
@@ -174,7 +174,7 @@ void string_list_remove_empty_items(struct string_list *list, int free_util);
 /** Determine if the string_list has a given string or not. */
 int string_list_has_string(const struct string_list *list, const char *string);
 int string_list_find_insert_index(const struct string_list *list, const char *string,
-				  int negative_existing_index);
+				  int *exact_match);
 
 /**
  * Insert a new element to the string_list. The returned pointer can
-- 
2.51.0
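
For readers unfamiliar with the two conventions discussed in this patch, the
following is a minimal, standalone sketch that contrasts them. It is not the
git implementation: `find_with_flag` and `find_encoded` are hypothetical
names, and a plain sorted array of C strings (assumed to hold no duplicates)
stands in for `struct string_list`.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: binary search over a sorted array of C strings.
 * Returns the index of `key` if present, otherwise the index at which
 * it would be inserted. When `exact_match` is non-NULL, it is set to 1
 * on an exact match and to 0 otherwise.
 */
static int find_with_flag(const char *const *items, int nr,
			  const char *key, int *exact_match)
{
	int lo = 0, hi = nr;

	assert(items && key && nr >= 0);
	if (exact_match)
		*exact_match = 0;
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(key, items[mid]);

		if (cmp < 0) {
			hi = mid;
		} else if (cmp > 0) {
			lo = mid + 1;
		} else {
			if (exact_match)
				*exact_match = 1;
			return mid;
		}
	}
	return lo;
}

/*
 * The older convention: an exact match at `index` is reported as
 * "-1 - index", so the caller has to decode it again.
 */
static int find_encoded(const char *const *items, int nr, const char *key)
{
	int exact_match;
	int index = find_with_flag(items, nr, key, &exact_match);

	return exact_match ? -1 - index : index;
}
```

With the array { "apple", "banana", "cherry" }, looking up "banana" through
`find_encoded` yields -2, which the caller must turn back into 1 via
"-1 - result"; `find_with_flag` returns 1 directly and reports the match
through the flag.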



* [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
  2025-09-07 16:40 [PATCH 0/4] enhance string-list API to fix sign compare warnings shejialuo
@ 2025-09-07 16:42 ` shejialuo
  2025-09-09  6:22   ` Patrick Steinhardt
  0 siblings, 1 reply; 6+ messages in thread
From: shejialuo @ 2025-09-07 16:42 UTC
  To: git

"string_list_find_insert_index" signals an exact match by returning a
negative value: it encodes the original non-negative index as
"-1 - index", which callers must then decode.

This is bad for the following reasons:

1. Callers need to convert the negative value back to the original
   index, which requires them to understand the implementation details
   of the function.
2. Because we have to return a negative index, the return type must be
   `int` instead of `size_t`, which causes sign-compare warnings.

Refactor "string_list_find_insert_index" to use an output parameter
"exact_match" to indicate an exact match rather than encoding it in a
negative return value.

Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 add-interactive.c | 7 ++++---
 mailmap.c         | 7 +++----
 refs.c            | 2 +-
 string-list.c     | 8 ++------
 string-list.h     | 2 +-
 5 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/add-interactive.c b/add-interactive.c
index 3e692b47ec..9a42b3b38b 100644
--- a/add-interactive.c
+++ b/add-interactive.c
@@ -221,7 +221,8 @@ static void find_unique_prefixes(struct prefix_item_list *list)
 
 static ssize_t find_unique(const char *string, struct prefix_item_list *list)
 {
-	int index = string_list_find_insert_index(&list->sorted, string, 1);
+	int exact_match;
+	int index = string_list_find_insert_index(&list->sorted, string, &exact_match);
 	struct string_list_item *item;
 
 	if (list->items.nr != list->sorted.nr)
@@ -229,8 +230,8 @@ static ssize_t find_unique(const char *string, struct prefix_item_list *list)
 		    " vs %"PRIuMAX")",
 		    (uintmax_t)list->items.nr, (uintmax_t)list->sorted.nr);
 
-	if (index < 0)
-		item = list->sorted.items[-1 - index].util;
+	if (exact_match)
+		item = list->sorted.items[index].util;
 	else if (index > 0 &&
 		 starts_with(list->sorted.items[index - 1].string, string))
 		return -1;
diff --git a/mailmap.c b/mailmap.c
index 56c72102d9..253517cdf6 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -243,10 +243,9 @@ void clear_mailmap(struct string_list *map)
 static struct string_list_item *lookup_prefix(struct string_list *map,
 					      const char *string, size_t len)
 {
-	int i = string_list_find_insert_index(map, string, 1);
-	if (i < 0) {
-		/* exact match */
-		i = -1 - i;
+	int exact_match;
+	int i = string_list_find_insert_index(map, string, &exact_match);
+	if (exact_match) {
 		if (!string[len])
 			return &map->items[i];
 		/*
diff --git a/refs.c b/refs.c
index 4ff55cf24f..f1ff5bf846 100644
--- a/refs.c
+++ b/refs.c
@@ -1699,7 +1699,7 @@ const char *find_descendant_ref(const char *dirname,
 	 * with dirname (remember, dirname includes the trailing
 	 * slash) and is not in skip, then we have a conflict.
 	 */
-	for (pos = string_list_find_insert_index(extras, dirname, 0);
+	for (pos = string_list_find_insert_index(extras, dirname, NULL);
 	     pos < extras->nr; pos++) {
 		const char *extra_refname = extras->items[pos].string;
 
diff --git a/string-list.c b/string-list.c
index bf358d1a5c..224bc182ff 100644
--- a/string-list.c
+++ b/string-list.c
@@ -92,13 +92,9 @@ int string_list_has_string(const struct string_list *list, const char *string)
 }
 
 int string_list_find_insert_index(const struct string_list *list, const char *string,
-				  int negative_existing_index)
+				  int *exact_match)
 {
-	int exact_match;
-	int index = get_entry_index(list, string, &exact_match);
-	if (exact_match)
-		index = -1 - (negative_existing_index ? index : 0);
-	return index;
+	return get_entry_index(list, string, exact_match);
 }
 
 struct string_list_item *string_list_lookup(struct string_list *list, const char *string)
diff --git a/string-list.h b/string-list.h
index 2b438c7733..03c7009472 100644
--- a/string-list.h
+++ b/string-list.h
@@ -174,7 +174,7 @@ void string_list_remove_empty_items(struct string_list *list, int free_util);
 /** Determine if the string_list has a given string or not. */
 int string_list_has_string(const struct string_list *list, const char *string);
 int string_list_find_insert_index(const struct string_list *list, const char *string,
-				  int negative_existing_index);
+				  int *exact_match);
 
 /**
  * Insert a new element to the string_list. The returned pointer can
-- 
2.51.0



* Re: [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
  2025-09-07 16:40 [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter shejialuo
@ 2025-09-08 16:56 ` Junio C Hamano
  2025-09-15 12:24   ` shejialuo
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2025-09-08 16:56 UTC
  To: shejialuo; +Cc: git

shejialuo <shejialuo@gmail.com> writes:

> We would return negative index to indicate exact match by converting the
> original positive index to be "-1 - index" in
> "string_list_find_insert_index", which requires callers to decode this
> information.
>
> This is bad due to the following reasons:
>
> 1. The callers need to convert the negative index back to the original
>    positive value, which requires the callers to understand the detail
>    of the function.

That has pretty much been the convention so far; I am not convinced
that it is "bad" at all.

> 2. As we have to return negative index, we need to specify the return
>    type to be `int` instead of `size_t`, which would cause sign compare
>    warnings.

That sounds more like the tail wagging the dog.

Construct your argument the other way around, perhaps?

 - We NEED to be able to use the full range of size_t to express the
   index in the array string_list holds for SUCH AND SUCH REASONS.
   But string_list_find_insert_index() uses "int", which may not be
   large enough to cover the range size_t covers.

 - In addition, in order to signal that the returned value for a
   query is about an existing entry in the array, or a location that
   an entry would be inserted at, we use a signed int and use the
   bog standard "-1 - index" encoding for this purpose.  This
   further halves the range of valid array indices.

 - To allow us to use the full range of size_t, use full size_t for
   the index, and have a separate bit to tell if that index is about
   an existing entry, or where the queried entry would be stored at
   if we inserted it.

Your argument does not justify the first point, your desire to use
size_t in the first place, and that is what makes it sound
backwards, I think.
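
To make the third bullet concrete, here is a rough sketch of what such an
interface could look like. It is an illustration only: `find_insert_pos` and
`count_with_prefix` are hypothetical names, and a sorted array of C strings
without duplicates stands in for the real string_list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the proposed interface: the return value is
 * always a plain position in [0, nr], and the "found" bit travels
 * separately through *exact_match (which may be NULL).
 */
static size_t find_insert_pos(const char *const *items, size_t nr,
			      const char *key, int *exact_match)
{
	size_t lo = 0, hi = nr;

	assert(key && (items || !nr));
	if (exact_match)
		*exact_match = 0;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(key, items[mid]);

		if (!cmp) {
			if (exact_match)
				*exact_match = 1;
			return mid;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;
}

/*
 * Count the entries that start with `prefix`. Both `pos` and `nr` are
 * size_t, so the loop condition compares two unsigned values and needs
 * no cast.
 */
static size_t count_with_prefix(const char *const *items, size_t nr,
				const char *prefix)
{
	size_t len = strlen(prefix), count = 0;

	for (size_t pos = find_insert_pos(items, nr, prefix, NULL);
	     pos < nr && !strncmp(items[pos], prefix, len); pos++)
		count++;
	return count;
}
```

Because no value of the return type is reserved for the "found" signal, the
whole range of size_t remains available for positions.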


* Re: [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
  2025-09-07 16:42 ` [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter shejialuo
@ 2025-09-09  6:22   ` Patrick Steinhardt
  2025-09-15 12:11     ` shejialuo
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick Steinhardt @ 2025-09-09  6:22 UTC
  To: shejialuo; +Cc: git

On Mon, Sep 08, 2025 at 12:42:29AM +0800, shejialuo wrote:
> diff --git a/mailmap.c b/mailmap.c
> index 56c72102d9..253517cdf6 100644
> --- a/mailmap.c
> +++ b/mailmap.c
> @@ -243,10 +243,9 @@ void clear_mailmap(struct string_list *map)
>  static struct string_list_item *lookup_prefix(struct string_list *map,
>  					      const char *string, size_t len)
>  {
> -	int i = string_list_find_insert_index(map, string, 1);
> -	if (i < 0) {
> -		/* exact match */
> -		i = -1 - i;
> +	int exact_match;
> +	int i = string_list_find_insert_index(map, string, &exact_match);
> +	if (exact_match) {
>  		if (!string[len])
>  			return &map->items[i];
>  		/*

Yeah, this looks much cleaner compared to before.

> diff --git a/string-list.c b/string-list.c
> index bf358d1a5c..224bc182ff 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -92,13 +92,9 @@ int string_list_has_string(const struct string_list *list, const char *string)
>  }
>  
>  int string_list_find_insert_index(const struct string_list *list, const char *string,
> -				  int negative_existing_index)
> +				  int *exact_match)
>  {
> -	int exact_match;
> -	int index = get_entry_index(list, string, &exact_match);
> -	if (exact_match)
> -		index = -1 - (negative_existing_index ? index : 0);
> -	return index;
> +	return get_entry_index(list, string, exact_match);
>  }
>  
>  struct string_list_item *string_list_lookup(struct string_list *list, const char *string)

Okay, this here is where the preceding patch comes from, as some callers
pass `NULL` to `string_list_find_insert_index()`.

> diff --git a/string-list.h b/string-list.h
> index 2b438c7733..03c7009472 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -174,7 +174,7 @@ void string_list_remove_empty_items(struct string_list *list, int free_util);
>  /** Determine if the string_list has a given string or not. */
>  int string_list_has_string(const struct string_list *list, const char *string);
>  int string_list_find_insert_index(const struct string_list *list, const char *string,
> -				  int negative_existing_index);
> +				  int *exact_match);
>  

Makes me wonder whether we want to use `bool *exact_match` now to hint
that this is really only a true/false value? If so, we'd also have to
adapt the signature in the preceding commit.

Patrick
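
A caller-side sketch of the `bool *exact_match` idea, under stated
assumptions: `find_pos` and `lookup` are hypothetical names, the search is
linear for brevity, and a sorted array of C strings stands in for the real
string_list. The point is only that the out-parameter reads as a plain
true/false value and that callers who do not care may pass NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical search helper: returns the position of `key` in the
 * sorted array, or the position where it would be inserted. When
 * `exact_match` is non-NULL, it is set to true only on an exact match.
 */
static size_t find_pos(const char *const *items, size_t nr,
		       const char *key, bool *exact_match)
{
	size_t pos = 0;
	int cmp = 1;

	assert(key && (items || !nr));
	/* Advance while `key` sorts strictly after the current entry. */
	while (pos < nr && (cmp = strcmp(key, items[pos])) > 0)
		pos++;
	if (exact_match)
		*exact_match = pos < nr && !cmp;
	return pos;
}

/* Return the matching entry, or NULL when `key` is absent. */
static const char *lookup(const char *const *items, size_t nr,
			  const char *key)
{
	bool exact_match;
	size_t pos = find_pos(items, nr, key, &exact_match);

	return exact_match ? items[pos] : NULL;
}
```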


* Re: [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
  2025-09-09  6:22   ` Patrick Steinhardt
@ 2025-09-15 12:11     ` shejialuo
  0 siblings, 0 replies; 6+ messages in thread
From: shejialuo @ 2025-09-15 12:11 UTC
  To: Patrick Steinhardt; +Cc: git

On Tue, Sep 09, 2025 at 08:22:56AM +0200, Patrick Steinhardt wrote:
> > index 2b438c7733..03c7009472 100644
> > --- a/string-list.h
> > +++ b/string-list.h
> > @@ -174,7 +174,7 @@ void string_list_remove_empty_items(struct string_list *list, int free_util);
> >  /** Determine if the string_list has a given string or not. */
> >  int string_list_has_string(const struct string_list *list, const char *string);
> >  int string_list_find_insert_index(const struct string_list *list, const char *string,
> > -				  int negative_existing_index);
> > +				  int *exact_match);
> >  
> 
> Makes me wonder whether we want to use `bool *exact_match` now to hint
> that this is really only a true/false value? If so, we'd also have to
> adapt the signature in the preceding commit.
> 

That's right, I think `bool *` would be much better. I will improve
this in the next version.

Thanks,
Jialuo


* Re: [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter
  2025-09-08 16:56 ` Junio C Hamano
@ 2025-09-15 12:24   ` shejialuo
  0 siblings, 0 replies; 6+ messages in thread
From: shejialuo @ 2025-09-15 12:24 UTC
  To: Junio C Hamano; +Cc: git

On Mon, Sep 08, 2025 at 09:56:08AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > We would return negative index to indicate exact match by converting the
> > original positive index to be "-1 - index" in
> > "string_list_find_insert_index", which requires callers to decode this
> > information.
> >
> > This is bad due to the following reasons:
> >
> > 1. The callers need to convert the negative index back to the original
> >    positive value, which requires the callers to understand the detail
> >    of the function.
> 
> That has pretty much been the convention so far, not convincing that
> it is "bad" at all.
> 

Good point. I tend to use emotionally loaded words, which is not
suitable here. And after googling, I realized that this is indeed the
convention.

> > 2. As we have to return negative index, we need to specify the return
> >    type to be `int` instead of `size_t`, which would cause sign compare
> >    warnings.
> 
> That sounds more like the tail wagging the dog.
> 
> Construct your argument the other way around, perhaps?
> 
>  - We NEED to be able to use the full range of size_t to express the
>    index in the array string_list holds for SUCH AND SUCH REASONS.
>    But string_list_find_insert_index() uses "int", which may not be
>    large enough to cover the range size_t covers.
> 
>  - In addition, in order to signal that the returned value for a
>    query is about an existing entry in the array, or a location that
>    an entry would be inserted at, we use a signed int and use the
>    bog standard "-1 - index" encoding for this purpose.  This
>    further halves the range of valid array index.
> 
>  - To allow us to use the full range of size_t, use full size_t for
>    the index, and have a separate bit to tell if that index is about
>    an existing entry, or where the queried entry would be stored at
>    if we inserted it.
> 
> Your argument does not justify the first point, your desire to use
> size_t in the first place, and that is what makes it sound
> backwards, I think..

Thanks for the suggestion, I will improve this in the next version.

Thanks,
Jialuo

