From: shejialuo <shejialuo@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Karthik Nayak <karthik.188@gmail.com>,
	git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v2 2/4] string-list: replace negative index encoding with "exact_match" parameter
Date: Sun, 5 Oct 2025 22:06:13 +0800
Message-ID: <aOJ7VTkmFgtuCjvf@ArchLinux>
In-Reply-To: <xmqq348dovi3.fsf@gitster.g>

On Tue, Sep 23, 2025 at 11:48:36AM -0700, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> 
> > shejialuo <shejialuo@gmail.com> writes:
> >
> >> We would return negative index to indicate exact match by converting the
> >> original positive index to be "-1 - index" in
> >> "string_list_find_insert_index", which requires callers to decode this
> >> information. This approach has several limitations:
> >>
> >
> > Nit: It would be nice to start by explaining what
> > "string_list_find_insert_index" does and then talking about the negative
> > index. Perhaps something like:
> >
> >   The `string_list_find_insert_index()` function is used to determine
> >   the correct insertion index for a new string within the string list.
> >   The function also doubles up to convey if the string is already
> >   existing in the list, this is done by returning a negative index
> >   "-1 -index". Users are expected to decode this information.
> 
> Yeah, such an introductory statement would help those who are not
> familiar with the convention.  Thanks for suggesting it.
> 
> >> 1. It prevents us from using the full range of size_t, which is
> >>    necessary for large string list.
> 
> It is a disease to think that countable things must be counted in
> size_t and it needs to be somehow cured.
> 
> It is a type to count the size of memory allocations, nothing more.
> If you are holding 1000-bytes per the stuff you are counting, you
> would not need the full range of size_t --- you'll run out of
> memory way before you fill size_t with the things you are counting.
> 

Makes sense. We don't need the full range of size_t. I will improve the
commit message later.
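
For readers not familiar with the convention, here is a minimal sketch
in C of the two styles discussed above. It works on a plain sorted array
rather than on `struct string_list`, and the names
`find_insert_index_encoded` and `find_insert_index_exact` are made up
for illustration; they are not the functions in string-list.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-ins for illustration only. "items" must be
 * sorted in strcmp() order.
 */

/* Old style: an exact match at position i is reported as "-1 - i". */
int find_insert_index_encoded(const char **items, int nr, const char *key)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(key, items[mid]);

		if (!cmp)
			return -1 - mid;	/* found: caller must decode */
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;			/* not found: insertion point */
}

/* New style: the index is never negative; a flag reports "found". */
size_t find_insert_index_exact(const char **items, size_t nr,
			       const char *key, bool *exact_match)
{
	size_t lo = 0, hi = nr;

	if (exact_match)
		*exact_match = false;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(key, items[mid]);

		if (!cmp) {
			if (exact_match)
				*exact_match = true;
			return mid;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;
}
```

With the encoded form, a caller that gets a negative value has to undo
the "-1 - index" step itself, while the flag form leaves the return
value usable as an index in both cases.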

> When there is no external constraints (like you need to specify
> exact size to describe a file format to be interoperable), the most
> appropriate type to count things in is a platform natural "int".
> You wouldn't be handling billions of strings in string-list anyway
> (and that is smaller than half of 32-bit size_t; 64-bit size_t is
> much larger).
> 
> >> 2. Using int for indices while other parts of the codebase use size_t
> >>    creates signed comparison warnings when these values are compared.
> 
> The other thing may be (mis)using size_t when it should not be.  If
> they were also using "int" that would also squelch the warnings from
> "-Wsign-compare".
> 

At first, I found the above quite hard to understand. After reading the
mail below and the blog

    https://staticthinking.wordpress.com/2023/07/25/wsign-compare-is-garbage/

I get your point. For sign compare warnings, the most intuitive fix is
to change all `int`s to `unsigned int`s, or all `unsigned int`s to
`int`s. However, this is a bad approach, as it does not solve the
problem (and it might cause other problems). Instead of simply making
the types match, we should check the value range.

In some cases, it is always OK to compare `size_t` with `int`, as long
as the `size_t` value does not overflow `int`. Sign compare warnings
might be a hint. My commit message is bad because it only says that
others are `unsigned`, so we should change too. The real point is that,
at present, the returned index may overflow, because we may have an
index that is greater than what `int` can hold. I will improve the
commit message later.
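
To make the "check the value range" point concrete, here is a small
sketch in C. The helper names are hypothetical and exist only for this
example. It assumes a common platform where `int` is converted to
`size_t` in a mixed comparison; the explicit cast in the naive version
spells out that conversion.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Naive mixed comparison: a negative "a" turns into a huge unsigned
 * value, so "-1 < 3" is reported as false. This is the situation
 * that -Wsign-compare points at.
 */
bool less_than_naive(int a, size_t b)
{
	return (size_t)a < b;
}

/* Check the value range first, then compare in one signedness. */
bool less_than_checked(int a, size_t b)
{
	return a < 0 || (size_t)a < b;
}

/* A size_t index can be returned as "int" only up to INT_MAX. */
bool index_fits_int(size_t idx)
{
	return idx <= (size_t)INT_MAX;
}
```

Changing the types alone would only hide the first case; the last helper
shows the condition under which an `int` return value would overflow.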

> For an amusing read:
> 
>   https://lore.kernel.org/lkml/CAHk-=wg+_6eQnLWm-kihFxJo1_EmyLSGruKVGzuRUwACE=osrA@mail.gmail.com/

Thanks a lot for this insightful hint.

Jialuo
