git.vger.kernel.org archive mirror
From: shejialuo <shejialuo@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	Karthik Nayak <karthik.188@gmail.com>,
	git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v2 2/4] string-list: replace negative index encoding with "exact_match" parameter
Date: Sun, 5 Oct 2025 22:11:08 +0800
Message-ID: <aOJ8fAZVQ8y1oMgR@ArchLinux>
In-Reply-To: <20250924053601.GC1173044@coredump.intra.peff.net>

On Wed, Sep 24, 2025 at 01:36:01AM -0400, Jeff King wrote:
> On Tue, Sep 23, 2025 at 11:48:36AM -0700, Junio C Hamano wrote:
> 
> > >> 1. It prevents us from using the full range of size_t, which is
> > >>    necessary for large string lists.
> > 
> > It is a disease to think that countable things must be counted in
> > size_t and it needs to be somehow cured.
> > 
> > It is a type to count the size of memory allocations, nothing more.
> > If you are holding 1000-bytes per the stuff you are counting, you
> > would not need the full range of size_t --- you'll run out of
> > memory way before you fill size_t with the things you are counting.
> > 
> > When there are no external constraints (like you need to specify
> > exact size to describe a file format to be interoperable), the most
> > appropriate type to count things in is a platform natural "int".
> > You wouldn't be handling billions of strings in string-list anyway
> > (and that is smaller than half of 32-bit size_t; 64-bit size_t is
> > much larger).
> 
> I agree that size_t is much more than one needs for counting most
> things. But the problem is that "int" is much too small, if you are
> worried about malicious input causing integer overflows that could cause
> memory access errors.
> 
> A nice property of counting everything as size_t is that if we are
> storing even a single byte per item, we will fail to allocate before
> hitting an integer overflow. So no, we do not expect to store billions
> of strings. But it is not that hard to convince Git to allocate billions
> of items in a list on a 64-bit system with 32-bit ints. And it is nice
> to know that iterating over them or trying to extend the array will
> never hit an integer overflow bug.
> 

Makes sense.
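
A minimal sketch of the property described above, under stated
assumptions: both helpers are hypothetical and are not taken from Git.
The first checks whether an item count still fits in an "int"; the
second computes an allocation size and fails instead of wrapping around.
With a size_t counter and at least one byte stored per item, the
allocation fails long before the counter itself can overflow, which is
not true for a 32-bit "int" counter on a 64-bit system.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Git: can a count of "nr" items be
 * stored in an "int" without truncation?
 */
static int count_fits_in_int(size_t nr)
{
	return nr <= (size_t)INT_MAX;
}

/*
 * Hypothetical helper, not part of Git: compute nr * item_size and
 * store it in "out". Returns -1 instead of letting the multiplication
 * wrap around, 0 on success.
 */
static int checked_alloc_size(size_t nr, size_t item_size, size_t *out)
{
	if (item_size && nr > SIZE_MAX / item_size)
		return -1;
	*out = nr * item_size;
	return 0;
}
```

A caller that tracks its item count in size_t only needs the second
check when growing the array. A caller that tracks it in "int" also has
to reject any count for which count_fits_in_int() returns 0.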

> I'd say the "right" size for preventing overflows probably only needs to
> be 58-60 bits or so, since usually we are storing more than one byte
> (plus overhead). But 64-bit is the natural machine word size that
> matches what we want. However, we should _not_ be worried about losing
> one bit to making it signed, especially if that makes it less
> error-prone to convert instances of "int" to use "size_t". I would be
> surprised if an attacker could convince a program to truly use up half
> of its address space.
> 
> > >> 2. Using int for indices while other parts of the codebase use size_t
> > >>    creates signed comparison warnings when these values are compared.
> > 
> > The other thing may be (mis)using size_t when it should not be.  If
> > they were also using "int" that would also squelch the warnings from
> > "-Wsign-compare".
> 
> So I really care only about truncation and overflow above. Sign issues
> can cause bugs, of course, but the real issue is the size mismatch
> between "int" and "size_t". And while -Wsign-compare is sometimes an
> easy way to find those mismatches (because of the sign mismatch between
> them), it may bring more hassle than it's worth.
> 

That's right. I will improve my commit message to state the correct
motivation.
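
For illustration, here is a rough sketch of the interface shape named in
the subject line. It assumes a plain sorted array of C strings; the
function name and signature are hypothetical and this is not the code in
string-list.c. The index is returned as size_t and the "found" state is
reported through a separate "exact_match" flag, so no negative value has
to be packed into the return type and nothing is truncated to "int".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the lookup discussed in this series.
 * Returns the position at which "key" is found, or at which it would
 * have to be inserted to keep "items" sorted. If "exact_match" is not
 * NULL, it is set to true when "key" is already present and to false
 * otherwise.
 */
static size_t find_insert_index(const char **items, size_t nr,
				const char *key, bool *exact_match)
{
	size_t left = 0, right = nr;

	while (left < right) {
		/* Written this way so that left + right cannot overflow. */
		size_t middle = left + (right - left) / 2;
		int cmp = strcmp(key, items[middle]);

		if (cmp < 0) {
			right = middle;
		} else if (cmp > 0) {
			left = middle + 1;
		} else {
			if (exact_match)
				*exact_match = true;
			return middle;
		}
	}

	if (exact_match)
		*exact_match = false;
	return left;
}
```

Callers that only need the insertion point can pass NULL for
"exact_match"; callers that need to tell "found" from "not found" read
the flag instead of testing the sign of the result.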

Thanks,
Jialuo
