From: Phillip Wood <phillip.wood123@gmail.com>
To: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Cc: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>,
Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
Jeff King <peff@peff.net>
Subject: Re: [PATCH] git-compat-util: introduce `count_t` typedef
Date: Thu, 7 Aug 2025 15:17:14 +0100 [thread overview]
Message-ID: <582e8e75-c6eb-4845-8f3b-62f234f0964f@gmail.com> (raw)
In-Reply-To: <20250807-pks-introduce-count-t-v1-1-e96be52d8db1@pks.im>
Hi Patrick
On 07/08/2025 10:22, Patrick Steinhardt wrote:
> Historically, Git has been very lenient with its use of integer types
> and didn't really give much thought into which type to use in what
> situation. We interchangeably mix and match signed and unsigned types
> and often times blindly convert them. This use has led to several
> out-of-bounds reads and writes in the past, some of which could be
> turned into arbitrary code execution.
My feeling is that one of the main problems has been using different
types for loop indexes and loop limits. If we mandated that the loop
index had to be the same type as the limit that would improve things
considerably and without mandating a particular type.
> A discussion that regularly comes up in this context though is what
> types to use for counting entities:
>
> - One question is whether the type should be signed or unsigned.
> Arguably, the answer should be to use unsigned types as long as we
> know that we never need a negative value, e.g. as a sentinel. This
> helps guide the reader and explicitly conveys the sense that such a
> counter is only ever going to be a non-negative number. Otherwise,
> code would need to be more careful as it may hold negative values.
The counter argument to this is that it is easy to write incorrect loops
when counting down if the loop variable is unsigned. Using a typedef
that hides the actual type makes that harder to spot as it is not
immediately obvious whether the loop index is signed or not. As we have
cases that do need to store a negative value then we're still left with
using a mix of signed and unsigned types for counting in our code base.
> Introduce a new typedef for `count_t` that is of type `uintptr_t` to
> give clear guidance what type to use for counting entities. This type
> was chosen because in the worst case, an entity may be a single byte and
> we fill all of our memory with these entities. As `uintptr_t` is
> guaranteed to hold at least the value of a pointer, we know that it
> could be used to index into every single such entity.
How many sites actually allocate anything like that number of
entities?Generally we use ALLOC_GROW() or ALLOC_GROW_BY() which means
that we're not normally counting bytes. ALLOC_GROW_BY() assumes the
number of entities fits into a size_t so should be be changing that to
use count_t? If we're worried about overflows then maybe we should look
at alloc_nr() which calculates the new allocation with
(nr + 16) * 3 / 2
which which will start overflowing long before we starting allocating
UINTPTR_MAX single byte entities.
Thanks
Phillip
next prev parent reply other threads:[~2025-08-07 14:17 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-07 9:22 [PATCH] git-compat-util: introduce `count_t` typedef Patrick Steinhardt
2025-08-07 11:00 ` Matthias Aßhauer
2025-08-07 14:17 ` Phillip Wood [this message]
2025-08-07 16:43 ` Junio C Hamano
2025-08-07 16:38 ` Junio C Hamano
2025-08-07 22:07 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=582e8e75-c6eb-4845-8f3b-62f234f0964f@gmail.com \
--to=phillip.wood123@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=oswald.buddenhagen@gmx.de \
--cc=peff@peff.net \
--cc=phillip.wood@dunelm.org.uk \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).