[PATCH 1/2] strbuf: use st_add3() in strbuf

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
@ 2026-05-14 15:11 René Scharfe
  2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe
  2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano
  0 siblings, 2 replies; 13+ messages in thread
From: René Scharfe @ 2026-05-14 15:11 UTC (permalink / raw)
  To: Git List

Simplify the code by calling st_add3() to do overflow checks instead of
open-coding it.  This changes the error message to include the offending
summands, which can be helpful when tracking down the cause.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 strbuf.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 3e04addc22..bb04d3910e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
 void strbuf_grow(struct strbuf *sb, size_t extra)
 {
 	int new_buf = !sb->alloc;
-	if (unsigned_add_overflows(extra, 1) ||
-	    unsigned_add_overflows(sb->len, extra + 1))
-		die("you want to use way too much memory");
 	if (new_buf)
 		sb->buf = NULL;
-	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
+	ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc);
 	if (new_buf)
 		sb->buf[0] = '\0';
 }
-- 
2.54.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe
@ 2026-05-14 15:13 ` René Scharfe
  2026-05-14 19:12   ` Junio C Hamano
  2026-05-15  4:40   ` Jeff King
  2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano
  1 sibling, 2 replies; 13+ messages in thread
From: René Scharfe @ 2026-05-14 15:13 UTC (permalink / raw)
  To: Git List; +Cc: Jeff King

Clang and GCC optimize away comparisons of overflow checks by checking
the carry flag on x64.  GCC does the same on ARM64, but Clang currently
(version 22.1) doesn't.

Provide a variant of st_add() that wraps __builtin_add_overflow() to
help Clang optimize it.  Use it on all platforms for simplicity.

On an Apple M1 I get a nice speedup for a command that builds lots of
strings using a strbuf, which exercises the st_add3() in strbuf_grow()
for every line of output:

Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)'
  Time (mean ± σ):     119.8 ms ±   0.2 ms    [User: 113.0 ms, System: 5.8 ms]
  Range (min … max):   119.6 ms … 120.4 ms    24 runs

Benchmark 2: ./git cat-file --batch-all-objects --batch-check='%(objectname)'
  Time (mean ± σ):     114.6 ms ±   0.1 ms    [User: 107.6 ms, System: 6.0 ms]
  Range (min … max):   114.4 ms … 114.9 ms    25 runs

Summary
  ./git cat-file --batch-all-objects --batch-check='%(objectname)' ran
    1.05 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname)'

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 git-compat-util.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..aa088d04bb 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -614,6 +614,17 @@ static inline bool strip_suffix(const char *str, const char *suffix,
 int git_open_cloexec(const char *name, int flags);
 #define git_open(name) git_open_cloexec(name, O_RDONLY)
 
+/* Help Clang; GCC generates the same code for both variants. */
+#if defined(__clang__)
+static inline size_t st_add(size_t a, size_t b)
+{
+	size_t sum;
+	if (__builtin_add_overflow(a, b, &sum))
+		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return sum;
+}
+#else
 static inline size_t st_add(size_t a, size_t b)
 {
 	if (unsigned_add_overflows(a, b))
@@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b)
 		    (uintmax_t)a, (uintmax_t)b);
 	return a + b;
 }
+#endif
 #define st_add3(a,b,c)   st_add(st_add((a),(b)),(c))
 #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d))
 
-- 
2.54.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe
@ 2026-05-14 19:12   ` Junio C Hamano
  2026-05-14 20:17     ` René Scharfe
  2026-05-15  4:40   ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2026-05-14 19:12 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Jeff King

René Scharfe <l.s.r@web.de> writes:

> Provide a variant of st_add() that wraps __builtin_add_overflow() to
> help Clang optimize it.  Use it on all platforms for simplicity.
> ...
> +/* Help Clang; GCC generates the same code for both variants. */
> +#if defined(__clang__)
> +static inline size_t st_add(size_t a, size_t b)
> +{
> +	size_t sum;
> +	if (__builtin_add_overflow(a, b, &sum))
> +		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
> +		    (uintmax_t)a, (uintmax_t)b);
> +	return sum;
> +}
> +#else
>  static inline size_t st_add(size_t a, size_t b)
>  {
>  	if (unsigned_add_overflows(a, b))
> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b)
>  		    (uintmax_t)a, (uintmax_t)b);
>  	return a + b;
>  }
> +#endif

Makes me wonder if we tweaked unsigned_add_overflows() to take an
extra *dst parameter to match __builtin_add_overflow(), which of
course requires us to all of 18 callsites, it might make the whole
thing a bit simpler.  New uses of unsigned_add_overflows(), if we
ever add them, would automatically benefit, right?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-14 19:12   ` Junio C Hamano
@ 2026-05-14 20:17     ` René Scharfe
  2026-05-15 16:49       ` René Scharfe
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2026-05-14 20:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jeff King

On 5/14/26 9:12 PM, Junio C Hamano wrote:
> René Scharfe <l.s.r@web.de> writes:
> 
>> Provide a variant of st_add() that wraps __builtin_add_overflow() to
>> help Clang optimize it.  Use it on all platforms for simplicity.
>> ...
>> +/* Help Clang; GCC generates the same code for both variants. */
>> +#if defined(__clang__)
>> +static inline size_t st_add(size_t a, size_t b)
>> +{
>> +	size_t sum;
>> +	if (__builtin_add_overflow(a, b, &sum))
>> +		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
>> +		    (uintmax_t)a, (uintmax_t)b);
>> +	return sum;
>> +}
>> +#else
>>  static inline size_t st_add(size_t a, size_t b)
>>  {
>>  	if (unsigned_add_overflows(a, b))
>> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b)
>>  		    (uintmax_t)a, (uintmax_t)b);
>>  	return a + b;
>>  }
>> +#endif
> 
> Makes me wonder if we tweaked unsigned_add_overflows() to take an
> extra *dst parameter to match __builtin_add_overflow(), which of
> course requires us to all of 18 callsites, it might make the whole
> thing a bit simpler.  New uses of unsigned_add_overflows(), if we
> ever add them, would automatically benefit, right?

Hmm.  It sounds like a lot of churn, but it would make sure that
we use the checked result and not check a + b and then go on and
use x + y because the code de-synced at some point.

How to do it, though?  It needs to be generic and evaluate its
arguments only once.  Perhaps like this?


diff --git a/git-compat-util.h b/git-compat-util.h
index ca89cfb0b3..27fbb622d7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -103,6 +103,21 @@ struct strbuf;
 #define unsigned_add_overflows(a, b) \
     ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+static bool uint_add_overflow(uintmax_t a, uintmax_t b,
+			      uintmax_t *out, size_t out_size)
+{
+	if (b > UINTMAX_MAX - a)
+		return true;
+	a += b;
+	if (a > (UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * out_size)))
+		return true;
+	*out = a;
+	return false;
+}
+
+#define UINT_ADD_OVERFLOW(a, b, out) \
+	uint_add_overflow((a), (b), (out), sizeof(a))
+
 /*
  * Returns true if the multiplication of "a" and "b" will
  * overflow. The types of "a" and "b" must match and must be unsigned.
@@ -616,10 +631,11 @@ int git_open_cloexec(const char *name, int flags);
 
 static inline size_t st_add(size_t a, size_t b)
 {
-	if (unsigned_add_overflows(a, b))
+	size_t ret;
+	if (UINT_ADD_OVERFLOW(a, b, &ret))
 		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
 		    (uintmax_t)a, (uintmax_t)b);
-	return a + b;
+	return ret;
 }
 #define st_add3(a,b,c)   st_add(st_add((a),(b)),(c))
 #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d))


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-14 20:17     ` René Scharfe
@ 2026-05-15 16:49       ` René Scharfe
  0 siblings, 0 replies; 13+ messages in thread
From: René Scharfe @ 2026-05-15 16:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jeff King

On 5/14/26 10:17 PM, RenÃ© Scharfe wrote:
> On 5/14/26 9:12 PM, Junio C Hamano wrote:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> Provide a variant of st_add() that wraps __builtin_add_overflow() to
>>> help Clang optimize it.  Use it on all platforms for simplicity.
>>> ...
>>> +/* Help Clang; GCC generates the same code for both variants. */
>>> +#if defined(__clang__)
>>> +static inline size_t st_add(size_t a, size_t b)
>>> +{
>>> +	size_t sum;
>>> +	if (__builtin_add_overflow(a, b, &sum))
>>> +		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
>>> +		    (uintmax_t)a, (uintmax_t)b);
>>> +	return sum;
>>> +}
>>> +#else
>>>  static inline size_t st_add(size_t a, size_t b)
>>>  {
>>>  	if (unsigned_add_overflows(a, b))
>>> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b)
>>>  		    (uintmax_t)a, (uintmax_t)b);
>>>  	return a + b;
>>>  }
>>> +#endif
>>
>> Makes me wonder if we tweaked unsigned_add_overflows() to take an
>> extra *dst parameter to match __builtin_add_overflow(), which of
>> course requires us to all of 18 callsites, it might make the whole
>> thing a bit simpler.  New uses of unsigned_add_overflows(), if we
>> ever add them, would automatically benefit, right?
> 
> Hmm.  It sounds like a lot of churn, but it would make sure that
> we use the checked result and not check a + b and then go on and
> use x + y because the code de-synced at some point.
> 
> How to do it, though?  It needs to be generic and evaluate its
> arguments only once.  Perhaps like this?
> 
> 
> diff --git a/git-compat-util.h b/git-compat-util.h
> index ca89cfb0b3..27fbb622d7 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -103,6 +103,21 @@ struct strbuf;
>  #define unsigned_add_overflows(a, b) \
>      ((b) > maximum_unsigned_value_of_type(a) - (a))
>  
> +static bool uint_add_overflow(uintmax_t a, uintmax_t b,
> +			      uintmax_t *out, size_t out_size)
> +{
> +	if (b > UINTMAX_MAX - a)
> +		return true;
> +	a += b;
> +	if (a > (UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * out_size)))
> +		return true;
> +	*out = a;
> +	return false;
> +}
> +
> +#define UINT_ADD_OVERFLOW(a, b, out) \
> +	uint_add_overflow((a), (b), (out), sizeof(a))
> +
>  /*
>   * Returns true if the multiplication of "a" and "b" will
>   * overflow. The types of "a" and "b" must match and must be unsigned.
> @@ -616,10 +631,11 @@ int git_open_cloexec(const char *name, int flags);
>  
>  static inline size_t st_add(size_t a, size_t b)
>  {
> -	if (unsigned_add_overflows(a, b))
> +	size_t ret;
> +	if (UINT_ADD_OVERFLOW(a, b, &ret))

Type mismatch of third argument: pointer to size_t given, pointer to
uintmax_t expected.

>  		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
>  		    (uintmax_t)a, (uintmax_t)b);
> -	return a + b;
> +	return ret;
>  }
>  #define st_add3(a,b,c)   st_add(st_add((a),(b)),(c))
>  #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d))
Perhaps like this instead?


diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..23ea42f373 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -103,6 +103,25 @@ struct strbuf;
 #define unsigned_add_overflows(a, b) \
     ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+static inline uintmax_t uint_add_overflow(uintmax_t a, uintmax_t b,
+					  uintmax_t max, bool *overflow)
+{
+	*overflow = a > max || b > max - a;
+	return a + b;
+}
+
+#ifdef __clang__
+#define UINT_ADD_OVERFLOW(a, b, out, overflow) \
+	(*(overflow) = __builtin_add_overflow((a), (b), (out)))
+#else
+#define UINT_ADD_OVERFLOW(a, b, out, overflow) ( \
+	*(out) = uint_add_overflow((a), (b), \
+				   maximum_unsigned_value_of_type(*(out)), \
+				   (overflow)), \
+	*(overflow) \
+)
+#endif
+
 /*
  * Returns true if the multiplication of "a" and "b" will
  * overflow. The types of "a" and "b" must match and must be unsigned.
@@ -616,10 +635,12 @@ int git_open_cloexec(const char *name, int flags);
 
 static inline size_t st_add(size_t a, size_t b)
 {
-	if (unsigned_add_overflows(a, b))
+	bool overflow;
+	size_t ret;
+	if (UINT_ADD_OVERFLOW(a, b, &ret, &overflow))
 		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
 		    (uintmax_t)a, (uintmax_t)b);
-	return a + b;
+	return ret;
 }
 #define st_add3(a,b,c)   st_add(st_add((a),(b)),(c))
 #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d))


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe
  2026-05-14 19:12   ` Junio C Hamano
@ 2026-05-15  4:40   ` Jeff King
  2026-05-15 14:36     ` René Scharfe
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff King @ 2026-05-15  4:40 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

On Thu, May 14, 2026 at 05:13:46PM +0200, René Scharfe wrote:

> Clang and GCC optimize away comparisons of overflow checks by checking
> the carry flag on x64.  GCC does the same on ARM64, but Clang currently
> (version 22.1) doesn't.
> 
> Provide a variant of st_add() that wraps __builtin_add_overflow() to
> help Clang optimize it.  Use it on all platforms for simplicity.

OK. I probably would have just used the intrinsic everywhere with
__GNUC__, but if gcc is already figuring it out, it doesn't matter in
practice.

> +/* Help Clang; GCC generates the same code for both variants. */
> +#if defined(__clang__)
> +static inline size_t st_add(size_t a, size_t b)
> +{
> +	size_t sum;
> +	if (__builtin_add_overflow(a, b, &sum))
> +		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
> +		    (uintmax_t)a, (uintmax_t)b);
> +	return sum;
> +}
> +#else
>  static inline size_t st_add(size_t a, size_t b)
>  {
>  	if (unsigned_add_overflows(a, b))

It's a shame we can't share more code here, especially the die message.

I guess the ideal primitive is probably a wrapper with the same
interface as __builtin_add_overflow(), which could then be used
everywhere that unsigned_add_overflows() with some minor conversion.

But it gets awkward to do as a macro, and using an inline function runs
into type questions.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-15  4:40   ` Jeff King
@ 2026-05-15 14:36     ` René Scharfe
  2026-05-15 16:53       ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2026-05-15 14:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On 5/15/26 6:40 AM, Jeff King wrote:
> On Thu, May 14, 2026 at 05:13:46PM +0200, René Scharfe wrote:
> 
>> Clang and GCC optimize away comparisons of overflow checks by checking
>> the carry flag on x64.  GCC does the same on ARM64, but Clang currently
>> (version 22.1) doesn't.
>>
>> Provide a variant of st_add() that wraps __builtin_add_overflow() to
>> help Clang optimize it.  Use it on all platforms for simplicity.
> 
> OK. I probably would have just used the intrinsic everywhere with
> __GNUC__, but if gcc is already figuring it out, it doesn't matter in
> practice.

Did that initially, but found it hard to justify when it has no benefit
and reduces the test coverage of the hand-made check significantly.

>> +/* Help Clang; GCC generates the same code for both variants. */
>> +#if defined(__clang__)
>> +static inline size_t st_add(size_t a, size_t b)
>> +{
>> +	size_t sum;
>> +	if (__builtin_add_overflow(a, b, &sum))
>> +		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
>> +		    (uintmax_t)a, (uintmax_t)b);
>> +	return sum;
>> +}
>> +#else
>>  static inline size_t st_add(size_t a, size_t b)
>>  {
>>  	if (unsigned_add_overflows(a, b))
> 
> It's a shame we can't share more code here, especially the die message.
> 
> I guess the ideal primitive is probably a wrapper with the same
> interface as __builtin_add_overflow(), which could then be used
> everywhere that unsigned_add_overflows() with some minor conversion.

Junio said the same. :)
> But it gets awkward to do as a macro, and using an inline function runs
> into type questions.
Indeed.  If it was easy then this wouldn't exist as a builtin.  We
can approximate it somewhat, but will it be robust enough?

René


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
  2026-05-15 14:36     ` René Scharfe
@ 2026-05-15 16:53       ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2026-05-15 16:53 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

On Fri, May 15, 2026 at 04:36:15PM +0200, René Scharfe wrote:

> > I guess the ideal primitive is probably a wrapper with the same
> > interface as __builtin_add_overflow(), which could then be used
> > everywhere that unsigned_add_overflows() with some minor conversion.
> 
> Junio said the same. :)
> > But it gets awkward to do as a macro, and using an inline function runs
> > into type questions.
> Indeed.  If it was easy then this wouldn't exist as a builtin.  We
> can approximate it somewhat, but will it be robust enough?

I always thought it was a builtin because the most efficient way
involves checking the carry flag, which can't be accessed from C.

But yeah, the type issues are real, too. ;)

I think we should take what you posted for now, and we can iterate on a
more general add-and-check interface later (or never if it's too
tricky).

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
  2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe
  2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe
@ 2026-05-14 19:07 ` Junio C Hamano
  2026-05-14 20:13   ` René Scharfe
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2026-05-14 19:07 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> Simplify the code by calling st_add3() to do overflow checks instead of
> open-coding it.  This changes the error message to include the offending
> summands, which can be helpful when tracking down the cause.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  strbuf.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index 3e04addc22..bb04d3910e 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
>  void strbuf_grow(struct strbuf *sb, size_t extra)
>  {
>  	int new_buf = !sb->alloc;
> -	if (unsigned_add_overflows(extra, 1) ||
> -	    unsigned_add_overflows(sb->len, extra + 1))
> -		die("you want to use way too much memory");
>  	if (new_buf)
>  		sb->buf = NULL;
> -	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc);

ALLOC_GROW() being a macro that references its second argument three
times, doesn't this rewrite rely on the compiler being clever enough
to notice that checking for the same overflow three times is
pointless and does it only once?  I guess the original has the same
issue already, so this may not be so bad but it makes me feel a bit
queasy.

>  	if (new_buf)
>  		sb->buf[0] = '\0';
>  }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
  2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano
@ 2026-05-14 20:13   ` René Scharfe
  2026-05-15  4:36     ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2026-05-14 20:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

On 5/14/26 9:07 PM, Junio C Hamano wrote:
> René Scharfe <l.s.r@web.de> writes:
> 
>> Simplify the code by calling st_add3() to do overflow checks instead of
>> open-coding it.  This changes the error message to include the offending
>> summands, which can be helpful when tracking down the cause.
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>>  strbuf.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/strbuf.c b/strbuf.c
>> index 3e04addc22..bb04d3910e 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
>>  void strbuf_grow(struct strbuf *sb, size_t extra)
>>  {
>>  	int new_buf = !sb->alloc;
>> -	if (unsigned_add_overflows(extra, 1) ||
>> -	    unsigned_add_overflows(sb->len, extra + 1))
>> -		die("you want to use way too much memory");
>>  	if (new_buf)
>>  		sb->buf = NULL;
>> -	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>> +	ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc);
> 
> ALLOC_GROW() being a macro that references its second argument three
> times, doesn't this rewrite rely on the compiler being clever enough
> to notice that checking for the same overflow three times is
> pointless and does it only once?  I guess the original has the same
> issue already, so this may not be so bad but it makes me feel a bit
> queasy.

As long as it has no side-effect (as is the case for addition) and we
keep compiler optimization enabled we should be fine.

I guess the reason for having multiple references as opposed to
loading the value once into a private variable is to support arbitrary
types.  In the end it needs to fit into a size_t, though.  Something
like this could bring the reference count down to one:

--- 8< ---
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..ca89cfb0b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -812,6 +812,15 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size)
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
+static inline bool st_alloc_nr(size_t nr, size_t alloc, size_t *out)
+{
+	if (nr > alloc) {
+		*out = alloc_nr(alloc) < nr ? nr : alloc_nr(alloc);
+		return true;
+	}
+	return false;
+}
+
 /**
  * Dynamically growing an array using realloc() is error prone and boring.
  *
@@ -857,12 +866,10 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size)
  */
 #define ALLOC_GROW(x, nr, alloc) \
 	do { \
-		if ((nr) > alloc) { \
-			if (alloc_nr(alloc) < (nr)) \
-				alloc = (nr); \
-			else \
-				alloc = alloc_nr(alloc); \
-			REALLOC_ARRAY(x, alloc); \
+		size_t alloc_grow_new_alloc_; \
+		if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \
+			alloc = alloc_grow_new_alloc_; \
+			REALLOC_ARRAY(x, alloc_grow_new_alloc_); \
 		} \
 	} while (0)
 
--- >8 ---

Hmm, alloc_nr() doesn't do any overflow checking.  It should, though,
shouldn't it?

René


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
  2026-05-14 20:13   ` René Scharfe
@ 2026-05-15  4:36     ` Jeff King
  2026-05-15 14:30       ` René Scharfe
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2026-05-15  4:36 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Git List

On Thu, May 14, 2026 at 10:13:19PM +0200, René Scharfe wrote:

> Hmm, alloc_nr() doesn't do any overflow checking.  It should, though,
> shouldn't it?

Yes, probably. It's a known blind spot in the overflow checking, but
I think is OK in practice because:

  1. We are growing an existing buffer by ~3/2. So even with ordering
     the multiplication first, an overflow implies that you have a
     single buffer consuming ~1/3 of your address space.

     On 64-bit systems that's impractically large, and on 32-bit systems I
     think you generally run into fragmentation and address-space issues
     first.

  2. If alloc_nr(alloc) is less than the desired nr, we just use that nr
     directly. So even if we did overflow, I think the result is
     too-slow allocation, and not a buffer overflow.

But it would be nice to be less hand-wavy. One of the reasons I hadn't
dug into it further is that I wanted to start making use of intrinsics
to avoid slowdowns. But since you're already doing that (and finding
that the compiler was doing the fast thing anyway!) it might be a good
time to make the jump.

That's all assuming that no overflow happens before ALLOC_GROW() gets
the values. We also tend to do unchecked computions for the "nr" field
there, but it's usually just "nr_foo + 1", so the same logic applies:
you'd have to have an existing array consuming the entire address space
minus one byte to trigger an overflow.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
  2026-05-15  4:36     ` Jeff King
@ 2026-05-15 14:30       ` René Scharfe
  2026-05-15 16:50         ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2026-05-15 14:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git List

On 5/15/26 6:36 AM, Jeff King wrote:
> On Thu, May 14, 2026 at 10:13:19PM +0200, René Scharfe wrote:
> 
>> Hmm, alloc_nr() doesn't do any overflow checking.  It should, though,
>> shouldn't it?
> 
> Yes, probably. It's a known blind spot in the overflow checking, but
> I think is OK in practice because:
> 
>   1. We are growing an existing buffer by ~3/2. So even with ordering
>      the multiplication first, an overflow implies that you have a
>      single buffer consuming ~1/3 of your address space.
> 
>      On 64-bit systems that's impractically large, and on 32-bit systems I
>      think you generally run into fragmentation and address-space issues
>      first.
> 
>   2. If alloc_nr(alloc) is less than the desired nr, we just use that nr
>      directly. So even if we did overflow, I think the result is
>      too-slow allocation, and not a buffer overflow.
> > But it would be nice to be less hand-wavy. One of the reasons I hadn't
> dug into it further is that I wanted to start making use of intrinsics
> to avoid slowdowns. But since you're already doing that (and finding
> that the compiler was doing the fast thing anyway!) it might be a good
> time to make the jump.

Didn't look at __builtin_mul_overflow() in detail; its situation could
be  different than for __builtin_add_overflow(), which turned out to be
unnecessary on x64.

> That's all assuming that no overflow happens before ALLOC_GROW() gets
> the values. We also tend to do unchecked computions for the "nr" field
> there, but it's usually just "nr_foo + 1", so the same logic applies:
> you'd have to have an existing array consuming the entire address space
> minus one byte to trigger an overflow.
The use in read-cache.c::do_read_index() looks odd.  Has been present
since commit one.  Is the point that it over-allocates to have room for
additions right from the start?  For read-only commands this only wastes
memory, no?

René


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
  2026-05-15 14:30       ` René Scharfe
@ 2026-05-15 16:50         ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2026-05-15 16:50 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Git List

On Fri, May 15, 2026 at 04:30:34PM +0200, René Scharfe wrote:

> > That's all assuming that no overflow happens before ALLOC_GROW() gets
> > the values. We also tend to do unchecked computions for the "nr" field
> > there, but it's usually just "nr_foo + 1", so the same logic applies:
> > you'd have to have an existing array consuming the entire address space
> > minus one byte to trigger an overflow.
>
> The use in read-cache.c::do_read_index() looks odd.  Has been present
> since commit one.  Is the point that it over-allocates to have room for
> additions right from the start?  For read-only commands this only wastes
> memory, no?

Hmm, yeah, that is weird, and unusual to use alloc_nr() directly. We are
presumably picking up istate->cache_nr from the on-disk file, so it
could be anything, and that alloc_nr() could overflow.

We'd store the too-small value in alloc, so we _know_ it's too small. So
later when we use ALLOC_GROW(), the problem would be resolved as we grow
the array. I'm not convinced the initial load might not overflow the
array, though.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-05-15 16:53 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe
2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe
2026-05-14 19:12   ` Junio C Hamano
2026-05-14 20:17     ` René Scharfe
2026-05-15 16:49       ` René Scharfe
2026-05-15  4:40   ` Jeff King
2026-05-15 14:36     ` René Scharfe
2026-05-15 16:53       ` Jeff King
2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano
2026-05-14 20:13   ` René Scharfe
2026-05-15  4:36     ` Jeff King
2026-05-15 14:30       ` René Scharfe
2026-05-15 16:50         ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.