* [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() @ 2026-05-14 15:11 René Scharfe 2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe 2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano 0 siblings, 2 replies; 13+ messages in thread From: René Scharfe @ 2026-05-14 15:11 UTC (permalink / raw) To: Git List Simplify the code by calling st_add3() to do overflow checks instead of open-coding it. This changes the error message to include the offending summands, which can be helpful when tracking down the cause. Signed-off-by: René Scharfe <l.s.r@web.de> --- strbuf.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/strbuf.c b/strbuf.c index 3e04addc22..bb04d3910e 100644 --- a/strbuf.c +++ b/strbuf.c @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc) void strbuf_grow(struct strbuf *sb, size_t extra) { int new_buf = !sb->alloc; - if (unsigned_add_overflows(extra, 1) || - unsigned_add_overflows(sb->len, extra + 1)) - die("you want to use way too much memory"); if (new_buf) sb->buf = NULL; - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); + ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc); if (new_buf) sb->buf[0] = '\0'; } -- 2.54.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe @ 2026-05-14 15:13 ` René Scharfe 2026-05-14 19:12 ` Junio C Hamano 2026-05-15 4:40 ` Jeff King 2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano 1 sibling, 2 replies; 13+ messages in thread From: René Scharfe @ 2026-05-14 15:13 UTC (permalink / raw) To: Git List; +Cc: Jeff King Clang and GCC optimize away comparisons of overflow checks by checking the carry flag on x64. GCC does the same on ARM64, but Clang currently (version 22.1) doesn't. Provide a variant of st_add() that wraps __builtin_add_overflow() to help Clang optimize it. Use it on all platforms for simplicity. On an Apple M1 I get a nice speedup for a command that builds lots of strings using a strbuf, which exercises the st_add3() in strbuf_grow() for every line of output: Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)' Time (mean ± σ): 119.8 ms ± 0.2 ms [User: 113.0 ms, System: 5.8 ms] Range (min … max): 119.6 ms … 120.4 ms 24 runs Benchmark 2: ./git cat-file --batch-all-objects --batch-check='%(objectname)' Time (mean ± σ): 114.6 ms ± 0.1 ms [User: 107.6 ms, System: 6.0 ms] Range (min … max): 114.4 ms … 114.9 ms 25 runs Summary ./git cat-file --batch-all-objects --batch-check='%(objectname)' ran 1.05 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname)' Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> --- git-compat-util.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4..aa088d04bb 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -614,6 +614,17 @@ static inline bool strip_suffix(const char *str, const char *suffix, int git_open_cloexec(const char *name, int flags); #define git_open(name) git_open_cloexec(name, O_RDONLY) +/* Help Clang; GCC generates the same code for both variants. */ +#if defined(__clang__) +static inline size_t st_add(size_t a, size_t b) +{ + size_t sum; + if (__builtin_add_overflow(a, b, &sum)) + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, + (uintmax_t)a, (uintmax_t)b); + return sum; +} +#else static inline size_t st_add(size_t a, size_t b) { if (unsigned_add_overflows(a, b)) @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b) (uintmax_t)a, (uintmax_t)b); return a + b; } +#endif #define st_add3(a,b,c) st_add(st_add((a),(b)),(c)) #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d)) -- 2.54.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe @ 2026-05-14 19:12 ` Junio C Hamano 2026-05-14 20:17 ` René Scharfe 2026-05-15 4:40 ` Jeff King 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2026-05-14 19:12 UTC (permalink / raw) To: René Scharfe; +Cc: Git List, Jeff King René Scharfe <l.s.r@web.de> writes: > Provide a variant of st_add() that wraps __builtin_add_overflow() to > help Clang optimize it. Use it on all platforms for simplicity. > ... > +/* Help Clang; GCC generates the same code for both variants. */ > +#if defined(__clang__) > +static inline size_t st_add(size_t a, size_t b) > +{ > + size_t sum; > + if (__builtin_add_overflow(a, b, &sum)) > + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, > + (uintmax_t)a, (uintmax_t)b); > + return sum; > +} > +#else > static inline size_t st_add(size_t a, size_t b) > { > if (unsigned_add_overflows(a, b)) > @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b) > (uintmax_t)a, (uintmax_t)b); > return a + b; > } > +#endif Makes me wonder if we tweaked unsigned_add_overflows() to take an extra *dst parameter to match __builtin_add_overflow(), which of course requires us to all of 18 callsites, it might make the whole thing a bit simpler. New uses of unsigned_add_overflows(), if we ever add them, would automatically benefit, right? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-14 19:12 ` Junio C Hamano @ 2026-05-14 20:17 ` René Scharfe 2026-05-15 16:49 ` René Scharfe 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2026-05-14 20:17 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List, Jeff King On 5/14/26 9:12 PM, Junio C Hamano wrote: > René Scharfe <l.s.r@web.de> writes: > >> Provide a variant of st_add() that wraps __builtin_add_overflow() to >> help Clang optimize it. Use it on all platforms for simplicity. >> ... >> +/* Help Clang; GCC generates the same code for both variants. */ >> +#if defined(__clang__) >> +static inline size_t st_add(size_t a, size_t b) >> +{ >> + size_t sum; >> + if (__builtin_add_overflow(a, b, &sum)) >> + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, >> + (uintmax_t)a, (uintmax_t)b); >> + return sum; >> +} >> +#else >> static inline size_t st_add(size_t a, size_t b) >> { >> if (unsigned_add_overflows(a, b)) >> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b) >> (uintmax_t)a, (uintmax_t)b); >> return a + b; >> } >> +#endif > > Makes me wonder if we tweaked unsigned_add_overflows() to take an > extra *dst parameter to match __builtin_add_overflow(), which of > course requires us to all of 18 callsites, it might make the whole > thing a bit simpler. New uses of unsigned_add_overflows(), if we > ever add them, would automatically benefit, right? Hmm. It sounds like a lot of churn, but it would make sure that we use the checked result and not check a + b and then go on and use x + y because the code de-synced at some point. How to do it, though? It needs to be generic and evaluate its arguments only once. Perhaps like this? diff --git a/git-compat-util.h b/git-compat-util.h index ca89cfb0b3..27fbb622d7 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -103,6 +103,21 @@ struct strbuf; #define unsigned_add_overflows(a, b) \ ((b) > maximum_unsigned_value_of_type(a) - (a)) +static bool uint_add_overflow(uintmax_t a, uintmax_t b, + uintmax_t *out, size_t out_size) +{ + if (b > UINTMAX_MAX - a) + return true; + a += b; + if (a > (UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * out_size))) + return true; + *out = a; + return false; +} + +#define UINT_ADD_OVERFLOW(a, b, out) \ + uint_add_overflow((a), (b), (out), sizeof(a)) + /* * Returns true if the multiplication of "a" and "b" will * overflow. The types of "a" and "b" must match and must be unsigned. @@ -616,10 +631,11 @@ int git_open_cloexec(const char *name, int flags); static inline size_t st_add(size_t a, size_t b) { - if (unsigned_add_overflows(a, b)) + size_t ret; + if (UINT_ADD_OVERFLOW(a, b, &ret)) die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, (uintmax_t)a, (uintmax_t)b); - return a + b; + return ret; } #define st_add3(a,b,c) st_add(st_add((a),(b)),(c)) #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d)) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-14 20:17 ` René Scharfe @ 2026-05-15 16:49 ` René Scharfe 0 siblings, 0 replies; 13+ messages in thread From: René Scharfe @ 2026-05-15 16:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List, Jeff King On 5/14/26 10:17 PM, René Scharfe wrote: > On 5/14/26 9:12 PM, Junio C Hamano wrote: >> René Scharfe <l.s.r@web.de> writes: >> >>> Provide a variant of st_add() that wraps __builtin_add_overflow() to >>> help Clang optimize it. Use it on all platforms for simplicity. >>> ... >>> +/* Help Clang; GCC generates the same code for both variants. */ >>> +#if defined(__clang__) >>> +static inline size_t st_add(size_t a, size_t b) >>> +{ >>> + size_t sum; >>> + if (__builtin_add_overflow(a, b, &sum)) >>> + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, >>> + (uintmax_t)a, (uintmax_t)b); >>> + return sum; >>> +} >>> +#else >>> static inline size_t st_add(size_t a, size_t b) >>> { >>> if (unsigned_add_overflows(a, b)) >>> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b) >>> (uintmax_t)a, (uintmax_t)b); >>> return a + b; >>> } >>> +#endif >> >> Makes me wonder if we tweaked unsigned_add_overflows() to take an >> extra *dst parameter to match __builtin_add_overflow(), which of >> course requires us to all of 18 callsites, it might make the whole >> thing a bit simpler. New uses of unsigned_add_overflows(), if we >> ever add them, would automatically benefit, right? > > Hmm. It sounds like a lot of churn, but it would make sure that > we use the checked result and not check a + b and then go on and > use x + y because the code de-synced at some point. > > How to do it, though? It needs to be generic and evaluate its > arguments only once. Perhaps like this? > > > diff --git a/git-compat-util.h b/git-compat-util.h > index ca89cfb0b3..27fbb622d7 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -103,6 +103,21 @@ struct strbuf; > #define unsigned_add_overflows(a, b) \ > ((b) > maximum_unsigned_value_of_type(a) - (a)) > > +static bool uint_add_overflow(uintmax_t a, uintmax_t b, > + uintmax_t *out, size_t out_size) > +{ > + if (b > UINTMAX_MAX - a) > + return true; > + a += b; > + if (a > (UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * out_size))) > + return true; > + *out = a; > + return false; > +} > + > +#define UINT_ADD_OVERFLOW(a, b, out) \ > + uint_add_overflow((a), (b), (out), sizeof(a)) > + > /* > * Returns true if the multiplication of "a" and "b" will > * overflow. The types of "a" and "b" must match and must be unsigned. > @@ -616,10 +631,11 @@ int git_open_cloexec(const char *name, int flags); > > static inline size_t st_add(size_t a, size_t b) > { > - if (unsigned_add_overflows(a, b)) > + size_t ret; > + if (UINT_ADD_OVERFLOW(a, b, &ret)) Type mismatch of third argument: pointer to size_t given, pointer to uintmax_t expected. > die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, > (uintmax_t)a, (uintmax_t)b); > - return a + b; > + return ret; > } > #define st_add3(a,b,c) st_add(st_add((a),(b)),(c)) > #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d)) Perhaps like this instead? diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4..23ea42f373 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -103,6 +103,25 @@ struct strbuf; #define unsigned_add_overflows(a, b) \ ((b) > maximum_unsigned_value_of_type(a) - (a)) +static inline uintmax_t uint_add_overflow(uintmax_t a, uintmax_t b, + uintmax_t max, bool *overflow) +{ + *overflow = a > max || b > max - a; + return a + b; +} + +#ifdef __clang__ +#define UINT_ADD_OVERFLOW(a, b, out, overflow) \ + (*(overflow) = __builtin_add_overflow((a), (b), (out))) +#else +#define UINT_ADD_OVERFLOW(a, b, out, overflow) ( \ + *(out) = uint_add_overflow((a), (b), \ + maximum_unsigned_value_of_type(*(out)), \ + (overflow)), \ + *(overflow) \ +) +#endif + /* * Returns true if the multiplication of "a" and "b" will * overflow. The types of "a" and "b" must match and must be unsigned. @@ -616,10 +635,12 @@ int git_open_cloexec(const char *name, int flags); static inline size_t st_add(size_t a, size_t b) { - if (unsigned_add_overflows(a, b)) + bool overflow; + size_t ret; + if (UINT_ADD_OVERFLOW(a, b, &ret, &overflow)) die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, (uintmax_t)a, (uintmax_t)b); - return a + b; + return ret; } #define st_add3(a,b,c) st_add(st_add((a),(b)),(c)) #define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d)) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe 2026-05-14 19:12 ` Junio C Hamano @ 2026-05-15 4:40 ` Jeff King 2026-05-15 14:36 ` René Scharfe 1 sibling, 1 reply; 13+ messages in thread From: Jeff King @ 2026-05-15 4:40 UTC (permalink / raw) To: René Scharfe; +Cc: Git List On Thu, May 14, 2026 at 05:13:46PM +0200, René Scharfe wrote: > Clang and GCC optimize away comparisons of overflow checks by checking > the carry flag on x64. GCC does the same on ARM64, but Clang currently > (version 22.1) doesn't. > > Provide a variant of st_add() that wraps __builtin_add_overflow() to > help Clang optimize it. Use it on all platforms for simplicity. OK. I probably would have just used the intrinsic everywhere with __GNUC__, but if gcc is already figuring it out, it doesn't matter in practice. > +/* Help Clang; GCC generates the same code for both variants. */ > +#if defined(__clang__) > +static inline size_t st_add(size_t a, size_t b) > +{ > + size_t sum; > + if (__builtin_add_overflow(a, b, &sum)) > + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, > + (uintmax_t)a, (uintmax_t)b); > + return sum; > +} > +#else > static inline size_t st_add(size_t a, size_t b) > { > if (unsigned_add_overflows(a, b)) It's a shame we can't share more code here, especially the die message. I guess the ideal primitive is probably a wrapper with the same interface as __builtin_add_overflow(), which could then be used everywhere that unsigned_add_overflows() with some minor conversion. But it gets awkward to do as a macro, and using an inline function runs into type questions. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-15 4:40 ` Jeff King @ 2026-05-15 14:36 ` René Scharfe 2026-05-15 16:53 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2026-05-15 14:36 UTC (permalink / raw) To: Jeff King; +Cc: Git List On 5/15/26 6:40 AM, Jeff King wrote: > On Thu, May 14, 2026 at 05:13:46PM +0200, René Scharfe wrote: > >> Clang and GCC optimize away comparisons of overflow checks by checking >> the carry flag on x64. GCC does the same on ARM64, but Clang currently >> (version 22.1) doesn't. >> >> Provide a variant of st_add() that wraps __builtin_add_overflow() to >> help Clang optimize it. Use it on all platforms for simplicity. > > OK. I probably would have just used the intrinsic everywhere with > __GNUC__, but if gcc is already figuring it out, it doesn't matter in > practice. Did that initially, but found it hard to justify when it has no benefit and reduces the test coverage of the hand-made check significantly. >> +/* Help Clang; GCC generates the same code for both variants. */ >> +#if defined(__clang__) >> +static inline size_t st_add(size_t a, size_t b) >> +{ >> + size_t sum; >> + if (__builtin_add_overflow(a, b, &sum)) >> + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX, >> + (uintmax_t)a, (uintmax_t)b); >> + return sum; >> +} >> +#else >> static inline size_t st_add(size_t a, size_t b) >> { >> if (unsigned_add_overflows(a, b)) > > It's a shame we can't share more code here, especially the die message. > > I guess the ideal primitive is probably a wrapper with the same > interface as __builtin_add_overflow(), which could then be used > everywhere that unsigned_add_overflows() with some minor conversion. Junio said the same. :) > But it gets awkward to do as a macro, and using an inline function runs > into type questions. Indeed. If it was easy then this wouldn't exist as a builtin. We can approximate it somewhat, but will it be robust enough? René ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang 2026-05-15 14:36 ` René Scharfe @ 2026-05-15 16:53 ` Jeff King 0 siblings, 0 replies; 13+ messages in thread From: Jeff King @ 2026-05-15 16:53 UTC (permalink / raw) To: René Scharfe; +Cc: Git List On Fri, May 15, 2026 at 04:36:15PM +0200, René Scharfe wrote: > > I guess the ideal primitive is probably a wrapper with the same > > interface as __builtin_add_overflow(), which could then be used > > everywhere that unsigned_add_overflows() with some minor conversion. > > Junio said the same. :) > > But it gets awkward to do as a macro, and using an inline function runs > > into type questions. > Indeed. If it was easy then this wouldn't exist as a builtin. We > can approximate it somewhat, but will it be robust enough? I always thought it was a builtin because the most efficient way involves checking the carry flag, which can't be accessed from C. But yeah, the type issues are real, too. ;) I think we should take what you posted for now, and we can iterate on a more general add-and-check interface later (or never if it's too tricky). -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() 2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe 2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe @ 2026-05-14 19:07 ` Junio C Hamano 2026-05-14 20:13 ` René Scharfe 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2026-05-14 19:07 UTC (permalink / raw) To: René Scharfe; +Cc: Git List René Scharfe <l.s.r@web.de> writes: > Simplify the code by calling st_add3() to do overflow checks instead of > open-coding it. This changes the error message to include the offending > summands, which can be helpful when tracking down the cause. > > Signed-off-by: René Scharfe <l.s.r@web.de> > --- > strbuf.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/strbuf.c b/strbuf.c > index 3e04addc22..bb04d3910e 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc) > void strbuf_grow(struct strbuf *sb, size_t extra) > { > int new_buf = !sb->alloc; > - if (unsigned_add_overflows(extra, 1) || > - unsigned_add_overflows(sb->len, extra + 1)) > - die("you want to use way too much memory"); > if (new_buf) > sb->buf = NULL; > - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); > + ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc); ALLOC_GROW() being a macro that references its second argument three times, doesn't this rewrite rely on the compiler being clever enough to notice that checking for the same overflow three times is pointless and does it only once? I guess the original has the same issue already, so this may not be so bad but it makes me feel a bit queasy. > if (new_buf) > sb->buf[0] = '\0'; > } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() 2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano @ 2026-05-14 20:13 ` René Scharfe 2026-05-15 4:36 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2026-05-14 20:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List On 5/14/26 9:07 PM, Junio C Hamano wrote: > René Scharfe <l.s.r@web.de> writes: > >> Simplify the code by calling st_add3() to do overflow checks instead of >> open-coding it. This changes the error message to include the offending >> summands, which can be helpful when tracking down the cause. >> >> Signed-off-by: René Scharfe <l.s.r@web.de> >> --- >> strbuf.c | 5 +---- >> 1 file changed, 1 insertion(+), 4 deletions(-) >> >> diff --git a/strbuf.c b/strbuf.c >> index 3e04addc22..bb04d3910e 100644 >> --- a/strbuf.c >> +++ b/strbuf.c >> @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc) >> void strbuf_grow(struct strbuf *sb, size_t extra) >> { >> int new_buf = !sb->alloc; >> - if (unsigned_add_overflows(extra, 1) || >> - unsigned_add_overflows(sb->len, extra + 1)) >> - die("you want to use way too much memory"); >> if (new_buf) >> sb->buf = NULL; >> - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); >> + ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc); > > ALLOC_GROW() being a macro that references its second argument three > times, doesn't this rewrite rely on the compiler being clever enough > to notice that checking for the same overflow three times is > pointless and does it only once? I guess the original has the same > issue already, so this may not be so bad but it makes me feel a bit > queasy. As long as it has no side-effect (as is the case for addition) and we keep compiler optimization enabled we should be fine. I guess the reason for having multiple references as opposed to loading the value once into a private variable is to support arbitrary types. In the end it needs to fit into a size_t, though. Something like this could bring the reference count down to one: --- 8< --- diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4..ca89cfb0b3 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -812,6 +812,15 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size) #define alloc_nr(x) (((x)+16)*3/2) +static inline bool st_alloc_nr(size_t nr, size_t alloc, size_t *out) +{ + if (nr > alloc) { + *out = alloc_nr(alloc) < nr ? nr : alloc_nr(alloc); + return true; + } + return false; +} + /** * Dynamically growing an array using realloc() is error prone and boring. * @@ -857,12 +866,10 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size) */ #define ALLOC_GROW(x, nr, alloc) \ do { \ - if ((nr) > alloc) { \ - if (alloc_nr(alloc) < (nr)) \ - alloc = (nr); \ - else \ - alloc = alloc_nr(alloc); \ - REALLOC_ARRAY(x, alloc); \ + size_t alloc_grow_new_alloc_; \ + if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \ + alloc = alloc_grow_new_alloc_; \ + REALLOC_ARRAY(x, alloc_grow_new_alloc_); \ } \ } while (0) --- >8 --- Hmm, alloc_nr() doesn't do any overflow checking. It should, though, shouldn't it? René ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() 2026-05-14 20:13 ` René Scharfe @ 2026-05-15 4:36 ` Jeff King 2026-05-15 14:30 ` René Scharfe 0 siblings, 1 reply; 13+ messages in thread From: Jeff King @ 2026-05-15 4:36 UTC (permalink / raw) To: René Scharfe; +Cc: Junio C Hamano, Git List On Thu, May 14, 2026 at 10:13:19PM +0200, René Scharfe wrote: > Hmm, alloc_nr() doesn't do any overflow checking. It should, though, > shouldn't it? Yes, probably. It's a known blind spot in the overflow checking, but I think is OK in practice because: 1. We are growing an existing buffer by ~3/2. So even with ordering the multiplication first, an overflow implies that you have a single buffer consuming ~1/3 of your address space. On 64-bit systems that's impractically large, and on 32-bit systems I think you generally run into fragmentation and address-space issues first. 2. If alloc_nr(alloc) is less than the desired nr, we just use that nr directly. So even if we did overflow, I think the result is too-slow allocation, and not a buffer overflow. But it would be nice to be less hand-wavy. One of the reasons I hadn't dug into it further is that I wanted to start making use of intrinsics to avoid slowdowns. But since you're already doing that (and finding that the compiler was doing the fast thing anyway!) it might be a good time to make the jump. That's all assuming that no overflow happens before ALLOC_GROW() gets the values. We also tend to do unchecked computions for the "nr" field there, but it's usually just "nr_foo + 1", so the same logic applies: you'd have to have an existing array consuming the entire address space minus one byte to trigger an overflow. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() 2026-05-15 4:36 ` Jeff King @ 2026-05-15 14:30 ` René Scharfe 2026-05-15 16:50 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2026-05-15 14:30 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git List On 5/15/26 6:36 AM, Jeff King wrote: > On Thu, May 14, 2026 at 10:13:19PM +0200, René Scharfe wrote: > >> Hmm, alloc_nr() doesn't do any overflow checking. It should, though, >> shouldn't it? > > Yes, probably. It's a known blind spot in the overflow checking, but > I think is OK in practice because: > > 1. We are growing an existing buffer by ~3/2. So even with ordering > the multiplication first, an overflow implies that you have a > single buffer consuming ~1/3 of your address space. > > On 64-bit systems that's impractically large, and on 32-bit systems I > think you generally run into fragmentation and address-space issues > first. > > 2. If alloc_nr(alloc) is less than the desired nr, we just use that nr > directly. So even if we did overflow, I think the result is > too-slow allocation, and not a buffer overflow. > > But it would be nice to be less hand-wavy. One of the reasons I hadn't > dug into it further is that I wanted to start making use of intrinsics > to avoid slowdowns. But since you're already doing that (and finding > that the compiler was doing the fast thing anyway!) it might be a good > time to make the jump. Didn't look at __builtin_mul_overflow() in detail; its situation could be different than for __builtin_add_overflow(), which turned out to be unnecessary on x64. > That's all assuming that no overflow happens before ALLOC_GROW() gets > the values. We also tend to do unchecked computions for the "nr" field > there, but it's usually just "nr_foo + 1", so the same logic applies: > you'd have to have an existing array consuming the entire address space > minus one byte to trigger an overflow. The use in read-cache.c::do_read_index() looks odd. Has been present since commit one. Is the point that it over-allocates to have room for additions right from the start? For read-only commands this only wastes memory, no? René ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() 2026-05-15 14:30 ` René Scharfe @ 2026-05-15 16:50 ` Jeff King 0 siblings, 0 replies; 13+ messages in thread From: Jeff King @ 2026-05-15 16:50 UTC (permalink / raw) To: René Scharfe; +Cc: Junio C Hamano, Git List On Fri, May 15, 2026 at 04:30:34PM +0200, René Scharfe wrote: > > That's all assuming that no overflow happens before ALLOC_GROW() gets > > the values. We also tend to do unchecked computions for the "nr" field > > there, but it's usually just "nr_foo + 1", so the same logic applies: > > you'd have to have an existing array consuming the entire address space > > minus one byte to trigger an overflow. > > The use in read-cache.c::do_read_index() looks odd. Has been present > since commit one. Is the point that it over-allocates to have room for > additions right from the start? For read-only commands this only wastes > memory, no? Hmm, yeah, that is weird, and unusual to use alloc_nr() directly. We are presumably picking up istate->cache_nr from the on-disk file, so it could be anything, and that alloc_nr() could overflow. We'd store the too-small value in alloc, so we _know_ it's too small. So later when we use ALLOC_GROW(), the problem would be resolved as we grow the array. I'm not convinced the initial load might not overflow the array, though. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-05-15 16:53 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-14 15:11 [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() René Scharfe 2026-05-14 15:13 ` [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang René Scharfe 2026-05-14 19:12 ` Junio C Hamano 2026-05-14 20:17 ` René Scharfe 2026-05-15 16:49 ` René Scharfe 2026-05-15 4:40 ` Jeff King 2026-05-15 14:36 ` René Scharfe 2026-05-15 16:53 ` Jeff King 2026-05-14 19:07 ` [PATCH 1/2] strbuf: use st_add3() in strbuf_grow() Junio C Hamano 2026-05-14 20:13 ` René Scharfe 2026-05-15 4:36 ` Jeff King 2026-05-15 14:30 ` René Scharfe 2026-05-15 16:50 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.